Fabric Plane Management

 

Fabric Plane Management on AS MLC Modular Carrier Card

The Application Services Modular Line Card (AS MLC) provides high application throughput and storage space, and is designed to run services on the MX240, MX480, and MX960 routers. The AS MLC consists of the following components:

  • Application Services Modular Carrier Card (AS MCC)

  • Application Services Modular Processing Card (AS MXC)

  • Application Services Modular Storage Card (AS MSC)

The AS MCC plugs into the chassis and provides the fabric interface.

An MX960 router can support three Switch Control Boards (SCBs), or six fabric planes. The AS MCC supports six fabric planes. An MX240 or MX480 router can support up to two SCBs, or eight fabric planes. At any time, the AS MCC can provide connectivity to only six of the eight fabric planes. Fabric planes 1 and 5 use shared physical links, as do fabric planes 3 and 7. Therefore, only one of fabric planes 1 and 5 can be active at a time, and likewise only one of fabric planes 3 and 7.

This behavior impacts the output of fabric-related monitoring commands on MX240 and MX480 routers with AS MCCs.

The show chassis fpc pic-status command displays the output for an MX480 router with an AS MCC:

user@host> show chassis fpc pic-status

In the show chassis fpc pic-status command output, slots 1 and 5 contain AS MCCs, PIC 0 is the AS MSC, and PIC 2 is the AS MXC.

The show chassis fabric fpcs command displays the output on an MX480 router with an AS MCC.

user@host> show chassis fabric fpcs

In the show chassis fabric fpcs command output, FPC 5 is the AS MCC.

The show chassis fabric plane command displays the output on an MX480 router with an AS MCC.

user@host> show chassis fabric plane

In the show chassis fabric plane output, FPC 5 is the AS MCC.

The term Unused in the output of the show chassis fabric fpcs and show chassis fabric plane commands indicates that one fabric plane from each pair that shares physical links (1 and 5, and 3 and 7) is inactive.

See Junos OS System Basics and Services Command Reference for more information.

Fabric Plane Management on MPC4E

MPC4E is a fixed-configuration MPC that provides scalability in bandwidth and services capability of routers. MPC4E is supported on MX240, MX480, MX960, MX2010 and MX2020 routers. The MPC4E plugs into the chassis and provides the fabric interface.

By default, MX240 and MX480 routers with MPC4E support four active fabric planes each. However, this default fabric redundancy mode, also known as redundant fabric mode, makes the MPC run in reduced bandwidth state. In increased bandwidth mode, the MX240 and MX480 routers with MPC4E support six active fabric planes each. You can increase the number of active fabric planes by changing the mode from redundant fabric mode to increased bandwidth mode. To configure the MPC4E to function in increased bandwidth mode, use the existing redundancy-mode increased-bandwidth statement at the [edit chassis fabric] hierarchy level.

If you do not configure the fabric redundancy mode, MPC4E functions in redundant fabric mode. To configure the redundant fabric mode, use the existing redundancy-mode redundant statement at the [edit chassis fabric] hierarchy level.

An MX960 router can support three Enhanced MX Switch Control Boards (SCBEs), or six fabric planes. MX240 and MX480 routers can support up to two SCBEs with four fabric planes each, for a total of eight fabric planes. MX2020 routers can support eight Switch Fabric Boards (SFBs), or 24 fabric planes.

At any given time, on MX240 and MX480 routers, MPC4E can provide connectivity to only six of the eight fabric planes. Fabric planes 1 and 5 use shared physical links, as do fabric planes 3 and 7. Therefore, only one of fabric planes 1 and 5 can be active, and only one of fabric planes 3 and 7 can be active.

On MX240 and MX480 routers with MPC4E, if the fabric redundancy mode is not configured, then fabric planes 0, 1, 2, and 3 are online and active and fabric planes 4, 5, 6, and 7 are spare. If you configure the increased bandwidth mode, then the fabric planes 0, 1, 2, 3, 4, and 6 are active and fabric planes 5 and 7 are spare.

On MX960 routers with MPC4E, if you configure increased bandwidth mode, then fabric planes 0, 1, 2, 3, 4, and 5 are online. When MPC4E is plugged into an MX960 router, it does not have any fabric redundancy.

MX2020 routers with MPC4E do not support the existing redundancy-mode statement. All 24 fabric planes are active.

Fabric Plane Management on MPC7E

The two variants of MPC7E, MPC7E-MRATE and MPC7E-10G, provide scalability in bandwidth and services capability of routers. The two MPCs are supported on MX240, MX480, and MX960 routers. The MPCs plug into the chassis and provide the fabric interface.

Note

The MPC7E-MRATE and MPC7E-10G MPCs are supported only on MX-SCBE2.

An MX960 router can support three Enhanced Switch Control Boards (SCBE2s). Each SCBE2 provides two fabric planes, for a total of six fabric planes. MX240 and MX480 routers can support up to two SCBE2s. Each SCBE2 provides four fabric planes, for a total of eight planes. However, MX240 and MX480 routers have only six active planes. The remaining two are redundant.

By default, MX240, MX480, and MX960 routers support four active fabric planes each. However, this default fabric redundancy mode, also known as redundant fabric mode, makes the MPC run in reduced bandwidth state. In increased bandwidth mode, the MX240, MX480, and MX960 routers support six active fabric planes each. You can increase the number of active fabric planes by changing the mode from redundant fabric mode to increased bandwidth mode. To configure the MPC7E to function in increased bandwidth mode, use the existing redundancy-mode increased-bandwidth statement at the [edit chassis fabric] hierarchy level. An MPC working with reduced fabric bandwidth can affect the routing process, resulting in reduced throughput. You can enable increased fabric bandwidth of the active SCBE2 for optimal and efficient performance and traffic handling.

On MX240 and MX480 routers, if the fabric redundancy mode is not configured, then fabric planes 0, 1, 2, and 3 are online and active and fabric planes 4, 5, 6, and 7 are redundant. If you configure the increased bandwidth mode, then the fabric planes 0, 1, 2, 3, 4, and 6 are active and fabric planes 5 and 7 are redundant.

On MX960 routers with MPC7E, if you configure increased bandwidth mode, then fabric planes 0, 1, 2, 3, 4, and 5 are active.

The following sections describe the fabric management features supported on the MPC7E MPCs in MX240, MX480, and MX960 routers.

Fabric Hardening

Fabric hardening is the process of controlling bandwidth degradation to prevent traffic black holes. Fabric hardening can be configured with two per-FPC CLI configuration statements, bandwidth-degradation and blackhole-action. The two statements give you more control over what threshold of bandwidth degradation to react to, and what corrective action to take. The bandwidth-degradation statement determines how the MPC reacts when it reaches a specified bandwidth degradation percentage. The blackhole-action statement determines how the MPC responds to a 100 percent fabric degradation scenario. This statement is optional and overrides the default fabric hardening procedures.

Limiting Traffic Disruption by Detecting Packet Forwarding Engine Destinations That Are Unreachable over the Fabric

The router is able to detect unreachable destination Packet Forwarding Engines and limit the time for which traffic is disrupted. The router signals neighboring routers when it cannot carry traffic because of the inability of some or all source Packet Forwarding Engines to forward traffic to some or all destination Packet Forwarding Engines on any fabric plane, after interfaces have been created. This inability to forward traffic results in a traffic disruption by the router. When the router detects unreachable Packet Forwarding Engine destinations, it attempts to recover from the condition causing the disruption. If the recovery attempt fails, the system turns off the interfaces, thereby ending the disruption and initiating the recovery process.

The recovery process consists of the following steps:

  1. Fabric plane restart phase: The MPC restarts the fabric planes one by one.

  2. Fabric plane and MPC restart phase: The router restarts both the fabric planes and the MPCs. If there are unreachable MPCs that are unable to initiate high-speed links to the fabric after reboot, traffic disruption is limited because no interfaces are created for these MPCs.

  3. MPC offline phase: When previous attempts at recovery fail, the router takes the MPCs that contribute to the traffic black-hole condition offline and turns off the interfaces.

Fabric Plane Management on JNP10K-LC2101

JNP10K-LC2101 is a fixed-configuration MPC that provides increased port density and performance to MX10008 routers. JNP10K-LC2101 plugs into the chassis and provides the fabric interface.

An MX10008 router has six Switch Fabric Boards (SFBs). JNP10K-LC2101 has six Packet Forwarding Engines, each with 24 connections to the fabric (24 planes, or 4 connections per SFB). MX10008 routers with JNP10K-LC2101 have 24 active planes when all six SFBs are installed. If one SFB fails, line rate can still be achieved with the remaining 20 planes. The fabric supports a link speed of 25 Gbps.

The MX10008 SFB also supports fabric hardening. Fabric hardening is the process of controlling bandwidth degradation to prevent traffic black holes. The following key CLI commands are available for fabric hardening:

  • set chassis fpc slot-number fabric bandwidth-degradation percentage—Configures the FPC to take a specific action once bandwidth degradation reaches a certain percentage to avoid causing a traffic black hole in the chassis.

  • set chassis fabric degraded detection-enable—Enables detection of an FPC with degraded fabric.

  • set chassis fabric degraded action-fpc-restart-disable—Disables line card restarts to limit recovery actions from a degraded fabric condition.
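As an illustration, two of the commands listed above could be entered in configuration mode as follows. This is a minimal sketch, not a recommended configuration. The FPC slot number (5) and the threshold (80 percent) are hypothetical values. The sketch assumes that the percentage keyword is followed by a numeric value. The user@host# prompt and the commit step follow the usual Junos OS conventions rather than anything stated in this topic. Lines that begin with # are annotations for the reader.

```
[edit]
# Enable detection of FPCs with degraded fabric.
user@host# set chassis fabric degraded detection-enable
# Hypothetical values: FPC slot 5, 80 percent degradation threshold.
user@host# set chassis fpc 5 fabric bandwidth-degradation percentage 80
user@host# commit
```

Add set chassis fabric degraded action-fpc-restart-disable only if line card restarts should be excluded from the recovery actions.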

In MX10008 SFBs, fabric fault handling is supported per plane. Fabric fault handling per plane results in increased granularity, which helps identify, isolate, and repair faults. If an SFB has a single faulty plane, the other three planes can continue to operate. There is no need to take the entire SFB offline. For example, if a plane encounters a training failure error, the line card isolates that faulty plane while the other planes continue to operate. Also, any cyclic redundancy check (CRC) errors on any link on the SFB are indicated on the plane, not on the SFB.

Example: Configuring Fabric Redundancy Mode on MPC4E

Requirements for Configuration of the Fabric Redundancy Mode on MPC4E

This example uses the following hardware and software components:

  • Junos OS Release 12.3R2 or later for MX Series routers

  • A single MX480 router with MPC4E

Overview

This example provides information about configuring the fabric redundancy mode on an MX480 router with MPC4E. You can configure the MPC4E to function in redundant fabric mode or increased bandwidth mode. If you do not configure the mode, the MPC4E, by default, functions in redundant fabric mode. In redundant fabric mode, the number of active fabric planes is 4. If you configure the MPC4E to function in increased bandwidth mode, the number of active fabric planes increases to 6.

Configuring Increased Bandwidth Mode

Step-by-Step Procedure

In this example, you configure increased bandwidth mode on an MX480 router with MPC4E. The existing fabric mode on the MX480 router is redundant fabric mode. To configure the fabric mode, perform the following tasks:

  1. Verify the existing fabric mode of the router by using the show chassis fabric mode command.
    user@host> show chassis fabric mode
  2. View the number of active fabric planes by using the show chassis fabric summary command.
    user@host> show chassis fabric summary
  3. In configuration mode, go to the [edit chassis] hierarchy level and set the fabric mode to increased-bandwidth as follows:
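The command for this step can be reconstructed from the redundancy-mode increased-bandwidth statement that this topic places at the [edit chassis fabric] hierarchy level. The following is a minimal sketch. The user@host# prompt and the commit step are assumptions based on the usual Junos OS conventions.

```
[edit chassis]
user@host# set fabric redundancy-mode increased-bandwidth
user@host# commit
```

After the commit, repeat the show chassis fabric mode and show chassis fabric summary commands from steps 1 and 2 to confirm the change.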

Results

In redundant fabric mode, the number of active fabric planes is 4 while the number of spare planes is also 4. In increased-bandwidth mode, the number of active planes is 6 while the number of spare planes is 2.

Note

Fabric planes 1 and 5 use shared physical links, as do fabric planes 3 and 7. Therefore, only one of fabric planes 1 and 5 can be active, and only one of fabric planes 3 and 7 can be active.

Verification

To verify the fabric mode of the MX480 router with MPC4E, perform the following tasks:

Verifying the Fabric Redundancy Mode of the Router

Purpose

Verify that the fabric redundancy mode of the MX480 router with MPC4E has been changed to increased-bandwidth.

Action

To view the fabric mode of the router, use the show chassis fabric mode command.

user@host> show chassis fabric mode

Meaning

The MX480 router with MPC4E is functioning in increased bandwidth mode.

Verifying the Number of Active Fabric Planes

Purpose

Verify that the number of active fabric planes is 6.

Action

To view the number of active fabric planes, use the show chassis fabric summary command.

user@host> show chassis fabric summary

Meaning

The number of active planes on the MX480 router with MPC4E is 6 (planes 0, 1, 2, 3, 4, and 6), and the number of spare planes is 2.

Configuring Fabric Redundancy Mode for Active Control Boards on MX Series Routers

You can configure the active control board to be in redundancy mode or in increased fabric bandwidth mode. Redundancy mode keeps spare fabric planes in reserve, whereas increased fabric bandwidth mode activates more fabric planes for optimal and efficient performance and traffic handling. To configure redundancy mode for the active control board, use the redundancy-mode redundant statement at the [edit chassis fabric] hierarchy level:
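A minimal sketch of this statement as entered in configuration mode is shown here. The user@host# prompt is an assumption based on the usual Junos OS conventions.

```
[edit chassis fabric]
user@host# set redundancy-mode redundant
```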

When you configure this option, all the FPCs use 4 fabric planes as active planes, regardless of the type of the FPC.

To configure increased bandwidth mode for the active control board, use the redundancy-mode increased-bandwidth statement at the [edit chassis fabric] hierarchy level:
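A minimal sketch of this statement as entered in configuration mode is shown here. As before, the user@host# prompt is an assumption.

```
[edit chassis fabric]
user@host# set redundancy-mode increased-bandwidth
```

Only one of the two redundancy-mode values can be set at a time, so setting this value replaces a previously configured redundancy-mode redundant statement.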

In increased fabric bandwidth mode, MX Series routers will use 6 active planes. MX240 and MX480 routers will also use 2 spare planes in addition to the 6 active planes.

Increased fabric bandwidth mode is enabled by default on MX routers with the Switch Control Board (SCB). On MX routers with the Enhanced SCB (SCBE), redundancy mode is enabled by default, regardless of the type of MPC or DPC installed.

Configuring this feature does not disrupt the system. You can configure it without restarting the FPC or the system.

Signaling Neighboring Routers of Fabric Down on T640 and T1600 Routers

In Junos OS Release 10.4 and later, T640 and T1600 routers signal neighboring routers if they are unable to carry traffic because all fabric planes have been taken offline for one of the following reasons:

  • CLI or button press initiated offline state.

  • Automatically taken offline by the SPMB due to high temperature.

  • PIO errors or voltage errors detected by the SPMB CPU to the SIBs.

The following scenarios are not supported:

  • All PFEs get destination errors on all planes to all destinations, even with the Switch Interface Boards (SIBs) staying online.

  • Complete fabric loss caused by destination timeouts, with the SIBs still online.

When chassisd detects all fabric planes are down, the router reboots all the FPCs in the system. When the FPCs come back up, the interfaces will not be created again, since all the fabric planes are down.

After diagnosing and fixing the cause of all fabric planes going down, the user must bring the SIBs online. The SIB online process brings up the interfaces.

Fabric down signaling to neighboring routers offers the following benefits:

  • FPCs reboot when the control plane connection to the RE times out.

  • Extends a simple approach to reboot FPCs when the dataplane blacks out.

When the router transitions from a state where SIBs are online or spare to a state where there are no SIBs in online state, then all the FPCs in the system are rebooted.

An ERRMSG indicates that all fabric planes are down and that the FPCs will be rebooted if fabric planes do not come up within 2 minutes.

An ERRMSG indicates the reason for FPC reboot on fabric connectivity loss.

The chassisd daemon logs a trace when an FPC comes online but PIC attach is not performed because no fabric plane is present.

A warning is issued in the CLI when the last fabric plane is taken offline, stating that the FPCs will reboot. After fixing the cause of the SIBs not being online, you must bring the SIBs online. When the first SIB goes online and link training with the FPCs completes, the interfaces are created.

Fabric down signaling to neighboring routers is available by default, and no user configuration is required to enable it.

No CLI commands or alarms are required for this feature. Alarms indicate a SIBs-offline system state to the user.