Understanding CoS Port Schedulers

Port scheduling defines the class-of-service (CoS) properties of output queues. You configure CoS properties in a scheduler, then map the scheduler to a forwarding class. Forwarding classes are in turn mapped to output queues. Classifiers map incoming traffic into forwarding classes based on IEEE 802.1p, DSCP, or EXP code points.

Output queue properties include the amount of interface bandwidth assigned to the queue, the size of the memory buffer allocated for storing packets, the scheduling priority of the queue, and the weighted random early detection (WRED) drop profiles associated with the queue to control packet drop during periods of congestion.

Scheduler maps map schedulers to forwarding classes. The output queue mapped to a forwarding class receives the port resources and properties defined in the scheduler mapped to that forwarding class. You apply a scheduler map to an interface to apply queue scheduling to a port. You can associate different scheduler maps with different interfaces to configure port-specific scheduling for forwarding classes (output queues).
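For example, a minimal sketch of this chain, from classification through port scheduling, might look like the following (the classifier, scheduler, and scheduler map names, the interface, and the values are placeholders; best-effort is a default forwarding class):

    set class-of-service classifiers dscp my-dscp-classifier forwarding-class best-effort loss-priority low code-points 000000
    set class-of-service forwarding-classes class best-effort queue-num 0
    set class-of-service schedulers be-sched transmit-rate percent 20
    set class-of-service schedulers be-sched buffer-size percent 20
    set class-of-service scheduler-maps my-sched-map forwarding-class best-effort scheduler be-sched
    set class-of-service interfaces xe-0/0/1 scheduler-map my-sched-map
    set class-of-service interfaces xe-0/0/1 unit 0 classifiers dscp my-dscp-classifier

The classifier assigns incoming traffic to the forwarding class, the forwarding class maps to the output queue, the scheduler defines the queue's CoS properties, and the scheduler map applies the scheduler to that forwarding class on the interface.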

Note:

Port scheduling is simpler to configure than enhanced transmission selection (ETS) two-tier hierarchical port scheduling. Port scheduling allocates port bandwidth to output queues directly, instead of allocating port bandwidth to output queues through a scheduling hierarchy. While port scheduling is simpler, ETS is more flexible.

ETS allocates port bandwidth in a two-tier hierarchy:

  • Port bandwidth is first allocated to a priority group using the CoS properties defined in a traffic control profile. A priority group is a group of forwarding classes (which are mapped to output queues) that require similar CoS treatment.

  • Priority group bandwidth is allocated to the output queues (which are mapped to forwarding classes) using the properties defined in the output queue scheduler.
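As a rough sketch of the ETS statements involved (the priority group, traffic control profile, and scheduler map names and the rate are placeholders; see the ETS documentation for the complete procedure):

    set class-of-service forwarding-class-sets lan-pg class best-effort
    set class-of-service traffic-control-profiles lan-tcp scheduler-map lan-sched-map
    set class-of-service traffic-control-profiles lan-tcp guaranteed-rate percent 30
    set class-of-service interfaces xe-0/0/1 forwarding-class-set lan-pg output-traffic-control-profile lan-tcp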

Note:

When you configure bandwidth for a queue, the switch counts only the packet data against the configured bandwidth. The switch does not account for the bandwidth consumed by the preamble and the interframe gap (IFG). Therefore, when you calculate and configure the bandwidth requirements for a queue, include the preamble and the IFG as well as the data in your calculations.
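As a rough illustration, standard Ethernet framing adds 20 bytes to every frame: 8 bytes of preamble (including the start frame delimiter) and 12 bytes of IFG. For 64-byte frames, that overhead is about 24 percent of the wire rate (20 / 84); for 1518-byte frames, it is only about 1.3 percent (20 / 1538).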

Queue Scheduling Components

Table 1 provides a quick reference to the scheduler components you can configure to determine the bandwidth properties of output queues (forwarding classes).

Table 1: Output Queue Scheduler Components

  • Buffer size—Sets the size of the queue buffer.

  • Drop profile map—Maps a drop profile to a packet loss priority. Drop profile map components include:

      • Drop profile—Sets the probability of dropping packets as the queue fills up.

      • Loss priority—Sets the traffic packet loss priority to which a drop profile applies.

  • Excess rate—Sets the percentage of extra bandwidth (bandwidth that is not used by other queues) a queue can receive. If not set, the switch uses the transmit rate to determine how much extra bandwidth the queue can use. Extra bandwidth is the bandwidth remaining after all guaranteed bandwidth requirements are met.

  • Explicit congestion notification—Enables explicit congestion notification (ECN) on the queue.

  • Priority—Sets the scheduling priority applied to the queue.

  • Transmit rate—Sets the minimum guaranteed bandwidth on low and high priority queues. By default, if you do not configure an excess rate, extra bandwidth is shared among queues in proportion to the transmit rate of each queue.

    On strict-high priority queues, sets the amount of bandwidth that receives strict-high priority forwarding treatment. Traffic that exceeds the transmit rate shares in the port excess bandwidth pool based on the strict-high priority excess bandwidth sharing weight of “1”, which is not configurable. The actual amount of extra bandwidth that traffic exceeding the transmit rate receives depends on how many other queues consume excess bandwidth and the excess rates of those queues.

    If you configure two or more strict-high priority queues on a port, you must configure a transmit rate on those queues. Even with a single strict-high priority queue, we strongly recommend that you always configure a transmit rate to prevent the strict-high priority queue from starving other queues.
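For example, a single scheduler that uses most of these components might look like the following sketch (the scheduler and drop profile names and the values are placeholders):

    set class-of-service schedulers be-sched buffer-size percent 15
    set class-of-service schedulers be-sched drop-profile-map loss-priority low protocol any drop-profile be-drop
    set class-of-service schedulers be-sched excess-rate percent 20
    set class-of-service schedulers be-sched explicit-congestion-notification
    set class-of-service schedulers be-sched priority low
    set class-of-service schedulers be-sched transmit-rate percent 15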

Table 2 provides a quick reference to some related scheduling configuration components.

Table 2: Related Scheduling Components

  • Forwarding class—Maps traffic classified into the forwarding class at the switch ingress to an output queue. Classifiers map forwarding classes to IEEE 802.1p, DSCP, or EXP code points. A forwarding class, an output queue, and code point bits are mapped to each other and identify the same traffic. (The code point bits identify incoming traffic. Classifiers assign traffic to forwarding classes based on the code point bits. Forwarding classes map to output queues. This mapping determines the output queue each class of traffic uses on the switch egress interfaces.)

  • Output queue (virtual output queue)—Output queues are virtual and consist of the physical buffers on the ingress pipeline of each Packet Forwarding Engine (PFE) chip, which store traffic for every egress port. Every output queue on an egress port has buffer storage space on every ingress pipeline on all of the PFE chips on the switch. The mapping of ingress pipeline storage space to output queues is 1-to-1, so each output queue receives buffer space on each ingress pipeline. See Understanding CoS Virtual Output Queues (VOQs) on QFX10000 Switches for more information.

  • Scheduler map—Maps schedulers to forwarding classes. (Forwarding classes are mapped to queues, so a forwarding class represents a queue, and the scheduler mapped to a forwarding class determines the CoS properties of the output queue mapped to that forwarding class.)

Default Schedulers

If you do not configure CoS, the switch uses its default settings. Each forwarding class requires a scheduler to set the CoS properties of the forwarding class and its output queue. The default configuration has four forwarding classes: best-effort (queue 0), fcoe (queue 3), no-loss (queue 4), and network-control (queue 7). Each default forwarding class is mapped to a default scheduler. You can use the default schedulers or you can define new schedulers for these four forwarding classes. For explicitly configured forwarding classes, you must explicitly configure a queue scheduler to allocate CoS resources to the traffic mapped to each forwarding class.

Table 3 shows the default queue schedulers.

Table 3: Default Scheduler Configuration

Default Scheduler and Queue Number                      Transmit Rate    Rate Shaping    Excess Bandwidth Sharing    Priority    Buffer Size
best-effort forwarding class scheduler (queue 0)        15%              None            15%                         low         15%
fcoe forwarding class scheduler (queue 3)               35%              None            35%                         low         35%
no-loss forwarding class scheduler (queue 4)            35%              None            35%                         low         35%
network-control forwarding class scheduler (queue 7)    15%              None            15%                         low         15%

(Transmit rate is the guaranteed minimum bandwidth; rate shaping is the maximum bandwidth.)

Note:

By default, the minimum guaranteed bandwidth (transmit rate) determines the amount of excess (extra) bandwidth a queue can share. Extra bandwidth is allocated to queues in proportion to the transmit rate of each queue. You can configure bandwidth sharing (excess rate) to override the default setting and configure the excess bandwidth percentage independently of the transmit rate.

By default, only the four default schedulers shown in Table 3 have traffic mapped to them. Only the forwarding classes and queues associated with the default schedulers receive default bandwidth, based on the default scheduler transmit rate. (You can configure schedulers and forwarding classes to allocate bandwidth to other queues or to change the default bandwidth of a default queue.) If a forwarding class does not transport traffic, the bandwidth allocated to that forwarding class is available to other forwarding classes. Unicast and multidestination (multicast, broadcast, and destination lookup fail) traffic use the same forwarding classes and output queues.

Default scheduling is port scheduling. If you configure scheduling instead of using default scheduling, you can configure port scheduling or enhanced transmission selection (ETS) hierarchical port scheduling.

Default scheduling uses weighted round-robin (WRR) scheduling. Each queue receives a portion (weight) of the total available port bandwidth. The scheduling weight is based on the transmit rate (minimum guaranteed bandwidth) of the default scheduler for that queue. For example, queue 7 receives a default scheduling weight of 15 percent of available port bandwidth, and queue 4 receives a default scheduling weight of 35 percent of available bandwidth. Queues are mapped to forwarding classes (for example, queue 7 is mapped to the network-control forwarding class and queue 4 is mapped to the no-loss forwarding class), so forwarding classes receive the default bandwidth for the queues to which they are mapped. Unused bandwidth is shared with other default queues.

You should explicitly map traffic to non-default (unconfigured) queues and schedule bandwidth resources for those queues if you want to use them to forward traffic. By default, queues 1, 2, 5, and 6 are unconfigured. Unconfigured queues have a default scheduling weight of 1 so that they can receive a small amount of bandwidth in case they need to forward traffic.

If you map traffic to an unconfigured queue and do not schedule bandwidth for the queue, the queue receives only the amount of bandwidth proportional to its default weight (1). The actual amount of bandwidth an unconfigured queue receives depends on how much bandwidth the other queues on the port are using.

If the other queues use less than their allocated amount of bandwidth, the unconfigured queues can share the unused bandwidth. Because of their scheduling weights, configured queues have higher priority for bandwidth than unconfigured queues. If a configured queue needs more bandwidth, then less bandwidth is available for unconfigured queues. However, unconfigured queues always receive a minimum amount of bandwidth based on their scheduling weight (1). If you map traffic to an unconfigured queue, allocate bandwidth to that queue by configuring a scheduler, mapping the scheduler to the forwarding class that is mapped to the queue, and then applying the scheduler map to the port.
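For example, a minimal sketch that allocates bandwidth to queue 1 might look like the following (the forwarding class, scheduler, and map names, the interface, and the percentages are placeholders):

    set class-of-service forwarding-classes class bulk-data queue-num 1
    set class-of-service schedulers bulk-sched transmit-rate percent 10
    set class-of-service schedulers bulk-sched buffer-size percent 10
    set class-of-service scheduler-maps port-sched-map forwarding-class bulk-data scheduler bulk-sched
    set class-of-service interfaces xe-0/0/10 scheduler-map port-sched-map

In a working configuration, the same scheduler map would also include schedulers for the other forwarding classes that carry traffic on the port.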

Scheduling Priority

Scheduling priority determines the order in which an interface transmits traffic from its output queues. Priority settings ensure that queues containing important traffic receive prioritized access to the outgoing interface bandwidth. The priority setting in the scheduler determines queue priority (a scheduler map maps the scheduler to a forwarding class, the forwarding class is mapped to an output queue, and the output queue uses the CoS properties defined in the scheduler).

By default, all queues are low priority queues. The switch supports the following scheduling priorities:

  • Low—In the default CoS state, all queues are low priority queues. Low priority queues transmit traffic based on the weighted round-robin (WRR) algorithm. If you configure scheduling priorities higher than low priority on queues, then the higher priority queues are served before the low priority queues.

  • Medium-low— (QFX10000 Series switches only) Medium-low priority queues transmit traffic based on the weighted round-robin (WRR) algorithm, and have higher scheduling priority than low priority queues.

  • Medium-high— (QFX10000 Series switches only) Medium-high priority queues transmit traffic based on the weighted round-robin (WRR) algorithm, and have higher scheduling priority than medium-low priority queues.

  • High— (QFX10000 Series switches only) High priority queues transmit traffic based on the weighted round-robin (WRR) algorithm, and have higher scheduling priority than medium-high priority queues.

  • Strict-high—You can configure queues as strict-high priority. Strict-high priority queues receive preferential treatment over all other queues, and receive all of their configured bandwidth before other queues are serviced. Other queues do not transmit traffic until strict-high priority queues are empty, and they receive the bandwidth that remains after the strict-high priority queues are serviced. Because strict-high priority queues are always serviced first, strict-high priority queues can starve other queues on a port. Carefully consider how much bandwidth you want to allocate to strict-high priority queues to avoid starving other queues.

Note:

For QFX10002, QFX10008, and QFX10016 devices, strict-high priority queues share excess bandwidth based on an excess bandwidth sharing weight of 1, which is not configurable. The actual amount of extra bandwidth that strict-high priority traffic exceeding the transmit rate receives depends on how many other queues consume excess bandwidth and the excess rates of those queues.

On the QFX10002-60C, excess traffic on the strict-high priority queue starves the other high and low priority queues.
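For example, you might assign priorities in schedulers along the lines of the following sketch (the scheduler names and rates are placeholders; medium-low, medium-high, and high priorities apply to QFX10000 Series switches only):

    set class-of-service schedulers voice-sched priority strict-high
    set class-of-service schedulers voice-sched transmit-rate percent 10
    set class-of-service schedulers video-sched priority high
    set class-of-service schedulers video-sched transmit-rate percent 30
    set class-of-service schedulers be-sched priority low
    set class-of-service schedulers be-sched transmit-rate percent 20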

When you define scheduling priorities for queues instead of using the default priorities (by default, all queues are low priority), the switch uses the priorities to determine the order of packet transmission from the queues. The switch services traffic of different scheduling priorities in strict order, using round-robin (RR) scheduling to arbitrate transmission among queues of the same priority. The switch transmits packets in the following order:

  1. Strict-high priority traffic within the configured queue transmit rate (on strict-high priority queues, the transmit rate limits the amount of traffic treated as strict-high priority traffic). When traffic arrives on a strict-high priority queue, the switch forwards it before servicing other queues.

  2. High priority traffic within the configured queue transmit rate (on high priority queues, the transmit rate sets the minimum guaranteed bandwidth)

  3. Medium-high priority traffic within the configured queue transmit rate (on medium-high priority queues, the transmit rate sets the minimum guaranteed bandwidth)

  4. Medium-low priority traffic within the configured queue transmit rate (on medium-low priority queues, the transmit rate sets the minimum guaranteed bandwidth)

  5. Low priority traffic within the configured queue transmit rate (on low priority queues, the transmit rate sets the minimum guaranteed bandwidth)

  6. All traffic that exceeds the queue transmit rate using weighted round-robin (WRR) scheduling. Traffic that exceeds the queue transmit rate contends for excess port bandwidth (bandwidth that is not consumed after the port meets all guaranteed bandwidth requirements). The switch allocates and weights excess bandwidth for low priority queues based on the configured queue excess rate, or on the transmit rate if no excess rate is configured. The switch allocates and weights excess bandwidth for strict-high priority queues based on the hard-coded weight “1”, which is not configurable. The actual amount of extra bandwidth that traffic exceeding the transmit rate gets depends on how many other queues consume excess bandwidth and the weighting of those queues.

Note:

If you use the default CoS configuration, all queues are low priority queues and transmit traffic based on the weighted round-robin (WRR) algorithm.

Bandwidth Scheduling

A queue scheduler allocates port bandwidth to a queue (the scheduler is mapped to a forwarding class, and the forwarding class is mapped to a queue). The bandwidth profile, which consists of minimum guaranteed bandwidth, maximum bandwidth (queue shaping), and excess bandwidth sharing properties configured in the scheduler, defines the amount of port bandwidth a queue can consume during normal and congested transmission periods.

The scheduler regularly reevaluates whether each individual queue is within its defined bandwidth profile by comparing the amount of data the queue receives to the amount of bandwidth the scheduler allocates to the queue. When the received amount is less than the guaranteed minimum amount of bandwidth, the queue is considered to be in profile. A queue is out of profile when its received amount is larger than its guaranteed minimum amount. Out of profile queue data is transmitted only if extra (excess) bandwidth is available. Otherwise, it is buffered if buffer space is available. If no buffer space is available, the traffic might be dropped.

The switch provides features that enable you to control the allocation of port bandwidth to queues, so that you can meet the demands of different types of traffic on a port:

Minimum Guaranteed Bandwidth

The transmit rate determines the minimum guaranteed bandwidth for each forwarding class that is mapped to an output queue, and so determines the minimum bandwidth guarantee on that queue.

If you do not want to use the default configuration, you can set the minimum guaranteed bandwidth in several ways, and with several options, using the [set class-of-service schedulers scheduler-name transmit-rate (rate | percent percentage) <exact>] statement (see the sketch after this list):

  • Rate—Set the minimum guaranteed bandwidth as a fixed amount (rate) in bits-per-second of port bandwidth (for example, 2 Gbps or 800 Mbps).

  • Percent—Set the minimum guaranteed bandwidth as a percentage of port bandwidth (for example, 25 percent).

  • Exact—(QFX10000 switches only) Shape the queue to the transmit rate so that the transmit rate is the maximum amount of bandwidth the queue can use. A queue cannot share extra port bandwidth if you configure the exact option. Configuring the transmit rate as exact is how you set a shaping rate on low and high priority queues: the transmit rate becomes the maximum amount of bandwidth the queue can consume. You cannot use the exact option on a strict-high priority queue.

    Note:

    On QFX10000 switches, oversubscribing all 8 queues configured with the transmit rate exact (shaping) statement at the [edit class-of-service schedulers scheduler-name] hierarchy level might result in less than 100 percent utilization of port bandwidth.

  • Extra bandwidth sharing—On low and high priority queues, if you configure an excess rate, the excess rate determines the amount of extra port bandwidth the queue can use. If you do not configure an excess rate, the transmit rate determines how much excess (extra) bandwidth a low or high priority queue can share, and each queue shares extra bandwidth in proportion to its transmit rate.

    You cannot configure an excess rate on strict-high priority queues. Strict-high priority queues share extra bandwidth based on a scheduling weight of “1”, which is not configurable. The actual amount of extra bandwidth that traffic exceeding the transmit rate gets depends on how many other queues consume excess bandwidth and the excess rates of those queues.
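For example, the three forms of the transmit-rate statement might look like the following sketch (the scheduler names and values are placeholders):

    set class-of-service schedulers db-sched transmit-rate 2g
    set class-of-service schedulers web-sched transmit-rate percent 25
    set class-of-service schedulers video-sched transmit-rate 2g exact

The first form sets a fixed guaranteed rate, the second sets a percentage of port bandwidth, and the third shapes the queue to the transmit rate (QFX10000 switches only).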

Note:

The sum of the transmit rates of the queues on a port should not exceed the total bandwidth of that port. (You cannot guarantee a combined minimum bandwidth for the queues on a port that is greater than the total port bandwidth.)

Note:

For transmit rates below 1 Gbps, we recommend that you configure the transmit rate as a percentage instead of as a fixed rate. The system converts fixed rates into percentages and might round small fixed rates down to a lower percentage. For example, on a 10-Gigabit port, a fixed rate of 350 Mbps (3.5 percent) is rounded down to 3 percent.

The bandwidth a low or high priority queue consumes can exceed the configured minimum rate if additional bandwidth is available, and if you do not configure the transmit rate as exact on QFX10000 switches. During periods of congestion, the configured transmit rate is the guaranteed minimum bandwidth for the queue. This behavior enables you to ensure that each queue receives the amount of bandwidth appropriate to its required level of service and is also able to share unused bandwidth.

Maximum Bandwidth (Rate Shaping on Low and High Priority Queues and LAGs)

On QFX10000 switches, the optional exact keyword in the [set class-of-service schedulers scheduler-name transmit-rate (rate | percent percentage) <exact>] configuration statement shapes the transmission rate of low and high priority queues. When you specify the exact option, the switch drops traffic that exceeds the configured transmit rate, even if excess bandwidth is available. Rate shaping prevents a queue from using more bandwidth than is appropriate for the planned service level of the traffic on the queue. You cannot use the exact option on a strict-high priority queue.

Configuring rate shaping on a LAG interface using the [edit class-of-service interfaces lag-interface-name scheduler-map scheduler-map-name] statement can result in scheduled traffic streams receiving more LAG link bandwidth than expected.

LAG interfaces consist of two or more Ethernet links bundled together to function as a single interface. The switch can hash traffic entering a LAG interface onto any member link in the LAG interface. When you configure rate shaping and apply it to a LAG interface, the way that the switch applies the rate shaping to traffic depends on how the switch hashes the traffic onto the LAG links.

To illustrate how link hashing affects the way the switch applies rate shaping to LAG traffic, let’s look at a LAG interface named ae0 that has two member links, xe-0/0/20 and xe-0/0/21. On LAG ae0, we configure rate shaping of 2g by including the transmit-rate 2g exact statement in the queue scheduler, and apply the scheduler to traffic assigned to the best-effort forwarding class, which is mapped to output queue 0. When traffic in the best-effort forwarding class reaches the LAG interface, the switch hashes the traffic onto one of the two member links.
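A minimal sketch of this example might look like the following (the scheduler and scheduler map names are placeholders):

    set class-of-service schedulers be-shaper transmit-rate 2g exact
    set class-of-service scheduler-maps lag-sched-map forwarding-class best-effort scheduler be-shaper
    set class-of-service interfaces ae0 scheduler-map lag-sched-map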

If the switch hashes all of the best-effort traffic onto the same LAG link, the traffic receives a maximum of 2g bandwidth on that link. In this case, the intended cumulative limit of 2g for best effort traffic on the LAG is enforced.

However, if the switch hashes the best-effort traffic onto both of the LAG links, the traffic receives a maximum of 2g bandwidth on each LAG link, not 2g as a cumulative total for the entire LAG. The result is that best-effort traffic receives a maximum of 4g on the LAG, not the 2g set by the rate shaping statement. When hashing spreads the traffic assigned to an output queue (which is mapped to a forwarding class) across multiple LAG links, the effective shaping rate (cumulative maximum bandwidth) on the LAG is:

(number of LAG member interfaces) x (shaping rate for the output queue) = cumulative LAG shaping rate

Limiting Bandwidth Consumed by Strict-High Priority Queues

You can limit the amount of traffic that receives strict-high priority treatment on a queue by configuring a transmit rate on the strict-high priority queue. The transmit rate sets the amount of traffic that receives strict-high priority treatment. Traffic that exceeds the transmit rate shares in the port excess bandwidth pool based on the strict-high priority excess bandwidth sharing weight of “1”, which is not configurable. The actual amount of extra bandwidth that traffic exceeding the transmit rate gets depends on how many other queues consume excess bandwidth and the excess rates of those queues. Limiting the amount of traffic that receives strict-high priority treatment prevents other queues from being starved, while also ensuring that the amount of traffic specified in the transmit rate receives strict-high priority treatment.

Note:

Configuring a transmit rate on a low or high priority queue sets the guaranteed minimum bandwidth of the queue, as described in Minimum Guaranteed Bandwidth.

CAUTION:

If you configure strict-high priority queues, we strongly recommend that you configure a transmit rate on the queues to prevent them from starving low and high priority queues on that port. This is especially important if you configure more than one strict-high priority queue on a port. Although it is not mandatory to configure a transmit rate on strict-high priority queues, if you do not configure a transmit rate, the strict-high priority queues can consume all of the port bandwidth and starve the other queues.

Sharing Extra Bandwidth (Excess Rate on Low and High Priority Queues)

Extra bandwidth is essentially the bandwidth remaining after the switch meets all guaranteed bandwidth requirements. Extra bandwidth is available to low and high priority traffic when the queues on a port do not use all of the available port bandwidth.

By default, extra port bandwidth is shared among the forwarding classes on a port in proportion to the transmit rate of each queue. You can explicitly configure the amount of extra bandwidth a queue can share by setting an excess-rate in the scheduler of a low or high priority queue. The configured excess rate overrides the transmit rate and determines the percentage of extra bandwidth the queue can consume.

Note:

You cannot configure an excess rate on a strict-high priority queue. Strict-high priority queues share excess bandwidth based on an excess bandwidth sharing weight of “1”, which is not configurable. The actual amount of extra bandwidth that strict-high priority traffic exceeding the transmit rate receives depends on how many other queues consume excess bandwidth and the excess rates of those queues.

Note:

QFX10002, QFX10008, and QFX10016 switches support multiple strict-high priority queues.

QFX10002-60C switches support only one strict-high priority queue.

An example of extra bandwidth allocation based on transmit rates is a port that has traffic running on three forwarding classes, best-effort, fcoe, and network-control. In this example, the best-effort forwarding class has a transmit rate of 2 Gbps, forwarding class fcoe has a transmit rate of 4 Gbps, and network-control has a transmit rate of 2 Gbps, for a total of 8 Gbps of the port bandwidth. After servicing the minimum guaranteed bandwidth of these three queues, the port has 2 Gbps of available extra bandwidth.

If all three queues still have packets to forward, the queues receive the extra bandwidth in proportion to their transmit rates, so the best-effort queue receives an extra 500 Mbps, the fcoe queue receives an extra 1 Gbps, and the network-control queue receives an extra 500 Mbps.

If you configure an excess rate for a queue, the excess rate determines the proportion of extra bandwidth that the queue receives in the same way that the default (transmit rate) determines the proportion of extra bandwidth a queue receives. In the previous example, if you configured an excess rate of 20 percent on the fcoe forwarding class, and the transmit rates of the best-effort and network-control forwarding classes remained 2g (with no configured excess rate, so the 2g transmit rate for each queue still determines the excess rate), then the 2 Gbps of extra bandwidth would be allocated evenly among the three queues because all three queues have the same excess rate.

In the previous example, if you configured an excess rate of 10 percent on the fcoe forwarding class, and the transmit rates of the best-effort and network-control forwarding classes remained 2g (again with no configured excess rate, so the 2g transmit rate for each queue still determines the excess rate), the 2 Gbps of extra bandwidth would be allocated 800 Mbps to the best-effort queue, 400 Mbps to the fcoe queue, and 800 Mbps to the network-control queue (again, in proportion to the queue excess rates).
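A sketch of the scheduler statements behind the last example might look like the following (the scheduler names are placeholders):

    set class-of-service schedulers be-sched transmit-rate 2g
    set class-of-service schedulers fcoe-sched transmit-rate 4g
    set class-of-service schedulers fcoe-sched excess-rate percent 10
    set class-of-service schedulers nc-sched transmit-rate 2g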

Scheduler Drop-Profile Maps

Drop-profile maps associate drop profiles with queue schedulers and packet loss priorities (PLPs). Drop profiles set thresholds for dropping packets during periods of congestion, based on the queue fill level and a percentage probability of dropping packets at the specified queue fill level. At different fill levels, a drop profile sets different probabilities of dropping a packet during periods of congestion.

Classifiers assign incoming traffic to forwarding classes (which are mapped to output queues), and also assign a PLP to the incoming traffic. The PLP can be low, medium-high, or high. You can classify traffic with different PLPs into the same forwarding class to differentiate treatment of traffic within the forwarding class.

In a drop profile map, you can configure a different drop profile for each PLP and associate (map) the drop profiles to a queue scheduler. A scheduler map maps the queue scheduler to a forwarding class (output queue). Traffic classified into the forwarding class uses the drop characteristics defined in the drop profiles that the drop profile map associates with the queue scheduler. The drop profile the traffic uses depends on the PLP that the classifier assigns to the traffic. (You can map different drop profiles to the forwarding class for different PLPs.)

In summary:

  • Classifiers assign one of three PLPs (low, medium-high, high) to incoming traffic when classifiers assign traffic to a forwarding class.

  • Drop profiles set thresholds for packet drop at different queue fill levels.

  • Drop profile maps associate a drop profile with each PLP, and then map the drop profiles to schedulers.

  • Scheduler maps map schedulers to forwarding classes, and forwarding classes are mapped to output queues. The scheduler mapped to a forwarding class determines the CoS characteristics of the output queue mapped to the forwarding class, including the drop profile mapping.

You associate a scheduler map with an interface to apply the drop profiles and other scheduler elements to traffic in the forwarding class mapped to the scheduler on that interface.
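For example, a sketch of drop profiles mapped to a scheduler for two loss priorities might look like the following (the drop profile and scheduler names and the fill levels are placeholders):

    set class-of-service drop-profiles be-drop-low interpolate fill-level [ 40 100 ] drop-probability [ 0 100 ]
    set class-of-service drop-profiles be-drop-high interpolate fill-level [ 20 100 ] drop-probability [ 0 100 ]
    set class-of-service schedulers be-sched drop-profile-map loss-priority low protocol any drop-profile be-drop-low
    set class-of-service schedulers be-sched drop-profile-map loss-priority high protocol any drop-profile be-drop-high

In this sketch, traffic with high loss priority starts to drop at a lower queue fill level than traffic with low loss priority.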

Buffer Size

On QFX10000 switches, the buffer size is the amount of time in milliseconds of port bandwidth that a queue can use to continue to transmit packets during periods of congestion, before the buffer runs out and packets begin to drop.

The switch can use up to 100 ms total (combined) buffer space for all queues on a port. A buffer-size configured as one percent is equal to 1 ms of buffer usage. A buffer-size of 15 percent (the default value for the best effort and network control queues) is equal to 15 ms of buffer usage.

The total buffer size of the switch is 4 GB. A 40-Gigabit port can use up to 500 MB of buffer space, which is equivalent to 100 ms of port bandwidth on a 40-Gigabit port. A 10-Gigabit port can use up to 125 MB of buffer space, which is equivalent to 100 ms of port bandwidth on a 10-Gigabit port. The total buffer sizes of the eight output queues on a port cannot exceed 100 percent, which is equal to the full 100 ms total buffer available to a port. The maximum amount of buffer space any queue can use is also 100 ms (which equates to a 100 percent buffer-size configuration), but if one queue uses all of the buffer, then no other queue receives buffer space.
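These numbers follow from the port speeds: 100 ms of traffic at 40 Gbps is 40 Gbps x 0.1 s = 4 Gb, or roughly 500 MB, and 100 ms at 10 Gbps is 1 Gb, or roughly 125 MB.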

There is no minimum buffer allocation, so you can set the buffer-size to zero (0) for a queue. However, we recommend that on queues on which you enable PFC to support lossless transport, you allocate a minimum of 5 ms (a minimum buffer-size of 5 percent). The two default lossless queues, fcoe and no-loss, have default buffer-size values of 35 ms (35 percent).

Note:

If you do not configure buffer-size and you do not explicitly configure a queue scheduler, the default buffer-size is the default transmit rate of the queue. If you explicitly configure a queue scheduler, the default buffer allocations are not used. If you explicitly configure a queue scheduler, configure the buffer-size for each queue in the scheduler, keeping in mind that the total buffer-size of the queues cannot exceed 100 percent (100 ms).

If you do not use the default configuration, you can explicitly configure the queue buffer size in either of two ways:

  • As a percentage—The queue receives the specified percentage of dedicated port buffers when the queue is mapped to the scheduler and the scheduler is mapped to a port.

  • As a remainder—After the port services the queues that have an explicit percentage buffer size configuration, the remaining port dedicated buffer space is divided equally among the other queues to which a scheduler is attached. (No default or explicit scheduler means no dedicated buffer allocation for the queue.) If you configure a scheduler and you do not specify a buffer size as a percentage, remainder is the default setting.
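For example, a sketch of both options might look like the following (the scheduler names and the percentage are placeholders):

    set class-of-service schedulers fcoe-sched buffer-size percent 35
    set class-of-service schedulers be-sched buffer-size remainder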

Queue buffer allocation is dynamic, shared among ports as needed. However, a queue cannot use more than its configured amount of buffer space. For example, if you are using the default CoS configuration, the best-effort queue receives a maximum of 15 ms of buffer space because the default transmit rate for the best-effort queue is 15 percent.

If a switch experiences congestion, queues continue to receive their full buffer allocation until 90 percent of the 4 GB buffer space is consumed. When 90 percent of the buffer space is in use, the amount of buffer space per port, per queue, is reduced in proportion to the configured buffer size for each queue. As the percentage of consumed buffer space rises above 90 percent, the amount of buffer space per port, per queue, continues to be reduced.

On 40-Gigabit ports, because the total buffer is 4 GB and the maximum buffer a port can use is 500 MB, up to seven 40-Gigabit ports can consume their full 100 ms allocation of buffer space. However, if an eighth 40-Gigabit port requires the full 500 MB of buffer space, then the buffer allocations are proportionally reduced because the buffer consumption is above 90 percent.

On 10-Gigabit ports, because the total buffer is 4 GB and the maximum buffer a port can use is 125 MB, up to 28 10-Gigabit ports can consume their full 100 ms allocation of buffer space. However, if a 29th 10-Gigabit port requires the full 125 MB of buffer space, then the buffer allocations are proportionally reduced because the buffer consumption is above 90 percent.

Explicit Congestion Notification

ECN enables end-to-end congestion notification between two endpoints on TCP/IP based networks. The two endpoints are an ECN-enabled sender and an ECN-enabled receiver. ECN must be enabled on both endpoints and on all of the intermediate devices between the endpoints for ECN to work properly. Any device in the transmission path that does not support ECN breaks the end-to-end ECN functionality. ECN notifies networks about congestion with the goal of reducing packet loss and delay by making the sending device decrease the transmission rate until the congestion clears, without dropping packets.

ECN is disabled by default. Normally, you enable ECN only on queues that handle best-effort traffic, because other traffic types use different methods of congestion notification: lossless traffic uses priority-based flow control (PFC), and strict-high priority traffic receives all of the port bandwidth it requires up to the configured transmit rate (see Scheduling Priority).
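For example, enabling ECN in a best-effort queue scheduler is a single statement (the scheduler name is a placeholder):

    set class-of-service schedulers be-sched explicit-congestion-notification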

Scheduler Maps

A scheduler map maps a forwarding class to a queue scheduler. After configuring a scheduler, you must include it in a scheduler map, and apply the scheduler map to an interface to implement the configured queue scheduling.