BFD Behavior During Network Congestion
On the SSR, BFD packets are ordinary IP/UDP packets that are queued and forwarded like any other traffic. When an SSR interface or an intermediate device is congested, BFD packets can be delayed or dropped. If too many BFD packets are missed, BFD declares the peer path or routing adjacency down, which can trigger session failover or route withdrawal.
By design, SSR does not mark BFD traffic with an extremely high‑priority DSCP value. Instead, BFD is intended to experience similar queuing and loss as normal traffic, so that measurements of latency, jitter, and loss are representative of what user sessions see on that path. You can tune BFD timers and damping to change how quickly BFD reacts to congestion, and you can use DSCP marking and DSCP steering to influence how BFD traffic is queued and routed through the network.
In short:
- Under congestion, BFD may flap or declare paths down if its own packets are dropped.
- If you raise BFD’s priority in the network, it will be more stable but less representative of application experience.
- If you keep BFD at modest or best‑effort priority, its behavior will closely track what users see, but it may require more tolerant timers and damping to avoid excessive failover.
BFD on the SSR
The SSR uses two BFD implementations:
SSR BFD is used between SSR instances. It uses UDP destination port 1280. It runs in asynchronous control mode for liveness and echo mode for link‑quality measurements (latency, jitter, loss). SSR BFD is configured under the bfd hierarchy at the router, neighborhood, or adjacency level. For configuration details and tuning guidance, see Tuning BFD Settings.
Routing BFD is used by OSPF and BGP, with the standard IETF BFD ports 3784 (single‑hop), 4784 (multi‑hop), and 3785 (echo). Routing BFD is configured under the routing hierarchy on OSPF interfaces and BGP neighbors, and it does not change SSR BFD behavior. For FRR configuration details, see Bidirectional Forwarding Detection (BFD).
From a forwarding perspective, both SSR BFD and Routing BFD packets are classified, queued, and forwarded by the SSR data plane like any other traffic. They are not inherently protected from congestion.
Handling BFD Packets on a Congested Interface
When an SSR device-interface is congested, queues on that interface begin to fill and may drop packets. BFD packets leaving that interface are generated by the BFD agent (SSR or FRR), then handled by the standard forwarding pipeline. They are classified into services and tenants, optionally mapped into traffic‑engineering (TE) classes and queues, and enqueued and scheduled alongside other flows in the same class.
There is no special, out‑of‑band forwarding path for BFD once packets leave the BFD process. Under congestion, BFD packets can therefore be delayed or dropped like any other traffic in their queue.
For SSR BFD:
- Asynchronous BFD packets are transmitted at a configured interval (desired-tx-interval).
- Each peer expects to receive packets no more frequently than its own required-min-rx-interval.
- If a number of successive packets (multiplier) are not received within the negotiated interval, the peer path is declared down.
For Routing BFD:
- Asynchronous BFD packets are similarly sent and expected at configured intervals.
- If the configured multiplier of packets is missed, the session is declared down, and the associated routing adjacency (OSPF or BGP) goes down.
In both cases, persistent congestion that causes repeated BFD drops can bring down paths and adjacencies, even if underlying physical links remain up.
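The detection behavior described above can be modeled with a few lines of arithmetic. The following sketch is illustrative only (it is not SSR code, and the function names are invented for this example): a peer path goes down when no asynchronous packet has been received for longer than the expected interval times the multiplier.

```python
# Illustrative model of BFD's detection logic, not the SSR implementation.
# A session is declared down when no control packet arrives within
# interval * multiplier milliseconds.

def detection_time_ms(interval_ms: int, multiplier: int) -> int:
    """Time without received packets before the peer path is declared down."""
    return interval_ms * multiplier

def session_state(last_rx_ms: int, now_ms: int,
                  interval_ms: int, multiplier: int) -> str:
    """Return 'up' or 'down' based on how long ago the last packet arrived."""
    elapsed = now_ms - last_rx_ms
    return "down" if elapsed > detection_time_ms(interval_ms, multiplier) else "up"

# With a 1000 ms interval and a multiplier of 3, more than 3000 ms of
# silence brings the session down.
print(detection_time_ms(1000, 3))       # 3000
print(session_state(0, 2500, 1000, 3))  # up
print(session_state(0, 3500, 1000, 3))  # down
```

Note that a congested queue does not need to drop every packet to trigger this: delaying packets past the detection window has the same effect as loss.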
BFD Timers and Congestion Sensitivity
The following BFD parameters determine how sensitive BFD is to delayed or dropped packets on a congested interface. These parameters are configured under the bfd hierarchy associated with a router, neighborhood, or adjacency. The main parameters are:
- desired-tx-interval: Defines the interval, in milliseconds, at which the router transmits asynchronous BFD control packets.
- required-min-rx-interval: Defines the minimum interval, in milliseconds, at which the router is willing to receive BFD control packets.
- multiplier: Defines the number of consecutive asynchronous BFD packets that may be missed before declaring the peer path down.
- link-test-interval: Defines the interval, in seconds, between BFD echo mode tests.
- link-test-length: Defines the number of echo packets sent in each echo mode test.
- required-min-echo-interval: Defines the minimum interval between echo packets that the router is willing to accept.
BFD Negotiated Intervals
BFD intervals and multipliers are negotiated between peers. When defining these values, it can be helpful to know the Tx and Rx timers' current negotiated values as well as the current multiplier. Use the show peers bfd-interval command to display these values.
Multiplier – The peer’s configured multiplier: the number of missed asynchronous packets after which the local router deems its peer down.
Tx Timer – The value configured under desired-tx-interval; no negotiation is involved. One asynchronous packet is sent at the end of each Tx Timer.
Rx Timer – The local router expects to receive an asynchronous packet from the peer before the end of each timer. It is set to the maximum of the local required-min-rx-interval and the peer’s desired-tx-interval, and is updated after the first received BFD packet.
show peers bfd-interval
Retrieving peer paths...
========= ======== =================== ============= ======== ========== ========== ============
Peer Node Network Interface Destination Status Rx Timer Tx Timer Multiplier
========= ======== =================== ============= ======== ========== ========== ============
Berkley slice1 intf1 192.168.1.1 up 1.50s 3.00s 5
Berkley slice2 intf2 192.168.2.1 down - - -
Example Configuration
bfd
state enabled
desired-tx-interval 1000 # ms
required-min-rx-interval 500 # ms
required-min-echo-interval 1000 # ms
authentication-type sha256
multiplier 3
link-test-interval 10 # seconds
link-test-length 10 # echo packets per test
exit
These parameters are negotiated between two SSR peers. Each router advertises its desired-tx-interval and required-min-rx-interval. Once communication is established, each peer calculates when it should send and expect to receive BFD control packets based on the maximum of the local required-min-rx-interval and the peer’s desired-tx-interval. The multiplier is then applied to this effective interval to derive how long the router will wait for an asynchronous packet before considering the peer path down.
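The negotiation arithmetic can be sketched directly. This is an illustrative model with invented names, not SSR code: the effective receive interval is the maximum of the local required-min-rx-interval and the peer's desired-tx-interval, and the multiplier applied to that interval yields the down-detection threshold.

```python
# Illustrative model of the BFD interval negotiation described above.

def effective_rx_interval_ms(local_required_min_rx: int, peer_desired_tx: int) -> int:
    """Interval at which the local router expects asynchronous packets."""
    # The local router will not accept packets faster than its own minimum,
    # and the peer will not send faster than its own desired-tx-interval.
    return max(local_required_min_rx, peer_desired_tx)

def down_threshold_ms(local_required_min_rx: int, peer_desired_tx: int,
                      multiplier: int) -> int:
    """How long the router waits without packets before declaring the path down."""
    return effective_rx_interval_ms(local_required_min_rx, peer_desired_tx) * multiplier

# Matching the example configuration: local required-min-rx-interval 500 ms,
# peer desired-tx-interval 1000 ms, multiplier 3.
print(effective_rx_interval_ms(500, 1000))  # 1000
print(down_threshold_ms(500, 1000, 3))      # 3000
```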
On a congested interface, delayed or dropped BFD packets can cause the time between received packets to exceed this threshold. When that happens, the BFD state for the affected peer path transitions to “down,” and the SSR notifies consumers such as the load‑balancer that the path is no longer usable. If congestion is transient and packets resume before the threshold is reached, the session remains up but echo mode tests will show increased latency or jitter.
You can make BFD more tolerant of congestion by increasing the multiplier, or by using longer transmit and receive intervals. A configuration intended for lossy links might use a longer link-test-interval, a larger link-test-length, and a higher multiplier, so that multiple echo tests and several missing packets are required before a path is declared down. The trade‑off is that BFD will take longer to detect and react to true failures.
Damping and Hold‑Down: Controlling Flapping
BFD damping is designed to prevent BFD state from oscillating in environments where link quality fluctuates rapidly. Without damping, repeated short bursts of congestion could repeatedly drive peer paths up and down, leading to routing flaps and unnecessary session failovers. In some cases a peer path may be reported as flapping when the underlying cause is actually congestion; in these situations, proper tuning is important.
The SSR provides both a simple hold‑down timer and a dynamic damping mechanism. The hold-down-time parameter enforces a minimum interval before BFD change notifications are propagated. Dynamic damping, enabled with dynamic-damping and bounded by maximum-hold-down-time, automatically lengthens the effective hold‑down interval when frequent flapping is observed.
With a non‑zero hold-down-time, BFD detects and tracks state changes internally, but delays notifying the rest of the system for at least that duration. This ensures that very brief outages do not cause immediate failover. When dynamic-damping is enabled, BFD monitors the rate of up/down transitions. If it detects that a particular path is flapping frequently, it increases the hold‑down interval, up to the configured maximum-hold-down-time, to limit the rate at which changes are propagated.
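The following sketch models how dynamic damping lengthens the hold-down interval under repeated flapping. The exact backoff algorithm the SSR uses is not specified here; this example assumes a simple doubling strategy capped at maximum-hold-down-time, purely to illustrate the concept.

```python
# Illustrative model of dynamic damping (assumed doubling backoff, capped at
# maximum-hold-down-time; the SSR's actual algorithm may differ).

def next_hold_down_ms(current_hold_down: int, maximum_hold_down: int) -> int:
    """Lengthen the effective hold-down after another flap, up to the cap."""
    return min(current_hold_down * 2, maximum_hold_down)

hold = 1000      # initial hold-down-time, ms
maximum = 16000  # maximum-hold-down-time, ms
for flap in range(6):
    print(f"flap {flap}: hold-down {hold} ms")
    hold = next_hold_down_ms(hold, maximum)
# The hold-down grows 1000 -> 2000 -> 4000 -> 8000 -> 16000 and stays capped,
# so a persistently flapping path propagates fewer and fewer state changes.
```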
During congestion, damping and hold‑down help shield higher‑level components from transient BFD changes. A short burst of loss that briefly forces BFD to consider a path down may never be exposed to routing or load‑balancing logic if the path recovers before the hold‑down timer expires. This reduces routing churn and avoids unnecessary failovers that would not materially improve user experience.
You can configure damping parameters at the router, neighborhood, or adjacency level. This allows you to tune behavior for specific paths that are known to be noisy or to traverse congested environments, without globally reducing responsiveness.
Routing BFD Parameters and Congestion
Routing BFD uses the same fundamental concepts:
- The enable flag turns BFD on or off for an OSPF interface or BGP neighbor.
- required-min-rx-interval: Specifies the minimum receive interval in milliseconds.
- desired-tx-interval: Specifies the transmit interval in milliseconds.
- multiplier: Specifies the number of missed packets before the session is considered down.
Example Configuration
authority
router A
routing default-instance
ospf 2
area 0.0.0.0
interface T217_DUT2 inet20
node T217_DUT2
interface inet20
bfd
enable true
required-min-rx-interval 500
desired-tx-interval 500
multiplier 3
exit
routing-protocol bgp
neighbor 1.0.0.11
bfd
enable true
required-min-rx-interval 500
desired-tx-interval 500
multiplier 5
exit
exit
exit
exit
exit
exit
Routing BFD packets use standard IETF ports and are forwarded through the SSR data plane. When the SSR interface is congested, the packets are queued and potentially dropped alongside other traffic in their class. If enough BFD packets are missed, the session times out. When that happens, OSPF or BGP brings down the adjacency and reconverges, potentially selecting alternate paths.
You can tune Routing BFD in the same way as SSR BFD. Larger multipliers and intervals make BFD less sensitive to transient congestion at the cost of slower failure detection. Smaller values provide faster failure detection, but are more likely to trigger routing changes in response to short‑lived congestion.
Traffic Engineering, Queues, and BFD Priority
Traffic engineering (TE) and service policies determine how flows, including BFD, are placed into queues on a given interface. Out of the box, SSR BFD does not have a dedicated control‑plane queue. Instead, BFD packets are classified into services and mapped into traffic classes using the same mechanisms as user traffic.
The service that BFD matches is an internally-generated service that the SSR creates automatically when it establishes a peering relationship with another SSR. You will not find it in the configuration; it exists only in the forwarding plane and appears with surrounding braces (for example, {peer-name}) in the output of show sessions. Because no service-policy is attached to these auto-generated services, BFD traffic receives the system default treatment, which is effectively best‑effort.
To influence TE prioritization for BFD, create an explicit user-defined service that matches the BFD ports and associate it with a service-policy that references your desired TE class:
- For SSR BFD, match UDP destination port 1280.
- For Routing BFD (OSPF/BGP), match UDP destination ports 3784 (single‑hop), 4784 (multi‑hop), and 3785 (echo).
Because the SSR uses most-specific-match when selecting services, a user-defined service with a transport and port constraint will take precedence over the broader internally-generated service, allowing you to assign a service-policy with an explicit service-class and traffic-class. Set generated false on the service route if you want to ensure the conductor does not overwrite your configuration. For an example, see Marking BFD and Using DSCP Steering later in this document.
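The most-specific-match behavior can be sketched as follows. This is an illustrative model, not the SSR's actual lookup code, and the dictionary fields are invented for the example: a service that also constrains transport and port wins over a broader service that matches only the destination.

```python
# Illustrative model of most-specific-match service selection.

def specificity(service: dict) -> int:
    """More match criteria present -> more specific service."""
    return sum(1 for key in ("destination", "transport", "port") if key in service)

def matches(service: dict, pkt: dict) -> bool:
    """A service matches when every criterion it defines agrees with the packet."""
    return all(service.get(key, pkt[key]) == pkt[key]
               for key in ("destination", "transport", "port"))

services = [
    # Internally-generated peering service: destination only.
    {"name": "{peer-name}", "destination": "203.0.113.1"},
    # User-defined BFD service: destination, transport, and port.
    {"name": "bfd-ssr", "destination": "203.0.113.1", "transport": "udp", "port": 1280},
]

pkt = {"destination": "203.0.113.1", "transport": "udp", "port": 1280}
best = max((s for s in services if matches(s, pkt)), key=specificity)
print(best["name"])  # bfd-ssr
```

Both services match the BFD packet, but the user-defined service carries more match criteria, so its service-policy (and therefore its TE class) applies.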
The SSR treats BFD with relatively low priority. If BFD packets are given an extremely high priority, they may not experience the same congestion as normal traffic. Echo mode tests would then under‑report latency, jitter, and loss relative to what best‑effort traffic sees. Decisions based on those measurements—for example, SLA‑based path selection—could be misleading.
If you map BFD into a very high‑priority TE class, you will tend to preserve BFD sessions through significant congestion, but BFD’s metrics will become less representative of user experience. If you map BFD into a very low‑priority class, you may cause BFD to declare paths down quickly under load, sometimes before higher‑priority application traffic is severely affected.
In most deployments, the recommended approach is to align BFD’s priority with the class of traffic you care most about, or to keep it at a modest, non‑privileged priority, and rely on timers and damping to control sensitivity rather than extreme queue priority.
Behavior When Intermediate Devices Are Congested
When congestion occurs on devices between SSR routers, such as provider routers, firewalls, or tunnel endpoints, those devices usually classify and prioritize traffic based on DSCP. BFD packets, being ordinary IP/UDP packets, are subject to this behavior.
The SSR treats BFD with relatively low priority and assumes that BFD should experience the same class‑of‑service as typical traffic. This design keeps BFD’s view of path quality consistent with user experience.
Routing BFD packets inherit whatever DSCP marking is applied to traffic from the relevant interface, unless you specifically re‑mark them using QoS policies.
How Congested Transit Devices Affect BFD Sessions
On a congested transit device that uses DSCP‑based QoS, different classes are typically handled with different queueing and drop policies.
If BFD’s DSCP maps to a low or best‑effort class, BFD packets share queues with similarly marked traffic. When congestion occurs, these queues may experience increased delay and loss. BFD sessions then time out in a way that closely mirrors the experience of user flows in the same class. This is often desirable, because BFD’s view of path health remains aligned with what users are seeing.
If BFD’s DSCP maps to a high‑priority class, transit devices attempt to preserve BFD packets even under load. BFD sessions may remain up and report good latency and jitter while lower‑priority flows are experiencing significant delay or loss. This is appropriate if your primary goal is to keep control‑plane adjacencies up through congestion, but it means that BFD becomes less of a quality probe and more of a pure reachability check.
- If the priority is adjacency stability, assign BFD to a high‑priority DSCP and configure transit QoS to prefer it.
- If the priority is measuring effective application experience, keep BFD in or near the application’s QoS class, allowing it to see the same congestion.
In either case, coordinate DSCP values with any provider networks involved, so that the chosen markings map to the intended classes.
Example: BFD DSCP and SSR traffic-class mapping
The SSR ships with factory-default service-class objects that map DSCP values to a traffic-class. The following table shows representative values and their effect on BFD queuing:
| BFD dscp value | DSCP name | SSR factory service-class | SSR traffic-class | Typical transit queue behavior |
|---|---|---|---|---|
| 0 (default) | Best Effort (BE) | Standard | best-effort | Shares queues with bulk/default traffic; dropped first under congestion |
| 18 | AF21 | LowLatencyData | low | Lower-priority assured forwarding; modest protection |
| 26 | AF31 | MultimediaStreaming | medium | Medium-priority assured forwarding; better protection |
| 46 | EF (Expedited Forwarding) | Telephony | high | Strict-priority or near-priority queue; highest protection |
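The assured-forwarding values in the table follow standard DSCP arithmetic, which can help when picking or verifying a marking. This helper is generic DSCP math, not an SSR API: an AFxy codepoint encodes class x and drop precedence y as dscp = 8x + 2y, and the DSCP occupies the upper six bits of the IP TOS/Traffic Class byte.

```python
# Standard DSCP/AF arithmetic (RFC 2597), useful for sanity-checking markings.

def af_to_dscp(af_class: int, drop_precedence: int) -> int:
    """AFxy codepoint: dscp = 8 * class + 2 * drop precedence."""
    return 8 * af_class + 2 * drop_precedence

def dscp_to_tos_byte(dscp: int) -> int:
    """DSCP occupies the upper six bits of the IP TOS/Traffic Class byte."""
    return dscp << 2

print(af_to_dscp(2, 1))      # 18 -> AF21, as used in the examples below
print(af_to_dscp(3, 1))      # 26 -> AF31
print(dscp_to_tos_byte(46))  # 184 -> EF as a raw TOS byte
```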
BFD’s DSCP value is set using the dscp field under the bfd hierarchy at the router, neighborhood, or adjacency level. For example, to mark BFD packets with AF21 (decimal 18) on a specific adjacency:
authority
router branch1
node node1
device-interface wan
network-interface wan0
adjacency
ip-address 203.0.113.1
bfd
dscp 18
exit
exit
exit
exit
exit
exit
exit
To apply the same marking to all peers within a neighborhood (affecting all generated adjacencies on commit):
authority
router branch1
node node1
device-interface wan
network-interface wan0
neighborhood internet
bfd
dscp 18
exit
exit
exit
exit
exit
exit
exit
Changing the BFD dscp value controls the marking applied to outbound BFD packets, which influences how transit devices and the remote SSR peer queue them. To also change SSR‑local TE queuing for BFD, create a user‑defined service matching the BFD ports and assign a service-policy with the desired service-class, as described in Traffic Engineering, Queues, and BFD Priority.
Influencing BFD Behavior with DSCP and DSCP Steering
DSCP steering is most useful when traffic is encapsulated, such as inside IPsec or GTP tunnels. In those cases, traditional five‑tuple classification may not distinguish between different inner flows, but the DSCP field in the outer header remains visible.
DSCP steering is configured in two parts:
On the network-interface, enable dscp-steering and specify the transport protocol and port-range that identify the tunnel traffic (for example, UDP 4500 for IPsec NAT‑T, or ESP for native IPsec).
In the service configuration, define a parent service that matches the tunnel endpoint, and one or more child services with dscp-range values that capture the DSCP values you want to steer.
When a packet arrives, the SSR matches it to the parent service based on the tunnel IP and protocol, then uses its DSCP value to select the appropriate child service. Different child services can then have different service routes, TE policies, or access policies, allowing finer‑grained control inside a shared tunnel.
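The parent/child selection step can be sketched as follows. This is an illustrative model rather than the SSR data model (the range tuples and field names are assumptions): once the parent service has matched the tunnel, the packet's outer DSCP selects a child service, and a DSCP covered by no child leaves the packet without a service path.

```python
# Illustrative model of DSCP-based child service selection inside a tunnel.

children = [
    {"name": "bfd-priority.tunnel", "dscp_ranges": [(14, 14)]},  # single value
    {"name": "bulk.tunnel", "dscp_ranges": [(0, 7)]},            # a broader range
]

def select_child(dscp: int):
    """Return the child service whose dscp-range covers this DSCP, else None."""
    for child in children:
        if any(lo <= dscp <= hi for lo, hi in child["dscp_ranges"]):
            return child["name"]
    return None  # no child covers this DSCP -> "No ServicePaths"

print(select_child(14))  # bfd-priority.tunnel
print(select_child(0))   # bulk.tunnel
print(select_child(20))  # None
```

The `None` case corresponds to the coverage requirement discussed later: every DSCP value expected inside the tunnel must fall within some child's dscp-range.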
Marking BFD and Using DSCP Steering
To influence how congestion affects BFD traffic as it passes through tunnels or specific parts of the network, you can combine DSCP marking for BFD packets with DSCP steering.
First, decide what DSCP policy you want for BFD. If you want BFD to closely track best‑effort traffic, you can leave it with a default or low DSCP value. In that case, BFD packets share the same QoS class as typical traffic, and DSCP steering rules that apply to that DSCP range will apply to BFD as well. If you want BFD to have greater protection, you can mark it to a DSCP value associated with a higher‑priority class that is recognized by your provider or by your own transit routers.
Next, configure classification on the SSR to mark BFD packets. For SSR BFD, match UDP port 1280. For Routing BFD, match UDP ports 3784, 4784, and 3785. Using your existing QoS framework, apply the desired DSCP marking to those flows.
If BFD packets traverse an IPsec or GTP tunnel where DSCP steering is enabled, configure dscp-steering on the relevant network-interface and set up the parent and child services accordingly. For example:
network-interface wan0
name wan0
dscp-steering
enabled true
transport
protocol udp
port-range
start-port 4500
exit
exit
exit
exit
Parent tunnel service:
service tunnel
name tunnel
description "IPSec Tunnel"
scope public
security internal
address 5.5.5.100/0
access-policy red
source red
permission allow
exit
exit
Child service for BFD and related high‑priority traffic:
service bfd-priority.tunnel
name bfd-priority.tunnel
description "BFD and associated high-priority traffic within this tunnel"
scope public
security internal
dscp-range 14
start-value 14
exit
exit
If BFD is marked with DSCP 14, traffic will match bfd-priority.tunnel and can be associated with specific service routes or TE policies. This allows you to steer BFD over different paths or ensure that it shares the same path as a particular class of application traffic.
All DSCP values you expect to see within a given tunnel must be covered by the child services’ dscp-range entries. If a packet arrives with a DSCP value that is not included in any dscp-range, the SSR may be unable to find a valid service path and reports No ServicePaths for that traffic.
Finally, coordinate BFD’s DSCP values with your transit QoS policies so that those markings map to the intended classes in provider networks and on your own intermediate routers.
Design Trade‑Offs
When you use DSCP and DSCP steering to influence BFD behavior, you balance two competing goals: stability of control and fidelity of measurement.
If you assign BFD to a high‑priority DSCP and configure transit devices to highly prioritize that class, BFD sessions are more likely to remain stable during periods of congestion. Routing adjacencies and SSR peer paths that rely on BFD are less likely to flap. However, BFD echo‑mode measurements will no longer reflect the conditions experienced by lower‑priority application flows. You risk retaining a path that looks healthy to BFD but is problematic for user traffic.
If you keep BFD at a lower or default DSCP, or align its DSCP with that of the applications you care most about, BFD’s behavior under congestion will better reflect actual user experience. BFD may declare paths down when those applications degrade, triggering rerouting or failover. To prevent excessive flapping in this case, you rely on BFD damping and carefully tuned timers rather than on QoS priority.
In all designs, it is important that BFD’s DSCP markings and any DSCP steering rules are intentional and well understood. Unintended re‑marking or misaligned QoS classes can make BFD either too fragile or too insulated from the network conditions you are trying to observe.