Control Plane DDoS Protection Flow Detection Overview
Flow detection is an enhancement to control plane DDoS protection that supplements the DDoS policer hierarchies; it is part of a complete control plane DDoS protection solution. Flow detection uses a limited amount of hardware resources to monitor the arrival rate of host-bound flows of control traffic. Flow detection is much more scalable than a solution based on filter policers. Filter policers track all flows, which consumes a considerable amount of resources. In contrast, flow detection only tracks flows it identifies as suspicious, using far fewer resources to do so.
The flow detection application has two interrelated components: detection and tracking. Detection identifies flows suspected of being improper and subsequently controls them. Tracking monitors those flows to determine whether they are truly hostile and when they recover to within acceptable limits.
Flow Detection and Control
Flow detection is disabled by default. When you enable it at the [edit system ddos-protection global] hierarchy level, the application begins monitoring control traffic flows for almost all protocol groups and packet types whenever a control plane DDoS protection policer is violated. In addition to enabling flow detection globally, you can configure its operation mode for almost all protocol groups and packet types; that is, whether it is triggered automatically by the violation of a DDoS protection policer (the default) or is always on. You can override the global configuration settings for individual protocol groups and packet types. Except for event report rates, all characteristics of flow detection are configurable only at the level of individual packet types.
Enhanced Subscriber Management supports flow detection for control plane DDoS protection as of Junos OS Release 17.3R1.
You cannot enable flow detection globally for the following groups and packet type because they do not have typical Ethernet, IP, or IPv6 headers:
Protocol groups: fab-probe, frame-relay, inline-ka, isis, jfm, mlp, pfe-alive, pos, and services.
Packet type: unclassified in the ip-options protocol group.
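The exclusions above can be captured as a simple lookup. The following Python sketch is illustrative only: the function name and data structures are hypothetical and are not part of Junos OS; only the group and packet type names come from the list above.

```python
from typing import Optional

# Protocol groups that cannot have flow detection enabled globally (from the list above).
EXCLUDED_GROUPS = {
    "fab-probe", "frame-relay", "inline-ka", "isis", "jfm",
    "mlp", "pfe-alive", "pos", "services",
}

# Individual (group, packet type) pairs that are excluded.
EXCLUDED_PACKET_TYPES = {("ip-options", "unclassified")}


def supports_global_flow_detection(group: str, packet_type: Optional[str] = None) -> bool:
    """Return False for the groups and packet type listed as exceptions above."""
    if group in EXCLUDED_GROUPS:
        return False
    if packet_type is not None and (group, packet_type) in EXCLUDED_PACKET_TYPES:
        return False
    return True


print(supports_global_flow_detection("isis"))
print(supports_global_flow_detection("ip-options", "unclassified"))
```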
Control flows are aggregated at three levels. The subscriber level is the finest grained of the three and consists of flows for individual subscriber sessions. The logical interface level aggregates multiple subscriber flows, so it is coarser grained and does not provide discrimination into individual subscriber flows. The physical interface level aggregates multiple logical interface flows, so it provides the coarsest view of traffic flows.
You can turn flow detection off or on at any of these levels. You can also configure whether it is automatically triggered by the violation of a DDoS protection policer or is always on. Flow detection begins at the finest-grained level that has detection configured to on or automatic.
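As a rough illustration of how the starting level is chosen, the sketch below walks the three aggregation levels from finest to coarsest and returns the first one whose mode is on or automatic. The level names, mode strings, and function name are assumptions made for this example and are not Junos OS identifiers.

```python
from typing import Dict, Optional

# Aggregation levels ordered from finest to coarsest (names chosen for this sketch).
LEVELS = ("subscriber", "logical-interface", "physical-interface")


def first_active_level(modes: Dict[str, str]) -> Optional[str]:
    """Return the finest-grained level whose detection mode is 'on' or 'automatic'."""
    for level in LEVELS:
        if modes[level] in ("on", "automatic"):
            return level
    return None  # detection is off at every level


modes = {
    "subscriber": "off",
    "logical-interface": "automatic",
    "physical-interface": "on",
}
print(first_active_level(modes))  # logical-interface
```

With the subscriber level turned off, detection in this model starts at the logical interface level, even though the physical interface level is also active.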
When a flow arrives, flow detection checks whether the flow is already listed in a table of suspicious flows. A suspicious flow is one that exceeds the bandwidth allowed by default or by configuration. If the flow is not in the table and the flow detection mode for the aggregation level is on, then flow detection lists the flow in the table. If the flow is not in the table and the flow detection mode is automatic, flow detection checks whether this flow is suspicious.
If the flow is suspicious, then it goes in the flow table. If the flow is not suspicious, then it is processed the same way at the next coarser aggregation level that has flow detection set to on. If none of the coarser levels has detection on, then the flow continues to the DDoS protection packet policer for action, where it can be passed or dropped.
When the initial check finds the flow in the table, then the flow is dropped, policed, or kept, depending on the control mode setting for that aggregation level. All packets in dropped flows are dropped. In policed flows, packets are dropped until the flow is within the acceptable bandwidth for the aggregation level. Kept flows are passed along to the next aggregation level for processing.
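The per-level decision described in the preceding paragraphs can be sketched as a single function. This is a simplified model, not the actual implementation: the class, field, and return-value names are hypothetical, the bandwidth unit of packets per second is an assumption, and a level whose mode is off is assumed to be skipped before this function is called.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class LevelConfig:
    mode: str           # "on" or "automatic"
    control: str        # "drop", "police", or "keep"
    bandwidth_pps: int  # allowed flow bandwidth at this level (assumed unit)


def check_flow(flow_key: str, rate_pps: int, cfg: LevelConfig, table: Set[str]) -> str:
    """Decide what happens to one flow at one aggregation level."""
    # Flow already listed: the control mode for the level applies.
    if flow_key in table:
        return cfg.control
    # Mode "on": every unlisted flow is listed.
    if cfg.mode == "on":
        table.add(flow_key)
        return "listed"
    # Mode "automatic": list the flow only if it exceeds the allowed bandwidth.
    if rate_pps > cfg.bandwidth_pps:
        table.add(flow_key)
        return "listed"
    # Not suspicious: hand off to the next coarser level, or to the policer.
    return "next-level"


table: Set[str] = set()
cfg = LevelConfig(mode="automatic", control="drop", bandwidth_pps=10)
print(check_flow("sub-1", 5, cfg, table))   # next-level
print(check_flow("sub-1", 50, cfg, table))  # listed
print(check_flow("sub-1", 5, cfg, table))   # drop
```

A result of "keep" or "next-level" means the caller repeats the check at the next coarser level; when no coarser level remains, the flow goes to the packet policer.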
The flow detection application tracks flows that have been listed in the suspicious flow table. It periodically checks each entry in the table to determine whether the listed flow is still suspicious (that is, still violating the bandwidth). If a suspicious flow has continuously violated the bandwidth since it was inserted in the table for a period greater than the configurable flow detection period, then it is considered to be a culprit flow rather than merely suspicious. However, if the bandwidth has been violated for less than the detection period, the violation is treated as a false positive. In that case, flow detection considers the flow to be safe and stops tracking it by deleting it from the table.
You can enable a timeout feature that suppresses culprit flows for a configurable timeout period, during which the flow is kept in the flow table. (Suppression is the default behavior, but the flow detection action can be changed by the flow level control configuration.) If the check of listed flows finds one for which the timeout is enabled and the timeout period has expired, then the flow has timed out and it is removed from the flow table.
If the timeout has not yet expired or if the timeout feature is not enabled, then the application performs a recovery check. If the time since the flow last violated the bandwidth is longer than the configurable recovery period, the flow has recovered and is removed from the flow table. If the time since last violation is less than the recovery period, the flow is kept in the flow table.
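The tracking checks above amount to a small state machine evaluated on each periodic pass over the table. The sketch below is one way to model it under stated assumptions: all names are hypothetical, the caller tells the function whether the flow is violating at the time of the check, every lapse in violation is assumed to be observed by a check, and the timeout is assumed to be measured from the moment the flow is classified as a culprit (the text does not say where the timeout starts).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlowEntry:
    inserted_at: float                     # when the flow entered the table
    last_violation_at: float               # most recent bandwidth violation
    culprit_since: Optional[float] = None  # set once the flow becomes a culprit


def check_entry(e: FlowEntry, now: float, violating: bool,
                detect: float, recover: float,
                timeout: Optional[float] = None) -> str:
    """Return the state of a table entry; all times are in seconds."""
    if violating:
        e.last_violation_at = now
    if e.culprit_since is None:
        # Still merely suspicious.
        if not violating:
            return "false-positive"   # violated for less than the detection period: delete
        if now - e.inserted_at > detect:
            e.culprit_since = now     # assumption: timeout is measured from here
            return "culprit"
        return "suspicious"
    # Culprit flow: timeout check first (only if the timeout feature is enabled).
    if timeout is not None and now - e.culprit_since > timeout:
        return "timed-out"            # remove from table
    # Otherwise, recovery check.
    if now - e.last_violation_at > recover:
        return "recovered"            # remove from table
    return "culprit"                  # keep in table


entry = FlowEntry(inserted_at=0, last_violation_at=0)
print(check_entry(entry, now=1, violating=True, detect=3, recover=60))    # suspicious
print(check_entry(entry, now=4, violating=True, detect=3, recover=60))    # culprit
print(check_entry(entry, now=70, violating=False, detect=3, recover=60))  # recovered
```

The period values used here are illustrative inputs, not documented defaults.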
By default, flow detection automatically generates system logs for a variety of events that occur during flow detection. The logs are referred to as reports in the flow detection CLI. All protocol groups and packet types are covered by default, but you can disable automatic logging for individual packet types. You can also configure the rate at which reports are sent, but this applies globally to all packet types.
Each report belongs to one of the following two types:
Flow reports—These reports are generated by events associated with the identification and tracking of culprit flows. Each report includes identifying information for the flow that experienced the event. This information is used to accurately maintain the flow table; flows are deleted or retained in the table based on the information in the report. Table 1 describes the event that triggers each flow report.
Table 1: Triggering Event for Flow Detection Reports
A suspicious flow is detected.
The timeout period expires for a culprit flow. Flow detection stops suppressing (or monitoring) the flow.
A culprit flow returns to within the bandwidth limit.
A culprit flow is cleared manually with a clear command or automatically as the result of suspicious flow monitoring shifting to a different aggregation level.
Control flows are aggregated to a coarser level. This event happens when the flow table nears capacity or when the flow cannot be found at a particular flow level and the next coarser level has to be searched.
Control flows are deaggregated to a finer level. This event happens when the flow table is not very full or when flow control is effective and the total arrival rate for the flow at the policer for the packet type is below its bandwidth for a fixed, internal period.
Bandwidth violation reports—These reports are generated by events associated with the discovery of suspicious flows. Each report includes identifying information for the flow that experienced the event. This information is used to track the suspicious flow and identify flows that are placed in the flow table. Table 2 describes the event that triggers each violation report.
Table 2: Triggering Event for Bandwidth Violation Reports
The incoming traffic for a control protocol exceeded the configured bandwidth.
The incoming traffic for a violated control protocol returned to normal.
A report is sent only when triggered by an event; that is, there are no null or empty reports. Because the reports are made periodically, the only events of interest are ones that occur during the interval since the last report.
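The reporting behavior can be pictured as a buffer that is flushed once per reporting interval and produces nothing when no events occurred. This is a minimal sketch of that idea only; the class and method names are hypothetical, and the event strings are free-form placeholders rather than actual report names.

```python
from typing import List, Optional


class Reporter:
    """Collects events and emits one report per interval, never an empty one."""

    def __init__(self) -> None:
        self.pending: List[str] = []

    def record(self, event: str) -> None:
        # Called whenever a flow or bandwidth violation event occurs.
        self.pending.append(event)

    def flush(self) -> Optional[List[str]]:
        # Called once per reporting interval.
        if not self.pending:
            return None  # no events since the last report: send nothing
        report, self.pending = self.pending, []
        return report


r = Reporter()
print(r.flush())  # None
r.record("suspicious flow detected")
print(r.flush())  # ['suspicious flow detected']
print(r.flush())  # None
```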