
    Configuring Threshold Crossing Alerts

    Threshold Editor Overview

    Threshold alarms can be used to monitor the network against any number of user-defined SLAs or other production and performance requirements. When these SLAs or other requirements are breached, you are automatically notified by the event server, either through viewing the Event Browser or by receiving preconfigured notification e-mails.

    Threshold alarms can be triggered by periodic collections from the Traffic Collection Manager, or by the Task Manager tasks Device SNMP Collection, Device Ping Collection, and Device SLA Collection. The Data Gateway Server (DGS) examines incoming data against all applicable threshold alarm rules. If any data matches a rule, the DGS posts an event to the event server with the parameters specified in the threshold alarm. In the Threshold Editor, these rules are referred to as production rules. The DGS processes traffic data from the data collector, and the DGS log contains detailed information about the data objects and messages from the data collector. The detail level of the log is controlled by the dgs.log.properties file in /u/wandl/db/config.

    Note: In IP/MPLSView Release 6.3.0, Data Collector is renamed Traffic Data Collector.

    To open the threshold editor, from the Live Network select Fault > Event Options > Edit Threshold Alarms. Figure 1 shows the Threshold Editor window.

    Figure 1: Threshold Editor

    When the threshold editor is opened for the first time, the tree in the left pane is collapsed, hiding all production rules. Double-click an item or click the plus sign (+) to the left of the item to display the elements beneath it. The hierarchy consists of the element type, followed by the group (scope), and then the actual production rules.

    Interpreting the Threshold Editor

    At the top level is the element type to which the rule applies: Interface, Node, Tunnel, CPUStats, LSPPingStats, LatencyStats, PingStats, or SLAStats.

    • Interface: Rules can be defined in this section for interface-related properties such as bandwidth and ingress and egress utilizations.
    • Node: Rules can be defined in this section for node-related properties such as system up time, last up time, AAA, accounting, authentication, and sessions. These additional properties for AAA and sessions are related to wireless collection data and may or may not apply to all device types.
    • Tunnel: Rules can be defined in this section for LSP tunnel-related properties such as the delta in the ingress bytes.
    • CPUStats: Rules can be defined in this section for CPU and memory stats such as CPU temperature, CPU utilization, memory used, total memory, and memory utilization.
    • LSPPingStats: Rules can be defined in this section for LSP ping stats on average, max, min, and standard deviation values.
    • LatencyStats: Rules can be defined in this section for latency stats on average, max, min, and standard deviation values.
    • PingStats: Rules can be defined in this section for ping stats on average, max, min, and loss percentage values.
    • SLAStats: Rules can be defined in this section for SLA stats such as jitter, packet loss, packet timeout, and latency.

    Following the element type, the next level is the scope, which defines the group of elements to which the threshold rules are applied. An include condition filters for only those elements that match user-specified criteria. An exclude condition can additionally be specified to remove elements that match other user-specified criteria. If no fields are specified for the scope, the rules of this scope are applied to all elements of the given type. For example, a scope can be created underneath the Interface element type that only considers Fast Ethernet interfaces. Figure 2 shows the threshold editor scope window.
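The include and exclude logic described above can be sketched as follows. This is only an illustration, not the DGS implementation: the function names are hypothetical, conditions are modeled as plain Python callables, and the `~=` operator is assumed to behave like a regular-expression search.

```python
import re

def in_scope(element, include=None, exclude=None):
    """Return True if the element passes the scope's filters.

    include and exclude are optional predicates; None means the
    condition was left blank, so no filtering is applied for it.
    """
    if include is not None and not include(element):
        return False
    if exclude is not None and exclude(element):
        return False
    return True

# Hypothetical stand-in for the condition "name ~= fe"
# (assumption: ~= is a regular-expression search on the attribute).
def is_fast_ethernet(element):
    return re.search(r"fe", element["name"]) is not None

interfaces = [{"name": "fe-0/0/1"}, {"name": "so-1/0/0"}]
selected = [i["name"] for i in interfaces if in_scope(i, include=is_fast_ethernet)]
```

With both conditions left blank, `in_scope` accepts every element, which mirrors the statement that an empty scope applies to all elements of the given type.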

    Figure 2: Threshold Editor Scope

    Under the scope are the threshold rules themselves. For each rule, specify the production name, the rule expression, a severity level, and a description. For example, a rule can be created to generate a threshold event when the interface utilization exceeds a particular percentage. Figure 3 shows an example threshold rule.

    Figure 3: Example Threshold Rule

    Creating Threshold Crossing Alerts

    To create a new threshold crossing alert:

    1. For the desired element type, create a scope identifying a subgroup of elements in which to place the rule. The scope can be used, for example, to filter only Fast Ethernet interfaces, or events at a particular node. See “Creating a New Scope.”
    2. Create the rule itself. See “Creating a New Rule.”

    Creating a New Scope


    To create a new scope, first select the upper-level tree item under which the group will be created. Then either click the Create button in the top toolbar, or right-click the selected item and select Create.

    This creates a new group under that item. Select the new group and fill in its fields in the right pane. To enter text into a field, first double-click the field to enable editing.

    • Scope Name (Required)—Describes the scope of the rules contained within the group. Do not include any spaces in the name. Optionally, enter a description of the scope in the Description field.
    • Include and Exclude Conditions—Preliminary filters for all rules within the group. Only data matching these conditions will be considered by the rules within the group. For example, you could set “name ~= fe” in the Include condition for an Interface scope to only consider Fast Ethernet interfaces. To edit these conditions, right-click at the beginning of the field to open the Condition and Rule Builder. For more information on how to define conditions, see “Defining Conditions and Rules.” If you do not require any filtering, leave these fields blank.
    • Is Active—Activate or deactivate the scope and the production rules underneath it. Only if both the scope and production rule are activated will the threshold event be generated.
    • Production Count—Number of rules within the group.

    Creating a New Rule


    To create a new rule underneath a scope, first select the scope under which the new rule will be created. Then either click the Create New Production Rule button in the top toolbar, or right-click the selected item and select Create. This will create a new rule under the selected group.

    • Production Name—(Required) Describes the threshold rule. Do not include any spaces in the name.
    • Production Rule—(Required) Defines the threshold crossing alert. If incoming data matches this rule, it will trigger the threshold event. Right-click at the beginning of the field to open the Condition and Rule Builder. An example rule for a production rule underneath the Interface scope is ingressUtil > 75 || egressUtil > 75. For more information about how to define conditions, see “Defining Conditions and Rules.”
    • Is Active—Activates or deactivates the production rule. Only if both the scope containing the production rule and the production rule are activated will the threshold event be generated.
    • Event Type—Type of event triggered by this rule, which is displayed in the Event Browser when the threshold crossing alert is created. The default is ThresholdEvent and does not need to be changed. However, it can be helpful to mark events with more descriptive event types, such as ThresholdUtilizationEvent and ThresholdMemoryEvent.
    • Severity—Configures the severity of the event. This severity can later be displayed in the Event Browser when the Threshold Event is triggered.

      The selection is used to

    • Source ID—Displays as the source of the event triggered by this rule. This field corresponds to the Source ID field in the Event Browser.
    • Description Template—Describes the event triggered by this threshold rule. This is the primary means of specifying threshold event details in the Event Browser. The template allows for specifying keys and dynamic values by enclosing them within square brackets []. For a list of available suggestions while typing in the Description Template field, right-click at the beginning of the field. For example, for a rule that triggers an event when ingress utilization or egress utilization exceeds 75 percent, the following template may be used:
      [deviceID]: [name]: ingress util [ingressUtil] or egress util [egressUtil] greater than 75%
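To illustrate how such a template might expand, here is a minimal sketch of bracket substitution. The function name, the sample values, and the choice to leave unknown keys unchanged are all assumptions made for this example. The source does not describe how the product renders templates.

```python
import re

def render_description(template, values):
    """Replace each [key] with values[key]; unknown keys stay as written."""
    def substitute(match):
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)
    return re.sub(r"\[(\w+)\]", substitute, template)

template = ("[deviceID]: [name]: ingress util [ingressUtil] "
            "or egress util [egressUtil] greater than 75%")
# Hypothetical sample data for one interface.
sample = {"deviceID": "NWK", "name": "fe-0/0/1",
          "ingressUtil": 81.5, "egressUtil": 40.2}
message = render_description(template, sample)
```

For the sample above, `message` reads "NWK: fe-0/0/1: ingress util 81.5 or egress util 40.2 greater than 75%".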

    Triggering Threshold Alarms

    Note that to trigger the threshold alarm, the corresponding collection (using the Task Manager or Traffic Collection Manager) should be scheduled on a recurring basis. For more information about scheduling the following tasks using Task Manager, see Task Manager.

    • For CPUStats, see Device SNMP Collection.
    • For LSPPingStats, see LSP Ping Collection.
    • For LatencyStats, see Link Latency Collection.
    • For PingStats, see Device Ping Collection.
    • For SLAStats, see Device SLA Collection.

    Defining Conditions and Rules

    In the Condition and Rule Builder, select the desired key(s) in the Attribute column. Click the column header values to edit the logical operators and properties. An optional Consecutive Occurrences field allows you to specify the number of consecutive occurrences before the rule is triggered. Click OK to build the rule syntax. Figure 4 shows an example for building threshold conditions and rules.

    Figure 4: Threshold Conditions and Rules Builder

    Alternatively, the Include and Exclude condition or Production rule syntax can be typed into the field instead of using the Condition and Rule Builder. Group conditions and production rules must be entered in the form of logical expressions with a predefined set of keys. For example, the following condition matches when either ingress utilization or egress utilization is greater than or equal to 75 percent: “ingressUtil >= 75 || egressUtil >= 75”.

    • For a list of available keys while editing the condition or rule field, right-click for a list of suggestions, or review the Available Keys listed below. This list may be different for different types of elements. If unsure of where to start, right-click at the beginning of a field to see all possible keys. Remember that the field must first be activated for editing by double-clicking the field.
    • The following logical operators are supported: == (equals), != (does not equal), ~= (equals using regular expression), && (and), || (or), < (less than), > (greater than), <= (less than or equal), and >= (greater than or equal).
    • Note that all conditions and rules are case sensitive, and spaces should be used as delimiters between keywords, values, and logical operators. Additionally, quotes ("") should be placed around string values, for example, IPAddress == "1.2.3.4".
    • If an integer value is specified for the utilization, the traffic utilization is compared as an integer. To compare using floating-point numbers, specify the number as a floating-point number. For example, use “ingressUtil > 75.0” instead of “ingressUtil > 75”.
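The practical difference between integer and floating-point thresholds can be seen in a small sketch. It assumes that an integer comparison truncates the measured value. The source only says the values are "compared as integers", so the exact rounding behavior in the product may differ. The function name is hypothetical.

```python
def exceeds(measured, threshold_literal):
    """Compare a measured value with a threshold written as text in a rule."""
    if "." in threshold_literal:
        # Floating-point threshold: compare the measured value as-is.
        return measured > float(threshold_literal)
    # Assumption: an integer threshold truncates the measured value first.
    return int(measured) > int(threshold_literal)

# A utilization of 75.4 percent against "> 75" and "> 75.0".
as_integer = exceeds(75.4, "75")
as_float = exceeds(75.4, "75.0")
```

Under this assumption a measured utilization of 75.4 does not satisfy "ingressUtil > 75" but does satisfy "ingressUtil > 75.0", which is why the floating-point form is the safer choice near a boundary.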

    Consecutive Occurrences


    The special operator “&=” is used to test for consecutive occurrences of a condition. For example, to test that the ingress or egress utilization has been greater than 75 percent for 3 times in a row, you could use the following expression: (ingressUtil >= 75 || egressUtil >= 75) &= 3
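The behavior of a consecutive-occurrence test can be sketched with a simple streak counter. The class name and the sample data are hypothetical, and it is an assumption that a single non-matching sample resets the count. The source does not say whether the event repeats on later matching samples.

```python
class ConsecutiveTrigger:
    """Report True once a condition has held for `required` samples in a row."""

    def __init__(self, required):
        self.required = required
        self.streak = 0

    def update(self, condition_met):
        # Assumption: one miss resets the streak to zero.
        self.streak = self.streak + 1 if condition_met else 0
        return self.streak >= self.required

def condition(sample):
    # Python equivalent of: ingressUtil >= 75 || egressUtil >= 75
    return sample["ingressUtil"] >= 75 or sample["egressUtil"] >= 75

trigger = ConsecutiveTrigger(required=3)
samples = [
    {"ingressUtil": 80, "egressUtil": 10},  # match, streak 1
    {"ingressUtil": 20, "egressUtil": 76},  # match, streak 2
    {"ingressUtil": 10, "egressUtil": 10},  # miss, streak resets
    {"ingressUtil": 90, "egressUtil": 10},  # match, streak 1
    {"ingressUtil": 91, "egressUtil": 10},  # match, streak 2
    {"ingressUtil": 92, "egressUtil": 10},  # match, streak 3
]
results = [trigger.update(condition(s)) for s in samples]
```

Only the last sample completes three matches in a row, so only the final entry of `results` is True.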


    Available Keys


    Below is a list of the attributes available for each element type.

    Note that utilization values are specified in percentages (for example, specify 30 for 30 percent).

    See “Defining Conditions and Rules” for the syntax involving brackets and units.


    Common Attributes


    • deviceID: The hostname of the device associated with the element. For the Node element type, this is the same as the name. For the Interface element type, this is the node that contains the interface. For the Tunnel element type, this is the head-end of the tunnel.
    • name: The element’s name. For the Node element type, this is the hostname. For the Interface element type, this is the interface name. For the Tunnel element type, this is the tunnel’s name.
    • type: The element type (Node, Interface, Tunnel).
    • IPAddress: The IP address for the element.

    Interface Attributes


    • bandwidth: The interface bandwidth. The suffixes g, m, and k are permitted to indicate the units, for example, 100m for 100 Mbps.
    • ingressBytesDelta, egressBytesDelta: The interface ingress/egress traffic in bytes per second.
    • ingressUtil, egressUtil: The interface ingress/egress utilization. Specify the value as a percentage, for example, 30 for 30 percent.
    • ingressErrorDelta, egressErrorDelta: The number of inbound/outbound packets that contained errors per second.
    • ingressDiscardDelta, egressDiscardDelta: The number of inbound/outbound packets that are discarded per second.
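The relationship between the bandwidth units, the byte rates, and the utilization percentage can be sketched as follows. The helper names are hypothetical, decimal multipliers (k = 1,000) are assumed, and the utilization formula is the conventional one (bits per second over link capacity), not something the source spells out.

```python
UNIT_MULTIPLIERS = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}

def parse_bandwidth(text):
    """Convert a value such as '100m' to bits per second (decimal units assumed)."""
    text = text.strip().lower()
    if text and text[-1] in UNIT_MULTIPLIERS:
        return float(text[:-1]) * UNIT_MULTIPLIERS[text[-1]]
    return float(text)

def utilization_percent(bytes_per_second, bandwidth_bps):
    """Utilization as a percentage: bits per second divided by link capacity."""
    return bytes_per_second * 8 / bandwidth_bps * 100

fast_ethernet = parse_bandwidth("100m")
# 9,375,000 bytes per second on a 100 Mbps link.
util = utilization_percent(9_375_000, fast_ethernet)
```

Here `util` is 75.0, the value that a rule such as "ingressUtil >= 75" would just match.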

    Node Attributes


    • nodeType: Hardware type (for example, M5 for Juniper M5, CISCO) used for SLA status data.
    • sysUptime, lastUptime: Unit is in hundredths of a second.
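Because these values are in hundredths of a second, thresholds expressed in minutes or days need converting before they go into a rule. A minimal sketch, with hypothetical helper names and a hypothetical example rule:

```python
from datetime import timedelta

TICKS_PER_SECOND = 100  # sysUptime and lastUptime use hundredths of a second

def to_ticks(duration):
    """Convert a timedelta to hundredths of a second."""
    return int(duration.total_seconds() * TICKS_PER_SECOND)

def from_ticks(ticks):
    """Convert hundredths of a second back to a timedelta."""
    return timedelta(seconds=ticks / TICKS_PER_SECOND)

# Hypothetical rule: alert when a node has been up for less than 10 minutes.
ten_minutes = to_ticks(timedelta(minutes=10))
rule = f"sysUptime < {ten_minutes}"
```

The resulting rule text is "sysUptime < 60000", and `from_ticks(8640000)` corresponds to exactly one day.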

    Tunnel Attributes


    • ingressBytesDelta: The tunnel traffic in bytes per second.

    CPU Stats Attributes


    • cpuTemp: CPU temperature.
    • cpuUtil: CPU utilization.
    • memTotal: Total memory.
    • memUsed: Used memory.
    • memUtil: Memory utilization.

    LSP Ping Stats Attributes


    • lsppingAvg: Average LSP ping value.
    • lsppingMax: Maximum LSP ping value.
    • lsppingMin: Minimum LSP ping value.
    • lsppingSD: Standard deviation LSP ping value.

    Latency Stats Attributes


    • latencyAvg: Average latency value.
    • latencyMax: Maximum latency value.
    • latencyMin: Minimum latency value.
    • latencySD: Standard deviation latency value.

    Ping Stats Attributes


    • pingAvg: Average ping value.
    • pingMax: Maximum ping value.
    • pingMin: Minimum ping value.
    • pingLossPercent: Ping loss percentage.

    SLA Stats Attributes


    • slaDNSError, slaDNSRoundTrip, slaTimeOut
    • slaEgressLatencyAvg, slaEgressLatencyMax, slaEgressLatencyMin
    • slaEgressNegJitterAvg, slaEgressNegJitterMax, slaEgressNegJitterMin
    • slaEgressPacketLoss
    • slaEgressPosJitterAvg, slaEgressPosJitterMax, slaEgressPosJitterMin
    • slaEgressRoundTripAvg, slaEgressRoundTripMax, slaEgressRoundTripMin
    • slaHTTPTransactionError, slaHTTPTransactionRoundTrip, slaHTTPTransactionTimeOut, slaHTTPTransactionTimeToFirstByte
    • slaIngressLatencyAvg, slaIngressLatencyMax, slaIngressLatencyMin
    • slaIngressNegJitterAvg, slaIngressNegJitterMax, slaIngressNegJitterMin
    • slaIngressPacketLoss
    • slaIngressPosJitterAvg, slaIngressPosJitterMax, slaIngressPosJitterMin
    • slaIngressRoundTripAvg, slaIngressRoundTripMax, slaIngressRoundTripMin
    • slaPacketOutofSequence, slaPacketTimeout
    • slaRoundTripAvg, slaRoundTripMax, slaRoundTripMin
    • slaTCPConnectionError, slaTCPConnectionRoundTrip, slaTCPConnectionTimeOut
    • slaUnknownPacketLoss

    Table 1: Additional Examples

    Element Type: Interface
    Scope: Exclude condition: name ~= fe || name ~= ge || name ~= Ethernet
    Production Rule: ingressUtil > 50.0 || egressUtil > 50.0
    Explanation: Generates an alarm if non-Ethernet links have utilization over 50 percent.

    Element Type: CPUStats
    Scope: Include condition: deviceID == "NWK"
    Production Rule: cpuUtil > 90
    Explanation: Generates an alarm if CPU utilization on router NWK exceeds 90 percent.

    Element Type: Tunnel
    Scope: (none)
    Production Rule: ingressBytesDelta > 8000
    Explanation: Generates an alarm if traffic is over 8 KBps (64 Kbps).
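The byte-to-bit conversion in the Tunnel example can be checked with a short calculation. The helper name is hypothetical, and decimal units (1 Kbps = 1,000 bits per second) are assumed, which matches the 8 KBps = 64 Kbps figure given in the table.

```python
def kbps_to_bytes_per_second(kbps):
    """Convert kilobits per second to bytes per second (decimal units)."""
    return kbps * 1000 / 8

# Threshold for a tunnel carrying more than 64 Kbps.
threshold = int(kbps_to_bytes_per_second(64))
rule = f"ingressBytesDelta > {threshold}"
```

This reproduces the rule "ingressBytesDelta > 8000" from the table, since ingressBytesDelta is measured in bytes per second.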

    Modified: 2017-04-02