Understanding RMON for Monitoring Service Quality

 

Health and performance monitoring can benefit from remote monitoring of SNMP variables by the local SNMP agent running on each router. The agent compares MIB values against predefined thresholds and generates exception alarms without being polled by a central SNMP management platform. This is an effective mechanism for proactive management, provided that the thresholds are derived from measured baselines and set correctly. For more information, see RFC 2819, Remote Network Monitoring MIB.

Setting Thresholds

By setting a rising and a falling threshold for a monitored variable, you can be alerted whenever the value of the variable moves outside the allowable operational range. (See Figure 1.)

Figure 1: Setting Thresholds

Events are generated only when a threshold is first crossed in a given direction, not after every sample period. For example, once a rising threshold crossing event has been raised, no further rising events are generated until the corresponding falling event has occurred. This considerably reduces the number of alarms that the system produces, making it easier for operations staff to react when alarms do occur.
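The suppression behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the router's implementation: the ThresholdMonitor class, its field names, and the sample values are hypothetical, and the sketch assumes that either a rising or a falling event may fire on the first qualifying sample.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdMonitor:
    """Hypothetical model of one monitored variable with two thresholds."""
    rising: float
    falling: float
    rising_armed: bool = True   # a rising event may still be generated
    falling_armed: bool = True  # a falling event may still be generated

    def sample(self, value: float) -> Optional[str]:
        """Return "rising", "falling", or None for one sampled value."""
        if value >= self.rising and self.rising_armed:
            # Disarm rising events until a falling event re-arms them.
            self.rising_armed = False
            self.falling_armed = True
            return "rising"
        if value <= self.falling and self.falling_armed:
            # Disarm falling events until a rising event re-arms them.
            self.falling_armed = False
            self.rising_armed = True
            return "falling"
        return None


monitor = ThresholdMonitor(rising=80, falling=40)
samples = [50, 85, 90, 70, 88, 30, 20, 95]
events = [monitor.sample(v) for v in samples]
```

In this run only three of the eight samples produce an event (85, 30, and 95). The samples 90 and 88 are above the rising threshold but are suppressed because no falling event has occurred since the first rising event.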

To configure remote monitoring, specify the following pieces of information:

  • The variable to be monitored (by its SNMP object identifier)

  • The length of time between each inspection

  • A rising threshold

  • A falling threshold

  • A rising event

  • A falling event

Before you can successfully configure remote monitoring, identify which variables need to be monitored and the allowable operational range of each. Determining these ranges requires a period of baselining. An initial baseline period of at least three months is not unusual when first identifying the operational ranges and defining thresholds, but baseline monitoring should continue over the life span of each monitored variable.
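As a rough illustration of turning baseline data into thresholds, the following Python sketch picks both thresholds from percentiles of the collected samples. The percentile approach, the default values of 95 and 75, and the function name thresholds_from_baseline are assumptions made for this example; this document does not prescribe a particular method.

```python
import statistics
from typing import Sequence, Tuple


def thresholds_from_baseline(
    samples: Sequence[float],
    rising_pct: int = 95,   # assumed percentile for the rising threshold
    falling_pct: int = 75,  # assumed percentile for the falling threshold
) -> Tuple[float, float]:
    """Return (rising, falling) thresholds taken from baseline percentiles."""
    if len(samples) < 2:
        raise ValueError("need at least two baseline samples")
    if not 0 < falling_pct < rising_pct < 100:
        raise ValueError("expected 0 < falling_pct < rising_pct < 100")
    # quantiles(n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return cuts[rising_pct - 1], cuts[falling_pct - 1]


# Hypothetical baseline: utilization samples from 1 to 100 percent.
baseline = [float(v) for v in range(1, 101)]
rising, falling = thresholds_from_baseline(baseline)
```

Keeping the falling threshold well below the rising threshold leaves a gap between the two, so that a value hovering near one threshold does not generate alternating events.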

RMON Command-Line Interface

Junos OS provides two mechanisms that you can use to control the remote monitoring agent on the router: the command-line interface (CLI) and SNMP. To configure an RMON entry using the CLI, include the following statements at the [edit snmp] hierarchy level:

If you do not have CLI access, you can configure remote monitoring from an SNMP manager or management application, provided that SNMP access has been granted. (See Table 1.) To configure RMON using SNMP, perform SNMP Set requests to the RMON event and alarm tables.

RMON Event Table

Set up an event for each type that you want to generate. For example, you could have two generic events, rising and falling, or many different events for each variable that is being monitored (for example, temperature rising event, temperature falling event, firewall hit event, interface utilization event, and so on). Once the events have been configured, you do not need to update them.

Table 1: RMON Event Table

Field             Description
eventDescription  Text description of this event
eventType         Type of event (for example, log, trap, or log and trap)
eventCommunity    Trap group to which to send this event (as defined in the Junos OS configuration, which is not the same as the community)
eventOwner        Entity (for example, manager) that created this event
eventStatus       Status of this row (for example, valid, invalid, or createRequest)
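The following Python sketch shows one way to assemble the OID and value pairs for creating an event row through SNMP Set requests. The column numbers and enumeration values are taken from the eventTable and EntryStatus definitions in RFC 2819. The function name event_row_steps, the trap group name noc-traps, and the owner string are hypothetical, and the sketch only builds the data: sending it is left to whichever SNMP tool you use. Splitting the work into three steps is an assumption; this document does not state whether the agent accepts all columns in a single request.

```python
from typing import List, Tuple, Union

Varbind = Tuple[str, Union[int, str]]

EVENT_ENTRY = "1.3.6.1.2.1.16.9.1.1"  # eventEntry in RFC 2819
EVENT_COLUMNS = {
    "eventDescription": 2,
    "eventType": 3,
    "eventCommunity": 4,
    "eventOwner": 6,
    "eventStatus": 7,
}
EVENT_TYPE = {"none": 1, "log": 2, "snmptrap": 3, "logandtrap": 4}
ENTRY_STATUS = {"valid": 1, "createRequest": 2, "underCreation": 3, "invalid": 4}


def event_row_steps(
    index: int, description: str, event_type: str, trap_group: str, owner: str
) -> List[List[Varbind]]:
    """Return three groups of (OID, value) pairs: create, fill in, activate."""

    def oid(column: str) -> str:
        return f"{EVENT_ENTRY}.{EVENT_COLUMNS[column]}.{index}"

    return [
        [(oid("eventStatus"), ENTRY_STATUS["createRequest"])],
        [
            (oid("eventDescription"), description),
            (oid("eventType"), EVENT_TYPE[event_type]),
            (oid("eventCommunity"), trap_group),
            (oid("eventOwner"), owner),
        ],
        [(oid("eventStatus"), ENTRY_STATUS["valid"])],
    ]


# Hypothetical generic rising event stored in row 1 of the event table.
steps = event_row_steps(1, "generic rising event", "logandtrap", "noc-traps", "noc")
```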

RMON Alarm Table

The RMON alarm table stores the SNMP object identifiers (including their instances) of the variables that are being monitored, together with any rising and falling thresholds and their corresponding event indexes. To create an RMON request, specify the fields shown in Table 2.

Table 2: RMON Alarm Table

Field                   Description
alarmStatus             Status of this row (for example, valid, invalid, or createRequest)
alarmInterval           Sampling period (in seconds) of the monitored variable
alarmVariable           OID (and instance) of the variable to be monitored
alarmValue              Actual value of the sampled variable
alarmSampleType         Sample type (absolute or delta changes)
alarmStartupAlarm       Initial alarm (rising, falling, or either)
alarmRisingThreshold    Rising threshold against which to compare the value
alarmFallingThreshold   Falling threshold against which to compare the value
alarmRisingEventIndex   Index (row) of the rising event in the event table
alarmFallingEventIndex  Index (row) of the falling event in the event table

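Both the alarmStatus and eventStatus fields use the EntryStatus textual convention, which is defined in RFC 2819 itself. (RFC 2579, Textual Conventions for SMIv2, defines the newer RowStatus convention rather than EntryStatus.)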
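To make the alarm table fields more concrete, the following Python sketch builds the OID and value pairs for the writable columns of one alarm row. The column numbers and enumeration values come from the alarmTable definition in RFC 2819. The function name alarm_varbinds, the monitored variable (ifInOctets for interface index 1), and the threshold values are hypothetical. alarmValue is omitted because it is read-only, and alarmStatus is omitted because the row status is set separately.

```python
from typing import List, Tuple, Union

Varbind = Tuple[str, Union[int, str]]

ALARM_ENTRY = "1.3.6.1.2.1.16.3.1.1"  # alarmEntry in RFC 2819
ALARM_COLUMNS = {
    "alarmInterval": 2,
    "alarmVariable": 3,
    "alarmSampleType": 4,
    "alarmStartupAlarm": 6,
    "alarmRisingThreshold": 7,
    "alarmFallingThreshold": 8,
    "alarmRisingEventIndex": 9,
    "alarmFallingEventIndex": 10,
}
SAMPLE_TYPE = {"absolute": 1, "delta": 2}
STARTUP_ALARM = {"rising": 1, "falling": 2, "either": 3}


def alarm_varbinds(
    index: int, variable: str, interval: int, sample_type: str, startup: str,
    rising: int, falling: int, rising_event: int, falling_event: int,
) -> List[Varbind]:
    """Return (OID, value) pairs for the writable columns of one alarm row."""
    if interval <= 0:
        raise ValueError("interval must be a positive number of seconds")
    if falling > rising:
        raise ValueError("falling threshold must not exceed rising threshold")
    values = {
        "alarmInterval": interval,
        "alarmVariable": variable,
        "alarmSampleType": SAMPLE_TYPE[sample_type],
        "alarmStartupAlarm": STARTUP_ALARM[startup],
        "alarmRisingThreshold": rising,
        "alarmFallingThreshold": falling,
        "alarmRisingEventIndex": rising_event,
        "alarmFallingEventIndex": falling_event,
    }
    return [(f"{ALARM_ENTRY}.{ALARM_COLUMNS[name]}.{index}", value)
            for name, value in values.items()]


# Hypothetical alarm: byte delta on ifInOctets.1, sampled every 60 seconds.
varbinds = alarm_varbinds(1, "1.3.6.1.2.1.2.2.1.10.1", 60, "delta", "either",
                          1_000_000, 100_000, rising_event=1, falling_event=2)
```

The two event indexes refer to rows that must already exist in the event table, which is why events are normally created before the alarms that use them.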

Troubleshooting RMON

You troubleshoot the RMON agent, rmopd, that runs on the router by inspecting the contents of the Juniper Networks enterprise RMON MIB, jnxRmon, which provides the extensions listed in Table 3 to the RFC 2819 alarmTable.

Table 3: jnxRmon Alarm Extensions

Field                      Description
jnxRmonAlarmGetFailCnt     Number of times the internal Get request for the variable failed
jnxRmonAlarmGetFailTime    Value of sysUpTime when the last failure occurred
jnxRmonAlarmGetFailReason  Reason why the Get request failed
jnxRmonAlarmGetOkTime      Value of sysUpTime when the variable moved out of failure state
jnxRmonAlarmState          Status of this alarm entry

Monitoring the extensions in this table provides clues as to why remote alarms may not behave as expected.
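One way to use these extensions is to compare two successive readings of the same alarm entry. The Python sketch below is an interpretation based only on the field descriptions in Table 3, not documented agent behavior: it assumes that an alarm is still failing when its last failure time is later than its last recovery time. The AlarmDiagnostics class, the classify function, and the sample readings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlarmDiagnostics:
    """One reading of the jnxRmon extension fields for a single alarm entry."""
    get_fail_cnt: int   # jnxRmonAlarmGetFailCnt
    get_fail_time: int  # jnxRmonAlarmGetFailTime, as a sysUpTime value
    get_ok_time: int    # jnxRmonAlarmGetOkTime, as a sysUpTime value


def classify(previous: AlarmDiagnostics, current: AlarmDiagnostics) -> str:
    """Classify what happened to the internal Get requests between two readings."""
    new_failures = current.get_fail_cnt - previous.get_fail_cnt
    if new_failures < 0:
        # The counter went backwards, for example after the agent restarted.
        return "counter reset"
    if new_failures == 0:
        return "no new failures"
    if current.get_fail_time > current.get_ok_time:
        # Assumed meaning: the last failure is more recent than the last recovery.
        return "still failing"
    return "failed and recovered"


before = AlarmDiagnostics(get_fail_cnt=3, get_fail_time=120_000, get_ok_time=125_000)
after = AlarmDiagnostics(get_fail_cnt=5, get_fail_time=180_000, get_ok_time=125_000)
state = classify(before, after)
```

For an entry that is reported as still failing, jnxRmonAlarmGetFailReason and jnxRmonAlarmState are the next fields to inspect.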