
HealthBot Rules and Playbooks

HealthBot is a highly programmable telemetry-based analytics application. With it, you can diagnose and root cause network issues, detect network anomalies, predict potential network issues, and create real-time remedies for any issues that come up. With HealthBot you can also customize your device and network health views. HealthBot gathers information from Junos devices using OpenConfig, iAgent, and native-gpb sensors as well as SNMP.

HealthBot uses rules, topics, and playbooks in order to extract the needed telemetry data from deployed Junos-based network devices.

HealthBot Rules

A rule is a package of components, or blocks, needed to extract specific information from the network or from a Junos device. Rules conform to a domain-specific language (DSL) tailored for analytics applications. The DSL is designed to allow rules to capture:

  • The minimum set of input data that the rule needs to be able to operate

  • The minimum set of telemetry sensors that need to be configured on the device(s)

  • The fields of interest from the configured sensors

  • The reporting or polling frequency

  • The set of triggers that operate on the collected data

  • The conditions or evaluations needed for triggers to kick in

  • The actions or notifications that need to be performed when a trigger kicks in

The details around rules, topics and playbooks are presented in the following sections.

Rules

Rules are meant to be free of any hard coding. Think of threshold values. If a threshold is hard coded, there is no easy way to customize it for a different customer or device that has different requirements. Therefore, rules are defined using parameterization to set the default values. This allows these parameters to be left at default or be customized by the operator at the time of deployment. Customization can be done at the device group or individual device level while applying the HealthBot Playbooks in which the individual rules are contained.

Rules that are device-centric are called device rules. Device components such as chassis, system, linecards, and interfaces are all addressed as HealthBot Topics in the rule definition. You can create sub-topics underneath any of the allowed topic names by appending .<sub-topic> to the topic name, for example, kernel.tcpip or system.cpu. Generally, device rules make use of sensors on the devices.

Rules that span multiple devices are called network rules.

To deploy either type of rule, include the rule in a playbook that you then apply to a device group or network group.

Not all of the components that make up a rule are required for every rule. Whether or not a specific block is required in a rule definition depends on what sort of information you are trying to get to. Additionally, some rule components are not valid for network rules. Table 3 lists the components of a rule and provides a brief description of each one.

Table 3: Rule Components

Sensors
  What it does: The Sensors block is the access method for getting at the data. Four types of sensors are available in HealthBot: OpenConfig, native-gpb, iAgent, and SNMP. The block defines which sensors need to be active on the device in order to get to the data fields on which the triggers eventually operate. Sensor names are referenced by the Fields block. OpenConfig and iAgent sensors require that a frequency be set for the push interval or polling interval, respectively. SNMP sensors also require you to set a frequency.
  Required in device rules? No. Rules can be created that only use a field reference from another rule or a vector with references from another rule. In these cases, rule-frequency must be explicitly defined.
  Valid for network rules? No

Fields
  What it does: The Fields block can be a pointer to a sensor, a reference to a field defined in another rule, a constant, or a formula. The default type is string.
  Required in device rules? Yes*. Fields contain the data on which the triggers operate.
  Valid for network rules? Yes

Vectors
  What it does: The Vectors block allows handling of lists, creating sets, and comparing elements amongst different sets. A vector is used to hold multiple values from one or more fields.
  Required in device rules? No
  Valid for network rules? Yes

Variables
  What it does: The Variables block allows you to pass values into rules. Invariant rule definitions are achieved through mustache-style templating like {{<placeholder-variable> }}. The placeholder-variable value is set in the rule by default or can be user-defined at deployment time.
  Required in device rules? No
  Valid for network rules? No

Functions
  What it does: The Functions block allows you to extend fields, triggers, and actions by creating prototype methods in external files written in languages like Python. The Functions block includes details on the file path, the method to be accessed, and any arguments, including the argument description and whether the argument is mandatory.
  Required in device rules? No
  Valid for network rules? No

Triggers
  What it does: The Triggers block operates on fields and is defined by one or more Terms. When the conditions of a Term are met, the action defined in the Term is taken. By default, triggers are evaluated every 10 seconds, unless explicitly configured for a different frequency. By default, all triggers defined in a rule are evaluated in parallel.
  Required in device rules? Yes. Triggers enable rules to take action.
  Valid for network rules? Yes

Note: *The Fields block is not required for iAgent sensors.

HealthBot comes with a set of pre-defined rules. Starting with release 1.0.1, these rules cannot be changed or removed.

Sensors

As mentioned in Table 3, sensors can be of type iAgent, OpenConfig, native-gpb, or SNMP. When defining a sensor, you must specify the sensor name, sensor type, and frequency. The frequency is expressed as #s, #m, #h, #d, #w, or #y, where # is a number and s, m, h, d, w, and y specify seconds, minutes, hours, days, weeks, and years, respectively. For OpenConfig and native-gpb, the Sensor field identifies the sensor to be configured. In the case of iAgent-type sensors, the sensor name is taken from a YAML-formatted file that contains a table with the needed information.
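The frequency notation described above can be illustrated with a small parser. This is a minimal sketch, not HealthBot code: the function name frequency_to_seconds is hypothetical, and treating a year as 365 days is an assumption made only so that the y unit has a concrete value.

```python
import re

# Seconds per unit suffix. The 365-day year is an assumption for illustration.
_UNIT_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 7 * 86400,
    "y": 365 * 86400,
}

# A frequency is a whole number followed by exactly one unit letter.
_FREQ_RE = re.compile(r"^(\d+)([smhdwy])$")


def frequency_to_seconds(freq: str) -> int:
    """Convert a string such as '10s' or '2m' into a number of seconds."""
    match = _FREQ_RE.match(freq.strip())
    if match is None:
        raise ValueError(f"invalid frequency: {freq!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_SECONDS[unit]
```

For example, frequency_to_seconds("2m") evaluates to 120, and a string with an unknown unit such as "10x" raises ValueError.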

When different rules have the same sensor defined, only one subscription is made per sensor. A key, consisting of sensor-name for OpenConfig and native-gpb sensors, and the tuple of file and table for iAgent sensors is used to identify the associated rule. When multiple sensors with the same sensor-name key have different frequencies defined, the lowest frequency is chosen for the sensor subscription.
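The de-duplication described above might be modeled as follows. This is a hedged sketch rather than HealthBot's implementation: merge_subscriptions is a hypothetical helper, frequencies are given as plain numbers of seconds, and "lowest frequency" is assumed to mean the smallest configured frequency value. If the opposite reading is intended, the comparison would be reversed.

```python
from typing import Dict, Hashable, Iterable, Tuple


def merge_subscriptions(
    requests: Iterable[Tuple[Hashable, int]],
) -> Dict[Hashable, int]:
    """Collapse per-rule sensor requests into one subscription per key.

    Each request is (key, frequency_in_seconds). The key is the sensor name
    for OpenConfig and native-gpb sensors, or a (file, table) tuple for
    iAgent sensors. Assumption: the smallest frequency value wins.
    """
    merged: Dict[Hashable, int] = {}
    for key, frequency_s in requests:
        if key not in merged or frequency_s < merged[key]:
            merged[key] = frequency_s
    return merged
```

With two rules requesting the same sensor name at 30 and 10 seconds, the sketch keeps a single entry for that sensor with the value 10.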

Fields

There are four types of fields, as shown in Table 3. Table 4 shows more detail regarding each of the field types.

Table 4: Field Type Details

Field Type

Details

Sensor

Subscribing to a sensor typically provides access to multiple columns of data. For instance, subscribing to the OpenConfig interface sensor provides access to a range of data, including counter-related information such as:

/interfaces/counters/tx-bytes,

/interfaces/counters/rx-bytes,

/interfaces/counters/tx-packets,

/interfaces/counters/rx-packets,

/interfaces/counters/oper-state, etc.

Given the rather long names of paths in OpenConfig sensors, the Sensor definition within Fields allows for aliasing and filtering. For single-sensor rules, the required set of Sensors for the Fields table are programmatically auto-imported from the raw table based on the triggers defined in the rule.

Reference

Triggers can only operate on Fields defined within that rule. In some cases, a Field might need to reference another Field or Trigger’s output defined in another Rule. This is achieved by referencing the other field or trigger, and applying additional filters. The referenced field or trigger is treated as a stream notification to the referencing field. References aren’t supported within the same rule.

References can also take a time-range option which picks the value, if available, from the time-range provided. Field references must always be unambiguous, so proper attention must be given to filtering the result to get just one value. If a reference receives multiple data points, or values, only the latest one is used.

Constant

A field defined as a constant is a fixed value which cannot be altered during the course of execution. HealthBot Constant types can be strings, integers, and doubles.

Formula

Raw sensor fields are the starting point for defining triggers. However, Triggers often work on derived fields defined through formulas by applying mathematical transformations.

Formulas can be pre-defined or user-defined. Pre-defined formulas are classified as aggregation formulas and non-aggregation formulas. Pre-defined aggregation formulas include: avg, min, max, count, sum, stddev, and dynamic-threshold.

All pre-defined formulas can operate on time ranges in order to work with historical data. If a time range is not specified, then the formula works on current data, specified as now.
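The way an aggregation formula operates on a time range can be sketched as below. This is an illustration under stated assumptions, not HealthBot's implementation: the aggregate function and the (timestamp, value) sample layout are hypothetical, "current data" is modeled as the most recent sample, stddev is assumed to be the population standard deviation, and dynamic-threshold is left out.

```python
import statistics
from typing import Callable, Dict, Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)

# Pre-defined aggregation formulas named in the text, minus dynamic-threshold.
AGGREGATIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "avg": statistics.fmean,
    "min": min,
    "max": max,
    "count": len,
    "sum": sum,
    "stddev": statistics.pstdev,  # assumption: population standard deviation
}


def aggregate(
    name: str,
    samples: Sequence[Sample],
    now: float,
    time_range_s: Optional[float] = None,
) -> float:
    """Apply an aggregation over the samples inside the time range.

    Without a time range, only the latest sample ("now") is used.
    """
    if not samples:
        raise ValueError("no samples available")
    if time_range_s is None:
        latest = max(samples, key=lambda sample: sample[0])
        values = [latest[1]]
    else:
        start = now - time_range_s
        values = [value for ts, value in samples if start <= ts <= now]
        if not values:
            raise ValueError("no samples inside the time range")
    return AGGREGATIONS[name](values)
```

For samples taken at 0, 10, 20, and 30 seconds with values 1, 2, 3, and 6, a 20-second range ending at 30 covers the last three samples, so the sum is 11 and the maximum is 6.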

Vectors

Vectors are useful in helping to gather multiple elements into a single rule. For example, using a vector you could gather all of the interface error fields. The syntax for Vector is:

vector <vector-name>{
       path [$field-1 $field-2 .. $field-n];
       filter <list of specific element(s) to filter out from vector>;
       append <list of specific element(s) to be added to vector>;
}

$field-n can be a field of type reference.

The fields used in defining vectors can be direct references to fields defined in other rules:

vector <vector-name>{
       path [/device-group[device-group-name=<device-group>]\
/device[device-name=<device>]/topic[topic-name=<topic>]\
/rule[rule-name=<rule>]/field[<field-name>=<field-value>\
 AND|OR ...]/<field-name> ...];
       filter <list of specific element(s) to filter out from vector>;
       append <list of specific element(s) to be added to vector>;
}

This syntax allows for optional filtering through the <field-name>=<field-value> portion of the construct. Vectors can also take a time-range option that picks the values from the time range provided.
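The roles of path, filter, and append in the vector syntax can be pictured with a short sketch. The build_vector helper is hypothetical, and the order of operations shown (gather, then filter, then append) is an assumption, since the text does not state how HealthBot sequences them. Duplicates are kept as-is.

```python
from typing import Dict, Iterable, List


def build_vector(
    fields: Dict[str, List[str]],
    path: Iterable[str],
    filter_out: Iterable[str] = (),
    append: Iterable[str] = (),
) -> List[str]:
    """Gather values from the named fields, drop filtered ones, add extras."""
    values: List[str] = []
    for field_name in path:
        # Each field contributes all of its current values to the vector.
        values.extend(fields[field_name])
    excluded = set(filter_out)
    values = [value for value in values if value not in excluded]
    values.extend(append)
    return values
```

Given two fields that hold interface names, filtering out one name and appending another yields a single list that triggers or formulas could then operate on.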

The following pre-defined formulas are supported on vectors:

Variables

Variables are defined during rule creation on the Variables page. This part of variable definition creates the default value that gets used if no specific value is set in the device group or on the device during deployment. For example, the check-interface-status rule has one variable called interface_name. The value set on the Variables page is a regular expression (regex), .*, that means all interfaces.

If applied as-is, the check-interface-status rule would provide interface status information about all the interfaces on all of the devices in the device group. While applying a playbook that contains this rule, you could override the default value at the device group or device level. This allows you flexibility when applying rules. The order of precedence is device value overrides device group value and device group value overrides the default value set in the rule.
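The order of precedence can be expressed as a simple lookup. This is a minimal sketch: resolve_variable and its dictionary arguments are hypothetical names used for illustration, not part of HealthBot.

```python
from typing import Dict, Optional


def resolve_variable(
    name: str,
    rule_defaults: Dict[str, str],
    group_values: Optional[Dict[str, str]] = None,
    device_values: Optional[Dict[str, str]] = None,
) -> str:
    """Return the effective value of a rule variable.

    Precedence: device value, then device group value, then rule default.
    """
    for scope in (device_values or {}, group_values or {}, rule_defaults):
        if name in scope:
            return scope[name]
    raise KeyError(f"variable {name!r} has no value at any level")
```

With a rule default of .* for interface_name, a device group override applies to every device in the group unless a device sets its own value.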

It is highly recommended to supply default values for variables defined in device rules. All Juniper-supplied rules follow this recommendation. Default values must not be set for variables defined in network rules.

Functions

Functions are defined during rule creation on the Functions page. Defining a function here allows it to be used in Formulas associated with Fields and in the When and Then sections of Triggers. Functions used in the when clause of a trigger are known as user-defined conditions. These must return true or false. Functions used in the then clause of a trigger are known as user-defined actions.

Triggers

Triggers play a pivotal role in HealthBot rule definitions. They are the part of the rule that determines if and when any action is taken based on changes in available sensor data. Triggers are constructed in a when-this, then-that manner. As mentioned earlier, trigger actions are based on Terms. A Term is built with when clauses that watch for updates in field values and then clauses that initiate some action based on what changed. Multiple Terms can be created within a single trigger.

Evaluation of the when clauses in the Terms starts at the top of the list of terms and proceeds to the bottom. If a term is evaluated and no match is made, then the next term is evaluated. By default, evaluation proceeds in this manner until either a match is made or the bottom of the list is reached without a match.

Pre-defined operators that can be used in the when clause include:

Note: For evaluated equations, the left-hand side and right-hand side of the equation are shortened to LHS and RHS, respectively, in this document.

Using these operators in the when clause creates a function known as a user-defined condition. These functions should always return true or false.

If evaluation of a term results in a match, then the action specified in the Then clause is taken. By default, processing of terms stops at this point. You can alter this flow by enabling the Evaluate next term button at the bottom of the Then clause. This causes HealthBot to continue term processing to create more complex decision-making capabilities like when-this and this, then that.
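The term evaluation flow described above can be sketched as follows. The Term class and evaluate_trigger function are hypothetical and exist only to illustrate the top-to-bottom, stop-on-first-match behavior and the effect of the Evaluate next term option.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Fields = Dict[str, float]


@dataclass
class Term:
    name: str
    when: Callable[[Fields], bool]   # condition, must return True or False
    then: Callable[[Fields], None]   # action taken when the condition matches
    evaluate_next_term: bool = False  # continue to later terms after a match


def evaluate_trigger(terms: List[Term], fields: Fields) -> List[str]:
    """Evaluate terms from top to bottom and return the names that matched."""
    matched: List[str] = []
    for term in terms:
        if not term.when(fields):
            continue  # no match, so move on to the next term
        term.then(fields)
        matched.append(term.name)
        if not term.evaluate_next_term:
            break  # default behavior: stop at the first matching term
    return matched
```

If a first term matches and does not enable the option, later terms are never evaluated, even if their conditions would also hold.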

The following is a list of pre-defined actions available for use in the Then section:

HealthBot Topics

Network devices are made up of a number of components and systems from CPUs and memory to interfaces and protocol stacks and more. In HealthBot, Topics are the construct used to address those different device components. The Topics block is used to create name spaces that define what needs to be modeled. The Topics block consists of Rules blocks which in turn consist of the Fields blocks, Functions blocks, Triggers blocks, etc. See HealthBot Rules for details. Each rule created in HealthBot must be part of a topic. Juniper has curated a number of these system components into a list of Topics:

This is the allowed list of Topics for HealthBot. Any pre-defined rules provided by Juniper fit within one of these topics, with the exception of external. The external topic is reserved for user-created rules.

In the HealthBot web GUI, when you create a new rule, the Topics field is automatically populated with the external topic name.

HealthBot Playbooks

In order to fully understand any given problem or situation on a network, it is often necessary to look at a number of different system components, topics, or key performance indicators (KPIs). HealthBot operates on playbooks, which are collections of rules for addressing a specific use case. Playbooks are the HealthBot element that gets applied, or run, on your device groups or network groups.

HealthBot comes with a set of pre-defined Playbooks. For example, the system-KPI playbook monitors the health of system parameters such as system-cpu-load-average, storage, system-memory, process-memory, etc. It then notifies the operator or takes corrective action in case any of the KPIs cross pre-set thresholds. Following is a list of Juniper-supplied Playbooks. Starting in HealthBot release 1.0.1, the pre-defined playbooks cannot be changed or deleted.

You can create a playbook and include any rules you want in it. You apply these playbooks to device groups. By default, all rules contained in a Playbook are applied to all of the devices in the device group. There is currently no way to change this behavior.

If your playbook definition includes network rules, then the playbook becomes a network playbook and can only be applied to network groups.
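The relationship between rule types and playbook types can be summarized in a few lines. This is a sketch with hypothetical helper names. It also assumes that a playbook without network rules applies only to device groups, which the text implies but does not state outright.

```python
from typing import Iterable


def playbook_kind(rule_kinds: Iterable[str]) -> str:
    """Return 'network' if any rule is a network rule, otherwise 'device'."""
    is_network = any(kind == "network" for kind in rule_kinds)
    return "network" if is_network else "device"


def can_apply(rule_kinds: Iterable[str], group_kind: str) -> bool:
    """Check whether a playbook with these rules fits the given group type.

    Assumption: device playbooks go to device groups only.
    """
    return playbook_kind(rule_kinds) == group_kind
```

A playbook that mixes device rules with a single network rule is classified as a network playbook, so can_apply returns False for a device group.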
