HealthBot Pull-Model Ingest Methods

HealthBot currently supports two pull-model sensors, iAgent (CLI/NETCONF) and SNMP. Each is described below.

iAgent (CLI/NETCONF)

For all the benefits of the push data collection methods, some operational and state information is available only through CLI/VTY commands. iAgent fills this gap by using NETCONF/SSH to give HealthBot the ability to connect to a device, run commands, and capture the output.

iAgent sensors use NETCONF/SSH and YAML-based PyEZ tables and views to fetch the necessary data. Both structured (XML) and unstructured (VTY commands and CLI output) data is supported.

With iAgent, the HealthBot server makes requests over any available network interface, whether in-band or out-of-band, and the device responds (when properly configured) with the requested data.

Starting in HealthBot Release 3.1.0, iAgent functionality is extended to third-party devices. When adding a device, you can choose Other Vendor from the Vendor pull-down menu. This adds the Vendor Name text field below the Vendor pull-down menu. Then fill in the iAgent Port Number, Vendor Name, and OS Name fields highlighted in Figure 1 to allow iAgent connections to non-Juniper devices.

Note

Refer to vendor documentation to understand how to configure third-party vendor devices to allow these connections.

Figure 1: Add Third-Party Device

Using Netmiko, HealthBot makes persistent SSH connections over the selected port to the third-party device. To gather device information, HealthBot sends CLI commands over SSH and receives the output as unstructured strings. TextFSM then parses these strings into JSON, using ntc-templates, and HealthBot stores the result in the database.

Default templates are located at /srv/salt/_textfsm. A repository of ntc-templates for network devices is available at NTC Templates. If you need a template that does not exist, you can create your own and upload it to HealthBot using the Upload Rule Files button on the Configuration > Rules page. User-defined templates are stored at /jfit/_textfsm, and the file names must end with the .textfsm suffix.

TextFSM is integrated into PyEZ’s table/view feature, which is an integral part of iAgent.
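The parsing step described above (unstructured CLI text in, JSON records out) can be pictured with a minimal sketch. It uses a plain regular expression from the Python standard library as a stand-in for a TextFSM template, so it is not how HealthBot or ntc-templates implement parsing. The sample output, the column names, and the function parse_cli_output are all hypothetical.

```python
import json
import re

# Hypothetical CLI output; not captured from any real device.
RAW_OUTPUT = """\
Name      State  Speed
eth0      up     1000
eth1      down   100
mgmt0     up     1000
"""

# One row pattern with named groups. This plays the role that a
# template's value definitions play in TextFSM (assumption: one record per line).
ROW = re.compile(r"^(?P<name>\S+)\s+(?P<state>up|down)\s+(?P<speed>\d+)\s*$")


def parse_cli_output(text):
    """Return one dict per line that matches ROW; other lines are skipped."""
    records = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            record = match.groupdict()
            record["speed"] = int(record["speed"])  # store numbers as numbers
            records.append(record)
    return records


records = parse_cli_output(RAW_OUTPUT)
json_text = json.dumps(records, indent=2)
print(json_text)
```

The header line does not match the row pattern, so it is dropped. Only the three data rows become JSON records.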

Example: Palo Alto Panos - Show Running Security Policy

To see the running security policy on a Panos device, we need to:

  • Define a table/view for it

  • Gather the output by sending the needed CLI to the device over SSH

  • Generate JSON to store in HealthBot database

Define PyEZ Table/View

We need to define a PyEZ table that is used by the iAgent rule assigned to the Panos device. The following table definition lacks a view definition. Because of this, the entire output of the show running security-policy command is stored in the database after processing.

(Optional) To store only a portion of the received data in HealthBot, you can define a view in the same file. The view tells HealthBot which fields to pay attention to.

Gather Output from Device

Using an iAgent rule that references the PyEZ table (or table/view) defined above, HealthBot sends the command show running security-policy to the device which produces the following output:

Generate JSON for Use in HealthBot Database

Since the device configuration specifies Palo Alto Networks as the vendor and Panos OS as the operating system, the TextFSM template used for this example would look like this:

When the template above is used by HealthBot to parse the output shown previously, the resulting JSON looks like:

SNMP

SNMP is a widely known and accepted network management protocol that many network device manufacturers, including Juniper Networks, provide for use with their devices. It is a polling-type protocol: properly configured network devices make configuration, diagnostic, and event information available to collectors, which must also be properly configured and authenticated. The collectors poll devices by sending specifically structured requests, called get requests, to retrieve data.

HealthBot supports SNMP as a sensor type, using standard get requests to gather statistics from the device. HealthBot makes requests over any available interface, whether in-band or out-of-band, and the device responds (when configured) with the requested data.

Note

HealthBot does not currently support SNMP traps.

For information about SNMP as used on Junos OS devices, see Understanding SNMP Implementation in Junos OS.

The example below contains all of the configuration needed for HealthBot to successfully ingest SNMP data from a device or devices in a device group.

Example: Creating a Rule using SNMP Ingest

To illustrate how to configure and use an SNMP sensor, consider a scenario where you want to:

  • Monitor Routing Engine CPU, CPU average, and memory utilization for a device, using SNMP data

  • Create a rule with triggers that indicate when utilization for any of the above elements goes above 80%

To implement this scenario, complete the following activities in order:

CONFIGURE NETWORK DEVICES

Note

This example assumes you have already added your devices into HealthBot and assigned them to a device group.

Add SNMP configuration to the network device

If not already done, configure your network device(s) to accept SNMP get requests from HealthBot. For more details on configuring your Junos device, see the Network Device Requirements section of the HealthBot Installation Guide.

CREATE RULE, APPLY PLAYBOOK

Configure a rule using an SNMP sensor

You can now create a rule using SNMP as the sensor.

This rule includes multiple elements, as shown below:

  • An SNMP sensor to ingest data

  • Five fields extracting specific SNMP data of interest:

    • CPU utilization, memory utilization

    • CPU utilization averages - 1min, 5min, 15min

  • A field representing a static value, used as a threshold

    • Value provided by a variable

  • A field representing a description

    • Value provided by a variable; extracted from the SNMP messages

  • Five triggers, indicating when CPU, CPU average, and memory utilization is higher than the threshold value
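It can help to picture the elements listed above as one nested structure before entering them in the GUI. The sketch below uses the names and values that the numbered steps assign. The dictionary layout, the key names (such as frequency_seconds and filter_variable), and the helper dangling_references are assumptions made for illustration. They are not HealthBot's rule file format.

```python
# Illustrative layout only; this is NOT HealthBot's rule file format.
RULE = {
    "name": "check-system-cpu-memory-snmp",
    "sensor": {
        "name": "system-cpu-memory",
        "type": "snmp",
        "table": "jnxOperatingTable",
        "frequency_seconds": 60,
    },
    "variables": {
        "comp-name": "Routing Engine",
        "static-threshold": 80,
    },
    "fields": {
        # Five fields that extract SNMP data of interest
        "cpu-15min-avg": {"snmp": "jnxOperating15MinLoadAvg"},
        "cpu-1min-avg": {"snmp": "jnxOperating1MinLoadAvg"},
        "cpu-5min-avg": {"snmp": "jnxOperating5MinLoadAvg"},
        "system-buffer-memory": {"snmp": "jnxOperatingBuffer"},
        "system-cpu": {"snmp": "jnxOperatingCPU"},
        # Description field, filtered by a variable; its values act as keys
        "description": {"snmp": "jnxOperatingDescr", "filter_variable": "comp-name"},
        # Static threshold field, valued from a variable
        "threshold": {"variable": "static-threshold"},
    },
    "triggers": {
        name: {"field": field, "compare_to": "threshold", "frequency_seconds": 90}
        for name, field in [
            ("system-buffer", "system-buffer-memory"),
            ("system-cpu", "system-cpu"),
            ("system-cpu-15min-average", "cpu-15min-avg"),
            ("system-cpu-1min-average", "cpu-1min-avg"),
            ("system-cpu-5min-average", "cpu-5min-avg"),
        ]
    },
}


def dangling_references(rule):
    """Return names that a field or trigger refers to but the rule never defines."""
    missing = []
    for field in rule["fields"].values():
        for key in ("variable", "filter_variable"):
            if key in field and field[key] not in rule["variables"]:
                missing.append(field[key])
    for trigger in rule["triggers"].values():
        for key in ("field", "compare_to"):
            if trigger[key] not in rule["fields"]:
                missing.append(trigger[key])
    return missing


print(dangling_references(RULE))
```

Laying the rule out this way makes the dependencies visible: triggers refer to fields, and two of the fields refer to variables. This is why the steps configure the sensor, then the variables, then the fields, then the triggers.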

  1. In the HealthBot GUI, click Configuration > Rules in the left-nav bar.
  2. On the Rules page, click the + Add Rule button.
  3. On the page that appears, in the top row of the rule window, set the rule name. In this example, rule name is check-system-cpu-memory-snmp.
  4. Add a description and synopsis if you wish.
  5. Click the + Add sensor button and enter the following parameters to configure the sensor, system-cpu-memory:
    • Name is user-defined

    • The sensor is using the Juniper SNMP MIB table jnxOperatingTable

    • HealthBot polls the device group for table data every 60 seconds

  6. Now move to the Variables tab, click the + Add variable button and enter the following parameters to configure the first variable, comp-name:
    • Matches any string that includes “Routing Engine”

    • Referenced later in field description

  7. Click the + Add variable button once more and enter the following parameters to configure the second variable, static-threshold:
    • Represents a (default) static value of “80”; in this case, 80%

    • Referenced later in field threshold

  8. Now move to the Fields tab, click the + Add field button and enter the following parameters to configure the first field, cpu-15min-avg:
    • Field names are user-defined

    • Extracts jnxOperating15MinLoadAvg value from SNMP table configured in the sensor

    • jnxOperating15MinLoadAvg - CPU Load Average (as a % value) over the last 15 minutes

  9. Click the + Add field button again and enter the following parameters to configure the second field, cpu-1min-avg:
    • Extracts jnxOperating1MinLoadAvg value from SNMP table

    • jnxOperating1MinLoadAvg - CPU Load Average (as a % value) over the last 1 minute

  10. Click the + Add field button again and enter the following parameters to configure the third field, cpu-5min-avg:
    • Extracts jnxOperating5MinLoadAvg value from SNMP table

    • jnxOperating5MinLoadAvg - CPU Load Average (as a % value) over the last 5 minutes

  11. Click the + Add field button again and enter the following parameters to configure the fourth field, description:
    • Extracts jnxOperatingDescr value from SNMP table

    • jnxOperatingDescr - name or description; for example, “Routing Engine 0”, “FPC 0”, etc.

    • The expression references the variable comp-name; filters the data further to retain only the values that include the string “Routing Engine”

    • Matching values will act as keys; each key gets a colored block in device health view

  12. Click the + Add field button again and enter the following parameters to configure the fifth field, system-buffer-memory:
    • Extracts jnxOperatingBuffer value from SNMP table

    • jnxOperatingBuffer - buffer pool utilization (as a % value)

  13. Click the + Add field button again and enter the following parameters to configure the sixth field, system-cpu:
    • Extracts jnxOperatingCPU value from SNMP table

    • jnxOperatingCPU - CPU utilization (as a % value)

  14. Click the + Add field button once more and enter the following parameters to configure the seventh field, threshold:
    • The expression references the variable static-threshold, giving this field the (default) integer value “80”

    • Referenced later in triggers

  15. Now move to the Triggers tab, click the + Add trigger button and enter the following parameters to configure the first trigger, system-buffer:
    • Trigger names are user-defined

    • Trigger logic runs every 90 seconds

    • Evaluate terms in sequence; when a term’s conditions are met, show its color and message on the device health pages

    • When system memory buffer utilization (the value in field system-buffer-memory) is greater than 80 (the value in field threshold), set color to red and show related message

    • Otherwise, set color to green and show related message

  16. Click the + Add trigger button again and enter the following parameters to configure the second trigger, system-cpu:
    • Trigger logic runs every 90 seconds

    • When CPU utilization (the value in field system-cpu) is greater than 80 (the value in field threshold), set color to red and show related message

    • Otherwise, set color to green and show related message

  17. Click the + Add trigger button again and enter the following parameters to configure the third trigger, system-cpu-15min-average:
    • Trigger logic runs every 90 seconds

    • When CPU 15min utilization average (the value in field cpu-15min-avg) is greater than or equal to 80 (the value in field threshold), set color to red and show related message

    • Otherwise, set color to green and show related message

  18. Click the + Add trigger button again and enter the following parameters to configure the fourth trigger, system-cpu-1min-average:
    • Trigger logic runs every 90 seconds

    • When CPU 1min utilization average (the value in field cpu-1min-avg) is greater than or equal to 80 (the value in field threshold), set color to red and show related message

    • Otherwise, set color to green and show related message

  19. Click the + Add trigger button once more and enter the following parameters to configure the fifth trigger, system-cpu-5min-average:
    • Trigger logic runs every 90 seconds

    • When CPU 5min utilization average (the value in field cpu-5min-avg) is greater than or equal to 80 (the value in field threshold), set color to red and show related message

    • Otherwise, set color to green and show related message

  20. At the upper right of the window, click the + Save & Deploy button.
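The filter and trigger logic configured in the steps above can be sketched in a few lines. This is a simplified model, not HealthBot's implementation. It assumes the comp-name filter is a plain substring test and that each trigger reduces to a single comparison against the threshold. The sample rows and the function evaluate are hypothetical. The comparison operators follow the steps: greater than for system-buffer and system-cpu, greater than or equal to for the three load averages.

```python
import operator

COMP_NAME = "Routing Engine"   # default of variable comp-name
THRESHOLD = 80                 # default of variable static-threshold

# trigger name -> (field it reads, comparison against the threshold)
TRIGGERS = {
    "system-buffer": ("system-buffer-memory", operator.gt),
    "system-cpu": ("system-cpu", operator.gt),
    "system-cpu-15min-average": ("cpu-15min-avg", operator.ge),
    "system-cpu-1min-average": ("cpu-1min-avg", operator.ge),
    "system-cpu-5min-average": ("cpu-5min-avg", operator.ge),
}

# Hypothetical rows, one per jnxOperatingTable entry; values are made up.
SAMPLE_ROWS = [
    {"description": "Routing Engine 0", "system-cpu": 80,
     "system-buffer-memory": 91, "cpu-1min-avg": 80,
     "cpu-5min-avg": 42, "cpu-15min-avg": 35},
    {"description": "FPC 0", "system-cpu": 95,
     "system-buffer-memory": 95, "cpu-1min-avg": 95,
     "cpu-5min-avg": 95, "cpu-15min-avg": 95},
]


def evaluate(rows, comp_name=COMP_NAME, threshold=THRESHOLD):
    """Return {key: {trigger name: color}} for rows whose description matches."""
    results = {}
    for row in rows:
        key = row["description"]
        if comp_name not in key:
            continue  # the description field keeps only matching components
        results[key] = {
            name: "red" if compare(row[field], threshold) else "green"
            for name, (field, compare) in TRIGGERS.items()
        }
    return results


colors = evaluate(SAMPLE_ROWS)
for key, by_trigger in colors.items():
    print(key, by_trigger)
```

In this sample, a CPU utilization of exactly 80 leaves system-cpu green, while a 1-minute load average of exactly 80 turns system-cpu-1min-average red, because the two triggers use different comparisons. The “FPC 0” row is dropped by the filter, so it never becomes a key.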

Add the rule to a playbook

With the rule created, you can now add it to a playbook. For this example, create a new playbook to hold the new rule.

  1. Click Configuration > Playbooks in the left-nav bar.
  2. On the Playbooks page, click the + Create Playbook button.
  3. On the page that appears, enter the following parameters:
  4. Click Save & Deploy.

Apply the playbook to a device group

To make use of the playbook, apply it to a device group.

  1. On the Playbooks page, click the Apply (Airplane) icon for the playbook you configured above.
  2. On the page that appears:
    • Enter a playbook instance name

    • Select the desired device group

    • (Optional) If desired, you can adjust the variables for this playbook instance to use different values than the defaults configured in the rule

    • Click Run Instance

  3. On the Playbooks page, confirm that the playbook instance is running. Note that the playbook instance may take some time to activate.

MONITOR

Monitor the devices

With the playbook applied, you can begin to monitor the devices.

  1. Click Monitor > Device Group Health in the left-nav bar.
  2. Select the device group to which you applied the playbook from the Device Group pull-down menu.
  3. Select one or more of the devices to monitor.
  4. In the Tile View, hover your mouse over one of the external tiles.
    • external is the topic name under which the rule was created

    • Each colored block represents a key and its related values

    • The mouse-over window shows information related to the given key, with the triggers listed inside

  5. In the Table View, try out the various filters and sorting options.
    • Each trigger is listed as a KPI

Release History Table
Release    Description
3.1.0      Starting in HealthBot Release 3.1.0, iAgent functionality is extended to third-party devices.