Using Custom Telemetry Data in an IBA Probe

SUMMARY This topic describes how to create an IBA probe and detect and store any anomalies in a historical database for reference.

In our walkthrough, we've created a custom telemetry collector service that defines the data you want to collect from your devices. Now let's ingest this data into IBA probes in your blueprint so that Apstra can visualize and analyze the data.

Create a Probe

First, we'll create a new probe in your deployed blueprint so that Apstra can ingest data from your custom telemetry collector. In this example, we'll focus on a minimal set of configurations for the simple use case of visualizing BFD session data and generating anomalies (alerts) when sessions are down.
Note:

Both data center and Freeform blueprints support IBA probes with custom telemetry collection.

  1. From your blueprint, navigate to Analytics > Probes, and then click Create Probe > New Probe.
    User interface screenshot of a data platform showing Analytics tab with Probes section selected and Create Probe button highlighted.
  2. Enter a name (in this example, BFD-Example-Probe) and an optional description, and then click Add Processor.
    User interface for creating a new probe, with sections for naming, tagging, description, enabling, adding processor, or importing from JSON. Tabs: Dashboard, Analytics, Staged, Uncommitted, Active, Time Voyager.
  3. Select a processor type.
    For our example, we chose the Extensible Service Data Collector processor.
    User interface for adding a processor with fields: Processor Type dropdown (Extensible Service Data Collector selected), Processor Name text field (BFD Status entered), Output Stage Name text field (BFD Status entered), and a teal Add button. Red annotations highlight the dropdown and Add button.
  4. Click Add to add the processor to the probe.
    For more information about the different processors, see the Juniper Apstra User Guide.
  5. Click Create to create the probe and return to the table view.
  6. To the right of the Graph Query field, click the Select a predefined graph query button, and then select DC – All managed devices (any role) from the Predefined Query drop-down.
    This query determines the scope within the blueprint in which the telemetry collection is executed. This means if a device in your blueprint is not matched by the graph query, the telemetry collection service will not start for that device.
    User interface for updating a graph query with a predefined query dropdown, query code snippet, and highlighted Update button.

    The graph query specifically matches all system nodes in the graph database of your blueprint. Each managed device, such as a leaf switch or spine switch, shows as a system node in the graph.

    In the Predefined Query we selected above, the query matches all nodes of type system that are in deploy mode and have a role of leaf, access, spine, or superspine.

  7. Click Update to return to the table view.
    Configuration interface for telemetry system with graph query for nodes named "system" filtered by roles "leaf" or "spine". System ID maps query results. Service name "BFD", data type "Dynamic Text", 2-minute interval, execution count -1. Create Probe button visible.
  8. In the System ID field, enter system.system_id.
    This entry tells the probe that the graph query matches your managed devices under the name system (name='system').
    The system_id attribute on each system node is the system ID of the device, which Apstra uses to uniquely identify it.
  9. Select BFD from the Service name drop-down.
  10. Select the Data Type.
    • Select Dynamic Text if your telemetry service collects string as the value type.

    • Select Dynamic Number if the service collects integer as the value type.

    In our example, we chose Dynamic Text because the BFD session state contains the string values Up and Down.

  11. Click Create Probe.
  12. Navigate to the output stage of the data collector processor to verify that the probe is correctly ingesting data from your custom telemetry collector.
    Screenshot of network monitoring tool showing BFD probes status with system ID, neighbor IP, session status, and update time.
    Congratulations! You successfully created a probe!
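For reference, the predefined query selected in step 6 takes roughly the following form in Apstra's graph query DSL. The exact query text appears in the Graph Query field in the UI; treat this fragment as an approximation of it:

```
node('system', name='system', deploy_mode='deploy',
     role=is_in(['leaf', 'access', 'spine', 'superspine']))
```

Every system node matched by this query has the telemetry collection service started on it; devices that fall outside the match are skipped, as noted in step 6.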

Customize a Probe

So far we've created a working probe that collects the BFD state for every device in your network. Now, let’s explore a couple of useful customization options to fine-tune your probe.

Service Interval

The service interval determines how often your telemetry collection service fetches data from devices and ingests it into the probe.

The service interval is an important parameter to be aware of because an overly aggressive interval can cause excessive load on your devices. The optimal interval will depend on the data you are collecting. For example, a collector fetching the content of a large routing table with thousands of entries can cause a higher load than collecting the status of a handful of BFD sessions.

Dropdown menu labeled Service interval with options 1 Minute to 1 Hour; currently 1 Minute is selected.

Query Tag Filter

Another useful customization option is the Query Tag Filter. Let’s say you tagged some switches in your blueprint as storage for a specific monitoring use case. You can configure this filter to perform the telemetry collection only on devices with the matching tag as shown in the following example:

Query Tag Filter interface with dropdown for and/or operation. Node Name set to system, Matcher is In, Tags field contains storage. Add filter button present.
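Conceptually, the filter narrows the probe's scope to tagged devices. The following sketch illustrates the effect of an "In" matcher on a hypothetical device list (the data and the helper function are illustrative, not Apstra's internal implementation):

```python
# Hypothetical query results: one entry per matched system node.
devices = [
    {"system_id": "leaf-storage-1", "role": "leaf", "tags": {"storage"}},
    {"system_id": "leaf-2", "role": "leaf", "tags": set()},
    {"system_id": "spine-1", "role": "spine", "tags": {"border"}},
]

def apply_tag_filter(matches, tag_values):
    """Keep matches whose tags intersect tag_values (the 'In' matcher)."""
    return [m for m in matches if m["tags"] & set(tag_values)]

in_scope = apply_tag_filter(devices, ["storage"])
print([m["system_id"] for m in in_scope])  # ['leaf-storage-1']
```

Only the storage-tagged leaf remains in scope, so the collection service starts on that device alone.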

The raw data from your custom telemetry collector is just that: raw values, from which it can be difficult to tell whether your network is in a normal or an anomalous state. With Apstra, you are proactively notified when an anomaly is detected.

Performance Analytics

An IBA probe functions as an analytics pipeline. All IBA probes have at least one source processor at the start of their pipeline. In our example, we added an Extensible Service Data Collector processor that ingests data from your custom telemetry collector.

You can chain additional processors in the probe to perform further analytics on the data, providing more meaningful insight into your network's health. These processors are referred to as analytics processors.

Analytics processors allow you to aggregate and apply logic to your data and define an intended state (or a reference state) to raise anomalies. For example, you might not be interested in instantaneous values of raw telemetry data, but rather in an aggregation or trends.

Analytics processors aggregate information such as calculating average, min/max, standard deviation, and so on. You can then compare the aggregated data against expectations so that you can identify whether the data is inside or outside a specified range, in which case an anomaly is raised. You may also want to check whether this anomaly is sustained for a period of time and exceeds a specific threshold. An anomaly is flagged only when the threshold is exceeded to avoid flagging anomalies for transient or temporary conditions. You can achieve this by configuring a Time_In_State processor.
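The sustained-threshold idea can be sketched as follows. This is a hypothetical class illustrating Time_In_State-style logic, not Apstra's implementation: an anomaly is reported only after the input has remained in the target state for at least a configured duration, so transient flaps are suppressed.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TimeInState:
    """Illustrative sketch: flag anomaly only when the target state is
    sustained for at least threshold_secs (hypothetical, not Apstra code)."""
    target_state: str
    threshold_secs: float
    entered_at: Optional[float] = None

    def update(self, state: str, now: float) -> bool:
        # Leaving the target state resets the timer.
        if state != self.target_state:
            self.entered_at = None
            return False
        if self.entered_at is None:
            self.entered_at = now
        # Anomalous only once the state has been sustained long enough.
        return (now - self.entered_at) >= self.threshold_secs

checker = TimeInState(target_state="Down", threshold_secs=120)
assert checker.update("Down", now=0) is False    # just went Down
assert checker.update("Up", now=60) is False     # brief flap resets the timer
assert checker.update("Down", now=90) is False   # Down again, timer restarts
assert checker.update("Down", now=240) is True   # sustained 150 s >= 120 s
```

A brief flap to Up and back resets the timer, so only a genuinely sustained Down condition is flagged.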

Table 1 describes the different types of analytics processors.

Table 1: Analytics Processors

Type of Processor

Description

Range processors

Processor names: Range, State, Time_In_State, Match_String

Range processors define reference state and generate anomalies.

Grouping processors

Processor names: Match_Count, Match_perc, Set_Count, Sum, Avg, Min, Max, and Std_Dev

Grouping processors aggregate and process data before feeding it into the range processors. These processors can:

  • Produce a per-device count of protocol states.

  • Produce a sum of counters from multiple devices to represent a total over the fabric.

Multi-input processors

Processor names: Match_Count, Match_perc, Set_Count, Sum, Avg, Min, Max, and Std_Dev

Multi-input processors take input from multiple stages. These processors can:

  • Produce a single output data set that is a union of input from multiple stages.

  • Perform a logical comparison between input from multiple stages.

For detailed descriptions of all analytic processors, see Probe Processor (Analytics) in the Juniper Apstra User Guide.

Note:

Multi-input processors are not supported for dynamic data types (dynamic text or dynamic number), which are the data types typically used by IBA probes that leverage custom telemetry collection.

In the next section, we'll configure our BFD example probe to detect and raise anomalies.

Raising Anomalies and Storing Historical Data

Now, we'll configure our example probe to detect and raise anomalies if a BFD session goes down. We'll then store the anomalies in a historical database for reference.
  1. Open the probe you created in Create a Probe and click Add Processor to add a second processor.
  2. Select the Match Count processor and give the processor a descriptive name (for example, Down sessions count).
    The Match Count processor counts the number of BFD sessions in the Down state and groups the count by device.
  3. Configure the second processor.
    This step wires the probe pipeline so that the output of the previous processor feeds into this one.
    Enter Down in the Reference State field.
    Configuration interface for Down sessions count showing input source in, stage name BFD Status, column value, grouped by system_id, reference state Down, and streaming disabled.
    When you update the probe, the output shows the number of BFD sessions in the Down state by each device.
    Dashboard interface displaying Down sessions count for systems with system ID, total count, down session gauges, last updated time, and dynamic update toggle.
  4. Add a third processor.
    We'll now add a third and final processor. This processor produces anomalies to alert you when there are one or more BFD sessions in the Down state.
  5. Click Add Processor and select the Range processor.
    Give the processor a descriptive name (for example, BFD anomaly (down > 0)) and then click Add.
    User interface for adding a processor: Processor Type dropdown with Range selected, Processor Name input with BFD anomaly down greater than 0, Output Stage Name input with BFD anomaly down greater than 0, Add button.
  6. Configure the processor.
    Configuration interface for BFD anomaly detection with input stage set to monitor down sessions count. Anomaly condition is defined as more than or equal to 1, triggering an alert.
    1. For the Input Stage, enter the Stage Name (in our example, Down sessions count) and select value for the Column name.

    2. Set the Anomalous Range to More than or equal to and 1.

    3. Click Raise Anomaly.

  7. While still in the probe configuration interface, click Enable Metric Logging and select the output stage for your second processor.
    This action enables historical logging of data.
  8. Click Update Probe.
    If you have any BFD sessions in the Down state, the probe generates anomalies for those sessions.
  9. Check Enable Streaming in the probe configuration.
    UI for configuring settings in a software app with "Enable Streaming" checkbox. Options for adding key/value pairs. Buttons: Update Probe and Cancel.
  10. Finally, select the Data source: Time Series view to see the history of changes in the data value monitored by this stage.
    Dashboard monitoring BFD status with "Stage: Detect BFD Down" label. Highlights include data source: Time Series, aggregation type: last, time range: Last 1 Hour, and a table showing system IDs, neighbor IPs, anomaly status, and values. Two rows highlighted in red with a mix of true and false values, indicating BFD status changes. Tooltip shows timestamp and duration details.
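The three-stage pipeline built above can be sketched end to end as follows. The data and helper logic are hypothetical, for illustration only (not Apstra's implementation): collector output feeds a Match_Count grouped by system_id, which feeds a Range check that raises an anomaly when the count is more than or equal to 1.

```python
from collections import Counter

# Stage 1: one row per BFD session, as ingested by the collector processor.
bfd_status = [
    {"system_id": "leaf1", "neighbor_ip": "10.0.0.1", "value": "Up"},
    {"system_id": "leaf1", "neighbor_ip": "10.0.0.2", "value": "Down"},
    {"system_id": "leaf2", "neighbor_ip": "10.0.0.3", "value": "Up"},
]

# Stage 2 (Match_Count): count rows matching the reference state "Down",
# grouped by system_id; devices with no Down sessions get a count of 0.
down = Counter(r["system_id"] for r in bfd_status if r["value"] == "Down")
per_device = {sid: down.get(sid, 0)
              for sid in {r["system_id"] for r in bfd_status}}

# Stage 3 (Range): anomalous when the count is more than or equal to 1.
anomalies = {sid: count >= 1 for sid, count in per_device.items()}
print(anomalies)  # leaf1 -> True (one Down session), leaf2 -> False
```

With metric logging enabled on the second stage, Apstra additionally records the per-device counts over time, which is what the Time Series view in the last step displays.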