
Supported KPIs in Observability

Key Performance Indicators (KPIs) are metrics used to monitor and evaluate the health, performance, and quality of your network.

Starting in Release 2.8.0, Routing Director uses a dial-in gNMI connection in the Sample subscription mode to collect KPIs such as interface status and usage, routing protocol states, route counts, system CPU, and memory usage for analysis and visualization. In earlier releases, a dial-out gNMI connection was used to collect KPIs. KPIs help you proactively identify issues, ensure service quality, and maintain optimal network operations. Routing Director collects the following types of KPIs:

  • Default KPIs: These are system-generated KPIs. Routing Director automatically monitors and analyzes these KPIs to assess device and network health. Routing Director collects these KPIs based on predefined rules.

  • Custom KPIs: These are user-defined KPIs. You can use Routing Director to define and monitor KPIs tailored to your specific network needs. Routing Director collects these KPIs based on custom rules.

For Routing Director to establish a dial‑in connection for collecting device telemetry, the device must meet at least one of the following conditions:

  • The device must be managed by a network implementation plan with the Observability use case selected.

  • An OpenConfig Custom KPI rule must be instantiated on the device.

Routing Director opens the gNMI dial-in connection to the device on port 32767 to collect device telemetry. When you upgrade from an earlier release to Routing Director Release 2.8.0, all dial-out connections automatically change to dial-in connections without any data loss.

Note:
  • Dial-in gNMI connections are not supported on devices running Junos OS and Junos OS Evolved versions 24.2R1, 24.2R2, and 24.4R1. This limitation occurs because gNMI connections fail when certificate verification is enabled on devices running these versions.

    As a result, starting with Juniper Routing Director Release 2.8.0, certificate verification is disabled by default on Juniper devices.

    For all other Junos OS and Junos OS Evolved versions, enable client certificate validation for the gNMI connection by executing the following curl command:

  • Ensure that the firewall rules in your network allow connections from all JRD Controller node IP addresses to devices over port 32767.
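The firewall requirement above can be spot-checked from a controller node with a plain TCP connection attempt. The following is a minimal sketch, not a Routing Director tool: the function name can_reach and the timeout value are assumptions made for illustration, and a successful TCP connection shows only that the port is not blocked, not that a gNMI session would be established.

```python
import socket

GNMI_DIAL_IN_PORT = 32767  # port stated above for the gNMI dial-in connection


def can_reach(host: str, port: int = GNMI_DIAL_IN_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it tests TCP reachability only.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running can_reach("<device-address>") from each controller node, with a device management address in place of the placeholder, gives a quick indication of whether a firewall rule is still missing.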

Routing Director generates alerts based on the anomalies in the KPIs (both system-generated and custom KPIs) in your network. You can view a graphical representation of the performance and alerts generated for all KPIs associated with a device.

Routing Director uses Junos Telemetry to collect KPIs for Junos network devices. Data is collected at the following frequency for each data model (sensor model):

  • OpenConfig—60 seconds
  • NETCONF—180 seconds
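To make the intervals above concrete, the sketch below converts them into samples per hour and shows how a 60-second interval would appear in a gNMI Sample-mode subscription, where the gNMI specification expresses sample_interval in nanoseconds. The dictionary only mirrors the field names of the gNMI Subscription message. The helper names and the example path are assumptions, and the exact request that Routing Director sends is not documented here.

```python
# Collection intervals from the list above, in seconds.
COLLECTION_INTERVAL_S = {"OpenConfig": 60, "NETCONF": 180}

NANOSECONDS_PER_SECOND = 1_000_000_000


def samples_per_hour(interval_s: int) -> int:
    """Number of collections in one hour at the given interval."""
    return 3600 // interval_s


def openconfig_sample_subscription(path: str) -> dict:
    """Build a dict mirroring the gNMI Subscription message fields.

    Illustrative only: the path is supplied by the caller, and this is
    not the request Routing Director is known to send.
    """
    return {
        "path": path,
        "mode": "SAMPLE",
        # gNMI carries the sample interval in nanoseconds.
        "sample_interval": COLLECTION_INTERVAL_S["OpenConfig"] * NANOSECONDS_PER_SECOND,
    }


if __name__ == "__main__":
    for model, interval in COLLECTION_INTERVAL_S.items():
        print(f"{model}: {samples_per_hour(interval)} samples per hour")
    # Example OpenConfig path, chosen for illustration.
    print(openconfig_sample_subscription("/interfaces/interface/state/counters"))
```

At these intervals, an OpenConfig sensor yields 60 samples per hour and a NETCONF sensor yields 20.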

Table 1 lists the KPIs that are supported in the Observability use case.

Table 1: Supported KPIs in Observability
| Domain | Rule Name | Sensor Name | Sensor Type | Field Name | Field Count (total number of fields evaluated per rule) | Number of Metrics Stored in TSDB per Rule |
|---|---|---|---|---|---|---|
| bfd | check-bfd-session-state | bfd-sensor | iAgent | remote-state, session-neighbor, session-state | 3 | 4 |
| vpn | check-evpn-instance-state | evpn | iAgent | evpn-instance-name, evpn-interface-mode, evpn-interface-name, evpn-interface-status | 4 | 4 |
| vpn | check-evpn-neighbor | evpnNeighbors | iAgent | evpn-instance-name, evpn-num-neighbors | 2 | 3 |
| vpn | evpn-check-mac-count-netconf | evpn-mac-count | iAgent | instance-name, learn-vlan, mac-count, threshold | 4 | 5 |
| chassis | check-chassis-power-fan-temperature | chassis-netconf | iAgent | class, comment, name, status, temperature | 5 | 3 |
| chassis | check-psm-temperature | psm-temperature-netconf | iAgent | psm, psm-temperature-high-threshold, psm-temperature-low-threshold, temperature | 4 | 5 |
| chassis | check-fan-state-rpm | fan-netconf | iAgent | critical-rpm-threshold, fan-name, high-rpm-threshold, high-threshold, low-threshold, measurement, measurements, rpm-high-threshold, rpm-low-threshold, rpm-percent, rpm-percent-anomaly, rpm-percents, status | 13 | 12 |
| chassis | check-psm-power-usage-state | components-oc | open-config | power-dc-output, psm, psm-power-capacity-maximum, psm-power-usage, psm-power-usage-anomaly, psm-power-usage-high-threshold, psm-power-usage-low-threshold, psm-state, psm-temperature, psm-temperature-degrees, psm-temperature-high-threshold, psm-temperature-low-threshold | 12 | 12 |
| chassis | check-routing-engine-temperature | components-oc | open-config | high-threshold, low-threshold, routing-engine, routing-engine-cpu-temperature, routing-engine-cpu-temperature-anomaly, routing-engine-temperature, routing-engine-temperature-anomaly | 7 | 12 |
| chassis | check-system-power-usage-temp-state | components-oc | open-config | chassis, chassis-temperature, high-threshold, low-threshold, power-system-maximum, power-system-remaining, system-power-remaining-in-percentage, system-power-usage-high-threshold, system-power-usage-low-threshold | 9 | 8 |
| fpc | check-fpc-cpu-memory-state-temp | components-oc | open-config | cpu-high-threshold, cpu-low-threshold, fpc, fpc-cpu-utilization, fpc-cpu-utilization-anomaly, fpc-memory-buffer, fpc-memory-buffer-anomaly, fpc-memory-heap, fpc-utilization-idle, memory-high-threshold, memory-low-threshold, state, temp-high-threshold, temp-low-threshold, temperature, temperature-anomaly | 16 | 22 |
| fpc | check-pfe-discards | pfe-sensor-netconf | iAgent | bad-route-discard, bad-route-discard-rate, bits-to-test-discard, bits-to-test-discard-rate, data-error-discard, data-error-discard-rate, drop-threshold, fabric-discard, fabric-discard-rate, info-cell-discard, info-cell-discard-rate, invalid-iif-discard, invalid-iif-discard-rate, nexthop-discard, nexthop-discard-rate, stack-overflow-discard, stack-overflow-discard-rate, stack-underflow-discard, stack-underflow-discard-rate, tcp-header-error-discard, tcp-header-error-discard-rate | 21 | 23 |
| system | check-ntp-synchronization-status | ntp-status | iAgent | clock-jitter, offset, peer, precision, reference-id, reference-time, root-delay, root-dispersion, status-info, stratum | 10 | 5 |
| system | check-system-cpu-memory | components-oc | open-config | re-cpu-utilization, re-cpu-utilization-anomaly, re-cpu-utilization-high-threshold, re-cpu-utilization-low-threshold, re-memory-buffer, re-memory-buffer-anomaly, re-memory-buffer-high-threshold, re-memory-buffer-low-threshold, routing-engine | 9 | 14 |
| interface | check-physical-interface-traffic | ifd |  | egress-stats-if-bps, egress-stats-if-octets, egress-stats-if-pkts, egress-stats-if-pps, elapsed-time, if-name, ingress-stats-if-bps, ingress-stats-if-octets, ingress-stats-if-pkts, ingress-stats-if-pps, stats_received_count | 11 | 3 |
| interface | check-ifl-state | interfaces-oc | open-config | high-threshold, ifl-oper-status, in-bandwidth, in-mbps, in-octets, in-util, interface-name, low-threshold, out-bandwidth, out-mbps, out-octets, out-util, sub-interface-index | 13 | 9 |
| interface | check-interface-fec-crc-framing-errors | errorinfo-netconf | iAgent | drop-threshold, fec-uncorrected, framing-errors, input-crc-errors, interface-name, optical-fec-corrected, output-crc-errors | 7 | 7 |
| interface | check-interface-in-out-errors-traffic-state-flaps | interfaces-oc | open-config | admin-state, flaps, flaps-threshold, high-threshold, in-errors-count, in-errors-threshold, in-mbps, in-mbps-anomaly, in-octets, in-util, interface-name, link-state, low-threshold, out-errors-count, out-errors-threshold, out-mbps, out-mbps-anomaly, out-octets, out-util, speed | 20 | 23 |
| interface | check-optical-signal-loss-fec-tx-rx-power | optical-sensor-oc | open-config | fec-uncorrected, interface-name, lane-index, optics-current, optics-rx-power, optics-rx-power-anomaly, optics-tx-power, optics-tx-power-anomaly, rx-high-alarm-threshold, rx-high-warning-threshold, rx-loss-of-signal-alarm, rx-low-alarm-threshold, rx-low-warning-threshold, tx-high-alarm-threshold, tx-high-warning-threshold, tx-laser-disabled-alarm, tx-loss-of-signal-functionality-alarm, tx-low-alarm-threshold, tx-low-warning-threshold | 19 | 22 |
| interface | check-optical-temp-thresholds | temperature-thresholds-oc | open-config | high-alarm-threshold, high-warning-threshold, interface-name, optical-temp, optical-temp-anomaly | 5 | 8 |
| lldp | check-lldp-session | lldp-sensor | iAgent | interface-name, lldp-neighbor-count | 2 | 3 |
| oam | get-lfm-information | link-fault-management-information | iAgent | lfm-discovery-state, lfm-interface-name, lfm-status | 3 | 4 |
| bgp | check-bgp-neighbor-prefixes | bgp-netconf | iAgent | address-family, advertised-route-count-threshold, advertised-routes, instance-name, peer-address, received-route-count-threshold, received-routes | 7 | 6 |
| bgp | check-bgp-neighbor-stats | bgp-netconf | iAgent | flap-count, flap-count-threshold, instance-name, peer-address, peer-state | 5 | 5 |
| routes | collect-fib-stats | fib-sensor | iAgent | address-family, fib-route-count, route-table-type, table-name, threshold | 5 | 4 |
| isis | check-isis-adjacency-status | isis-netconf | iAgent | adjacency-state, interface-name, level, system-name | 4 | 3 |
| isis | check-isis-flap-detection | isis-netconf | iAgent | flap-threshold, interface-name, transition-count | 3 | 4 |
| isis | check-isis-statistics | isis-sensor | open-config | csnp-drops, esh-drops, iih-drops, interface-name, ish-drops, lsp-drops, psnp-drops, threshold, unknown-drops | 9 | 10 |
| mpls | check-te-rsvp-interface-errors | lsp | open-config | authentication-fail, bad-checksum, bad-packet-format, bad-packet-length, bad-packet-version, in-path-error-messages, in-reservation-error-messages, message-out-of-order, out-path-error-messages, out-reservation-error-messages, received-nack, recv-pkt-disabled-intf, send-failure, state-timeout, te-interface, unknown-ack, unknown-nack | 17 | 16 |
| mpls | check-te-rsvp-global-errors | lsp | open-config | authentication-fail, bad-checksum, bad-packet-format, bad-packet-length, error-threshold, in-path-error-messages, in-reservation-error-messages, instance-name, out-path-error-messages, out-reservation-error-messages, received-nack, unknown-ack, unknown-nack | 13 | 14 |
| mpls | check-ldp-session | ldp-oc | open-config | lsr-id, session-state | 2 | 3 |
| mpls | check-lsp-state | lsp | open-config | lsp-name, lsp-state-change-count, oper-status | 3 | 4 |
| mpls | check-rsvp-neighbor-state | rsvp | open-config | neighbor-address, neighbor-interface, neighbor-state | 3 | 4 |
| ospf | check-ospf-io-statistics | ospf-io-statistics-netconf | iAgent | error-threshold, ospf-error, packets-read | 3 | 4 |
| ospf | check-ospf-neighbor-state | ospf-neighbor-netconf | iAgent | dr-address, instance-name, interface-name, neighbor-address, neighbor-id, ospf-neighbor-state | 6 | 3 |
| ospf | check-ospf-statistics | ospf-statistics-netconf | iAgent | hello-count-threshold, hello-received, hello-sent, ospf-packet-type | 4 | 5 |
| ospf | check-ospf3-io-statistics | ospf-io-statistics-netconf | iAgent | error-threshold, ospf-error | 2 | 4 |
| ospf | check-ospf3-neighbor-state | ospf-neighbor-netconf | iAgent | instance-name, interface-name, neighbor-address, neighbor-id, ospf-neighbor-state | 5 | 3 |
| ospf | check-ospf3-statistics | ospf-statistics-netconf | iAgent | hello-count-threshold, hello-received, hello-sent, ospf-packet-type | 4 | 5 |
| routes | collect-rib-table-protocol-routes | route-protocol-summary | iAgent | active-route-count, protocol-name, protocol-total-route-count, table-name, threshold | 5 | 5 |
| routes | collect-rib-table-routes | route-summary | iAgent | active-route-count, destination-count, hidden-route-count, holddown-route-count, table-name, table-total-route-count, threshold | 7 | 5 |
| vpn | check-evpn-view |  | network-rule | instance-ifl-no, instance-interface-name, instance-interface-status, pe-router-name, vpn-name, vpn-state | 6 | 8 |
| vpn | check-l2circuit-pw-state | l2ckt | iAgent | connection-id, connection-status, neighbor | 3 | 3 |
| vpn | check-l3vpn-bgp-state |  | network-rule | instance-ifl-no, instance-interface-name, instance-interface-status, neighbor-session, pe-router-name, vpn-name, vpn-state | 7 | 9 |
| vpn | check-l3vpn-ospf-state |  | network-rule | instance-ifl-no, instance-interface-name, instance-interface-status, neighbor-session, pe-router-name, vpn-name, vpn-state | 7 | 9 |
| vpn | check-l3vpn-ospf3-state |  | network-rule | instance-ifl-no, instance-interface-name, instance-interface-status, neighbor-session, pe-router-name, vpn-name, vpn-state | 7 | 9 |
| vpn | check-l3vpn-static-state |  | network-rule | instance-ifl-no, instance-interface-name, instance-interface-status, neighbor-address, pe-router-name, vpn-name | 6 | 7 |
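The TSDB metric counts in Table 1 can be combined with the collection intervals stated earlier to get a rough sense of write volume. The sketch below is a back-of-the-envelope estimate under two loudly stated assumptions that the source does not confirm: that iAgent rules are collected at the NETCONF interval (180 seconds), and that each collection writes one value per stored metric. Rules keyed by interface, neighbor, or similar write more in practice, so treat the result as a per-instance lower bound. The Rule class and writes_per_day function are hypothetical names.

```python
from dataclasses import dataclass

# Assumption: open-config sensors use the 60-second OpenConfig interval and
# iAgent sensors use the 180-second NETCONF interval.
INTERVAL_S = {"open-config": 60, "iAgent": 180}

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class Rule:
    name: str
    sensor_type: str
    tsdb_metrics: int  # "Number of metrics stored in TSDB per rule" from Table 1


# A few rows copied from Table 1.
RULES = [
    Rule("check-bfd-session-state", "iAgent", 4),
    Rule("check-ldp-session", "open-config", 3),
    Rule("check-system-cpu-memory", "open-config", 14),
]


def writes_per_day(rule: Rule) -> int:
    """Estimate values written per day for one keyed instance of a rule."""
    collections = SECONDS_PER_DAY // INTERVAL_S[rule.sensor_type]
    return collections * rule.tsdb_metrics


if __name__ == "__main__":
    for rule in RULES:
        print(f"{rule.name}: {writes_per_day(rule)} values per day")
```

Under these assumptions, check-bfd-session-state writes 1920 values per day per session, while check-system-cpu-memory writes 20160 per Routing Engine.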

Retrieve List of Sensors Streaming Telemetry Data

Use the show network-agent statistics gnmi command to list the sensors that are subscribed on the device and streaming data to Routing Director. The following is a sample output of the command.