Telemetry and Monitoring
AI cluster networks demand lossless, high-throughput, and low-latency connectivity. A key component of maintaining performance is the collection and analysis of operational data to monitor congestion, system health, and traffic patterns. Junos OS telemetry enables detailed tracking of critical performance indicators, including thresholds, counters, and congestion metrics specific to AI workloads. Once collected, this data must be analyzed, structured, and visualized to support monitoring, decision-making, and continuous network optimization.
The following sections describe how to configure the devices to enable data collection and outline key performance metrics recommended for the AI IP fabric solution.
Configuring QFX Switches to Provide Telemetry Information
To implement telemetry collection, the switches must be configured to allow gRPC-based access, as described in the OpenConfig and gRPC for Junos Telemetry Interface section of the Junos Telemetry Interface User Guide.
The following configuration was used on all the leaf and spine node devices for this purpose:
user@spine1> show configuration system services extension-service
request-response {
    grpc {
        ssl {
            port 32767;
            local-certificate aos_grpc;
        }
        routing-instance mgmt_junos;
    }
}
| Command | Description |
|---|---|
| extension-service request-response grpc | Enables the gRPC interface under the extension service framework, used for APIs like Junos Telemetry Interface (JTI) or third-party integrations. The client issues a request and waits for a response from the Junos OS server. |
| ssl port 32767 | Configures TCP port 32767 for communication using SSL encryption. |
| local-certificate aos_grpc | Configures authentication using a certificate named aos_grpc to secure the gRPC session. Follow the steps described in Configure gRPC Services to generate and install the necessary certificates. |
| routing-instance mgmt_junos | Binds the gRPC server to the mgmt_junos routing instance, meaning it listens only on the out-of-band management interface. |
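When the same gRPC settings have to be applied to every leaf and spine, it can help to render them as `set` commands from a few parameters. The following Python sketch does that; the function name `grpc_set_commands` is hypothetical, and the `request-response` level is taken from the table above. Treat it as an illustration of the hierarchy, not as a provisioning tool.

```python
def grpc_set_commands(port: int, certificate: str, routing_instance: str) -> list[str]:
    """Render Junos set commands for the gRPC extension service.

    The hierarchy mirrors the statements described in the table above;
    the parameter names are illustrative.
    """
    base = "set system services extension-service request-response grpc"
    return [
        f"{base} ssl port {port}",
        f"{base} ssl local-certificate {certificate}",
        f"{base} routing-instance {routing_instance}",
    ]


# Values used in this document's sample configuration.
for command in grpc_set_commands(32767, "aos_grpc", "mgmt_junos"):
    print(command)
```

The certificate itself still has to be generated and installed on each device before these statements commit cleanly.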
To validate connectivity between the switch and the telemetry collectors, use the show system connections command:
jnpr@stripe2-leaf1> show system connections | match "Address|32767"
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp6       0      0 :::32767                :::*                    LISTEN      11937/jsd
tcp6       0      0 10.161.33.38:32767      10.100.1.17:56634       ESTABLISHED 11937/jsd
tcp6       0      0 10.161.33.38:32767      10.100.1.20:53184       ESTABLISHED 11937/jsd
tcp6       0      0 10.161.33.38:32767      10.100.1.20:53170       ESTABLISHED 11937/jsd
The sample output shows connections from two collectors (10.100.1.17 and 10.100.1.20).
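If this check is repeated across many switches, the list of connected collectors can be extracted from the command output programmatically. The sketch below is one way to do that in Python, assuming the column layout shown in the sample output; the function name `collectors_on_port` is hypothetical, and how the output is retrieved from the device is left out.

```python
def collectors_on_port(output: str, port: int = 32767) -> set[str]:
    """Return remote IPs with ESTABLISHED sessions to the local gRPC port.

    Assumes whitespace-separated columns in this order:
    proto, recv-q, send-q, local address, foreign address, state, pid/program.
    """
    peers = set()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue  # skips the header and the LISTEN socket
        local, foreign = fields[3], fields[4]
        if local.rsplit(":", 1)[-1] != str(port):
            continue  # session is not on the gRPC port
        peers.add(foreign.rsplit(":", 1)[0])
    return peers


# Sample output copied from this document.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp6 0 0 :::32767 :::* LISTEN 11937/jsd
tcp6 0 0 10.161.33.38:32767 10.100.1.17:56634 ESTABLISHED 11937/jsd
tcp6 0 0 10.161.33.38:32767 10.100.1.20:53184 ESTABLISHED 11937/jsd
tcp6 0 0 10.161.33.38:32767 10.100.1.20:53170 ESTABLISHED 11937/jsd
"""

print(sorted(collectors_on_port(SAMPLE)))  # ['10.100.1.17', '10.100.1.20']
```

Returning a set collapses multiple sessions from the same collector, which matches how the sample output is read above: three established sessions, two collectors.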
To confirm that the collectors are actively pulling data via gRPC/gNMI, and to see which sensors are in use, use the show network-agent statistics command:
jnpr@stripe2-leaf1> show network-agent statistics
Subscription Details :
Subscription ID : 1
Type : juniper
Client IP : ipv6:::ffff:10.100.1.17:56634
Subscription Time (UTC) : Thu May 1 12:38:57 2025
Sensor Statistics :
Sensor Path : /network-instances/network-instance/mac-table/entries/entry/
Reporting Interval : 0
Component(s) : l2aldTM,l2ald
Child Sensor Statistics :
Path : /network-instances/network-instance/mac-table/entries/entry/
Component : l2ald
Component-ID : 65535
Path : /network-instances/network-instance/mac-table/entries/entry/
Component : l2aldTM
Component-ID : 65535
Subscription Details :
Subscription ID : 2
Type : gnmi
Client IP : ipv6:::ffff:10.100.1.17:56634
GNMI mode : STREAM
Subscription Time (UTC) : Thu May 1 12:38:57 2025
Sensor Statistics :
Sensor Path : /interfaces/interface/state/admin-status/
Reporting Interval : 120
Component(s) : re0/mib2d
GNMI Sub Mode : SAMPLE
Component ID : 65535
Sensor Path : /interfaces/interface/state/oper-status/
Reporting Interval : 120
Component(s) : re0/mib2d,evo-pfemand
GNMI Sub Mode : SAMPLE
Child Sensor Statistics :
Path : /interfaces/interface/state/oper-status/
Component : evo-pfemand
GNMI-SubMode : SAMPLE
Component-ID : 0
Path : /interfaces/interface/state/oper-status/
Component : re0/mib2d
GNMI-SubMode : SAMPLE
Component-ID : 65535
Sensor Statistics :
Sensor Path : /interfaces/interface/subinterfaces/subinterface/state/admin-status/
Reporting Interval : 120
Component(s) : re0/mib2d
GNMI Sub Mode : SAMPLE
Component ID : 65535
Sensor Path : /interfaces/interface/subinterfaces/subinterface/state/oper-status/
Reporting Interval : 120
Component(s) : re0/mib2d,evo-pfemand
GNMI Sub Mode : SAMPLE
Child Sensor Statistics :
Path : /interfaces/interface/subinterfaces/subinterface/state/oper-status/
Component : evo-pfemand
GNMI-SubMode : SAMPLE
Component-ID : 0
Path : /interfaces/interface/subinterfaces/subinterface/state/oper-status/
Component : re0/mib2d
GNMI-SubMode : SAMPLE
Component-ID : 65535
Subscription Details :
Subscription ID : 3
Type : juniper
Client IP : ipv6:::ffff:10.100.1.17:56634
Subscription Time (UTC) : Thu May 1 12:39:01 2025
Sensor Statistics :
Sensor Path : /junos/system/linecard/qmon-sw/
Reporting Interval : 5
Component(s) : evo-pfemand
Component ID : 0
Subscription Details :
Subscription ID : 4
Type : gnmi
Client IP : ipv6:::ffff:10.161.38.48:39588
GNMI mode : STREAM
Subscription Time (UTC) : Thu May 1 12:39:15 2025
Sensor Statistics :
Sensor Path : /components/component/cpu/utilization/
Reporting Interval : 2
Component(s) : re0/ehmd
GNMI Sub Mode : SAMPLE
Component ID : 65535
Subscription Details :
Subscription ID : 5
Type : juniper
Client IP : ipv6:::ffff:10.161.38.48:57182
Subscription Time (UTC) : Thu May 1 12:39:04 2025
Sensor Statistics :
Sensor Path : /junos/system/linecard/npu/memory/
Reporting Interval : 2
Component(s) : evo-pfemand
Component ID : 0
Subscription Details :
Subscription ID : 6
Type : juniper
Client IP : ipv6:::ffff:10.161.38.48:57182
Subscription Time (UTC) : Thu May 1 12:39:04 2025
Sensor Statistics :
Sensor Path : /junos/system/linecard/interface/
Reporting Interval : 2
Component(s) : picd,evo-pfemand
Child Sensor Statistics :
Path : /junos/system/linecard/interface/
Component : evo-pfemand
Component-ID : 0
Path : /junos/system/linecard/interface/
Component : picd
Component-ID : 0
Subscription Details :
Subscription ID : 7
Type : juniper
Client IP : ipv6:::ffff:10.161.38.48:57182
Subscription Time (UTC) : Thu May 1 12:39:04 2025
Sensor Statistics :
Sensor Path : /junos/system/linecard/qmon-sw/
Reporting Interval : 2
Component(s) : evo-pfemand
Component ID : 0
Subscription Details :
Subscription ID : 8
Type : juniper
Client IP : ipv6:::ffff:10.161.38.48:57182
Subscription Time (UTC) : Thu May 1 12:39:04 2025
Sensor Statistics :
Sensor Path : /junos/system/linecard/interface/queue/
Reporting Interval : 2
Component(s) : Not available
Component ID : 65535
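The output above is long, so a short summary of which sensor paths each subscription carries, and at what reporting interval, can be easier to review. The following Python sketch builds such a summary from the `Key : Value` lines; the function name `parse_subscriptions` and the dictionary layout are assumptions made for illustration, and only the fields shown in the sample output are handled.

```python
def parse_subscriptions(output: str) -> list[dict]:
    """Summarize 'show network-agent statistics' output per subscription.

    Only top-level 'Sensor Path' entries are collected; the 'Path' lines
    under 'Child Sensor Statistics' are intentionally ignored.
    """
    subs = []
    for line in output.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        key, value = key.strip(), value.strip()
        if not sep:
            continue
        if key == "Subscription ID":
            subs.append({"id": int(value), "type": None, "sensors": []})
        elif not subs:
            continue  # ignore anything before the first subscription
        elif key == "Type":
            subs[-1]["type"] = value
        elif key == "Sensor Path":
            subs[-1]["sensors"].append({"path": value, "interval": None})
        elif key == "Reporting Interval" and subs[-1]["sensors"]:
            subs[-1]["sensors"][-1]["interval"] = int(value)
    return subs


# Excerpt of the sample output from this document (subscriptions 3 and 4).
SAMPLE = """\
Subscription Details :
Subscription ID : 3
Type : juniper
Client IP : ipv6:::ffff:10.100.1.17:56634
Sensor Statistics :
Sensor Path : /junos/system/linecard/qmon-sw/
Reporting Interval : 5
Component(s) : evo-pfemand
Subscription Details :
Subscription ID : 4
Type : gnmi
Client IP : ipv6:::ffff:10.161.38.48:39588
Sensor Statistics :
Sensor Path : /components/component/cpu/utilization/
Reporting Interval : 2
"""

for sub in parse_subscriptions(SAMPLE):
    for sensor in sub["sensors"]:
        print(sub["id"], sub["type"], sensor["path"], sensor["interval"])
```

Splitting on the first colon only keeps values such as `ipv6:::ffff:10.100.1.17:56634` intact.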
jnpr@stripe1-leaf1> ...scription-paths /interfaces/interface/state/oper-status/ detail
Subscription Details :
Subscription ID : 2
Type : gnmi
Client IP : ipv6:::ffff:10.161.53.17:56132
GNMI mode : STREAM
Subscription Time (UTC) : Thu May 1 14:49:53 2025
Sensor Statistics :
Sensor Path : /interfaces/interface/state/oper-status/
Reporting Interval : 120
Component(s) : re0/mib2d,evo-pfemand
GNMI Sub Mode : SAMPLE
Average iLatency (ms) : 3
Average Circular Buffer Used (%) : 0
Bytes Sent : 2328768
Packets Sent : 8939
Drops : 0
Initial Sync Bytes Sent : 40679
Initial Sync Packets Sent : 165
Initial Sync Drops : 0
Initial Sync Average iLatency (ms) : 4
Initial Sync Average Circular Buffer Used (%) : 0
Child Sensor Statistics :
Path : /interfaces/interface/state/oper-status/
Component : evo-pfemand
GNMI-SubMode : SAMPLE
Component-ID : 0
Average iLatency (ms) : 2
Bytes Sent : 1087006
Packets Sent : 4187
Drops : 0
Initial Sync Bytes Sent : 19165
Initial Sync Packets Sent : 78
Initial Sync Drops : 0
Initial Sync Average iLatency (ms) : 2
Path : /interfaces/interface/state/oper-status/
Component : re0/mib2d
GNMI-SubMode : SAMPLE
Component-ID : 65535
Average iLatency (ms) : 5
Bytes Sent : 1241762
Packets Sent : 4752
Drops : 0
Initial Sync Bytes Sent : 21514
Initial Sync Packets Sent : 87
Initial Sync Drops : 0
Initial Sync Average iLatency (ms) : 5
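In the detailed output, the drop counters are the quickest indication of whether the switch is failing to deliver telemetry to a collector. As a rough sketch, the Python function below scans the output for any counter whose name ends in "Drops" and reports the non-zero ones; the function name `nonzero_drop_counters` is hypothetical, and the second sample string is synthetic, made up only to show a non-empty result.

```python
def nonzero_drop_counters(output: str) -> list[tuple[str, int]]:
    """Return (counter name, value) for every '...Drops' counter above zero."""
    hits = []
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key.endswith("Drops") and value.isdigit() and int(value) > 0:
            hits.append((key, int(value)))
    return hits


# Lines copied from the sample output in this document: no drops reported.
CLEAN = """\
Packets Sent : 8939
Drops : 0
Initial Sync Packets Sent : 165
Initial Sync Drops : 0
"""

# Synthetic example (not from a real device) with a non-zero drop counter.
LOSSY = """\
Packets Sent : 8939
Drops : 3
Initial Sync Drops : 0
"""

print(nonzero_drop_counters(CLEAN))  # []
print(nonzero_drop_counters(LOSSY))  # [('Drops', 3)]
```

The function reports counters individually rather than summing them, since the top-level and per-component counters in the output may describe the same packets.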
To confirm the status of the sensors, use the show agent sensors command:
jnpr@stripe1-leaf1> show agent sensors
Sensor Information :
Name : sensor_1000
Resource : /network-instances/network-instance/mac-table/entries/entry/
Version : 1.0
Sensor-id : 562949953421313
Subscription-ID : 1000
Component(s) : re0/l2ald-agent
Profile Information :
Name : export_1000
Reporting-interval : 0
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : GPB
Sensor Information :
Name : sensor_1001
Resource : /interfaces/interface/state/admin-status/
Version : 1.0
Sensor-id : 562949953421443
Subscription-ID : 1001
Component(s) : re0/mib2d
Profile Information :
Name : export_1001
Reporting-interval : 120
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : JSON
Sensor Information :
Name : sensor_1002
Resource : /interfaces/interface/state/oper-status/
Version : 1.0
Sensor-id : 562949953421314
Subscription-ID : 1002
Component(s) : re0/evoaft-jvisiond-brcm,re0/mib2d
Profile Information :
Name : export_1002
Reporting-interval : 120
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : JSON
Sensor Information :
Name : sensor_1003
Resource : /interfaces/interface/subinterfaces/subinterface/state/admin-status/
Version : 1.0
Sensor-id : 562949953421444
Subscription-ID : 1003
Component(s) : re0/mib2d
Profile Information :
Name : export_1003
Reporting-interval : 120
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : JSON
Sensor Information :
Name : sensor_1004
Resource : /interfaces/interface/subinterfaces/subinterface/state/oper-status/
Version : 1.0
Sensor-id : 562949953421316
Subscription-ID : 1004
Component(s) : re0/evoaft-jvisiond-brcm,re0/mib2d
Profile Information :
Name : export_1004
Reporting-interval : 120
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : JSON
Sensor Information :
Name : sensor_1005
Resource : /components/component/cpu/utilization/
Version : 1.0
Sensor-id : 562949953421450
Subscription-ID : 1005
Component(s) : re0/ehmd
Profile Information :
Name : export_1005
Reporting-interval : 2
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : GPB
Sensor Information :
Name : sensor_1006
Resource : /junos/system/linecard/npu/memory/
Version : 1.0
Sensor-id : 562949953421449
Subscription-ID : 1006
Component(s) : re0/evoaft-jvisiond-brcm
Profile Information :
Name : export_1006
Reporting-interval : 2
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : GPB
Sensor Information :
Name : sensor_1007
Resource : /junos/system/linecard/qmon-sw/
Version : 1.0
Sensor-id : 562949953421452
Subscription-ID : 1007
Component(s) : re0/evoaft-jvisiond-brcm
Profile Information :
Name : export_1007
Reporting-interval : 2
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : GPB
Sensor Information :
Name : sensor_1008
Resource : /interfaces/interface/state/
Version : 1.0
Sensor-id : 562949953421451
Subscription-ID : 1008
Component(s) : re0/evoaft-jvisiond-brcm,re0/mgmt-ethd,re0/mib2d
Profile Information :
Name : export_1008
Reporting-interval : 2
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : GPB
Sensor Information :
Name : sensor_1009
Resource : /junos/system/linecard/qmon-sw/
Version : 1.0
Sensor-id : 562949953421427
Subscription-ID : 1009
Component(s) : re0/evoaft-jvisiond-brcm
Profile Information :
Name : export_1009
Reporting-interval : 5
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : GPB
Sensor Information :
Name : sensor_1011
Resource : /lldp/state/enabled/
Version : 1.0
Sensor-id : 562949953421493
Subscription-ID : 1011
Component(s) : re0/l2cpd-agent
Profile Information :
Name : export_1011
Reporting-interval : 30
Payload-size : 5000
Address : 0.0.0.0
Port : 1000
Timestamp : ntp
Format : JSON
Recommended KPIs to Monitor
| KPI | JUNOS COMMAND | SENSOR |
|---|---|---|
| Interface State | show interfaces <interface> terse | /interfaces/interface[name=<interface>]/state/oper-status /interfaces/interface[name=<interface>]/state/admin-status |
| Interface Description | show interfaces <interface> extensive \| match Description | /interfaces/interface[name=<interface>]/state/description |
| Interface MTU | show interfaces <interface> extensive \| match MTU | /interfaces/interface[name=<interface>]/state/mtu |
| Interface speed | show interfaces <interface> extensive \| match speed | /interfaces/interface[name=<interface>]/state/high-speed |
| Interface input Drops | show interfaces <interface> extensive \| find "Input errors" | /interfaces/interface[name=<interface>]/state/counters/in-discards |
| Interface output Drops | show interfaces <interface> extensive \| find "Output errors" | /interfaces/interface[name=<interface>]/state/counters/out-discards |
| Interface output Pkts | show interfaces <interface> extensive \| match "Total Packets" | /interfaces/interface[name=<interface>]/state/counters/out-pkts |
| Interface output unicast Pkts | show interfaces <interface> extensive \| match Unicast | /interfaces/interface[name=<interface>]/state/counters/out-unicast-pkts |
| Interface input Pkts | show interfaces <interface> extensive \| match "Total Packets" | /interfaces/interface[name=<interface>]/state/counters/in-pkts |
| Interface input unicast Pkts | show interfaces <interface> extensive \| match Unicast | /interfaces/interface[name=<interface>]/state/counters/in-unicast-pkts |
| Per interface ECN marked packets | show interfaces <interface> extensive \| match ecn | /state/interfaces/interface[name=<interface>]/counters/errors/out-ecn-ce-marked-pkts /junos/system/linecard/qmon-sw/ /cos/interfaces/interface/queues/queue/ecnMarkedPkts |
| Per interface per queue buffer-occupancy | show interfaces queue buffer-occupancy <interface> | /junos/system/linecard/qmon-sw/ /cos/interfaces/interface/queues/queue/peakBufferOccupancyPercent /cos/interfaces/interface/queues/queue/peakBufferOccupancy |
| Per Interface, Per forwarding class (queue) Tail Drops | show interfaces queue <interface> forwarding-class <forwarding-class> \| match "Tail" | /junos/system/linecard/qmon-sw/ /cos/interfaces/interface/queues/queue/tailDropPkts |
| Per Interface PFC Pause frames | show interfaces <interface> extensive \| match "Priority : <priority>" | /interfaces/interface[name=<interface-name>]/ethernet/state/counters/in-pause-pkts /interfaces/interface[name=<interface-name>]/ethernet/state/counters/out-pause-pkts |
| IPv6 BGP advertised routes | show route advertising-protocol bgp <neighbor-address> extensive (<neighbor-address> = auto-discovered link-local address of the directly connected EBGP neighbor) | /network-instances/network-instance/protocols/protocol/bgp/rib/afi-safis/afi-safi/ipv6-unicast/neighbors/neighbor/adj-rib-out-pre/routes/ |
| IPv6 BGP Received routes | show route receive-protocol bgp <neighbor-address> extensive (<neighbor-address> = auto-discovered link-local address of the directly connected EBGP neighbor) | /network-instances/network-instance/protocols/protocol/bgp/rib/afi-safis/afi-safi/ipv6-unicast/neighbors/neighbor/adj-rib-in-pre/routes/ /network-instances/network-instance/protocols/protocol/bgp/rib/afi-safis/afi-safi/ipv6-unicast/neighbors/neighbor/adj-rib-in-post/routes/ |
Refer to Network Configuration Example: AI/ML - Telemetry Reference Guide for more details.
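Most of the counters in the KPI table (discards, packets, pause frames, ECN-marked packets) are cumulative, so a collector typically turns two consecutive samples into a rate before alerting or graphing. The Python sketch below shows that calculation under two stated assumptions: the counters are 64-bit unsigned values, and a decrease between samples is a wrap rather than a device or counter reset. The function name `counter_rate` is hypothetical.

```python
COUNTER_MODULUS = 2**64  # assumption: counters are 64-bit unsigned integers


def counter_rate(previous: int, current: int, interval_s: float) -> float:
    """Return the per-second rate between two samples of a cumulative counter.

    A negative delta is treated as a single counter wrap. A real collector
    would also need to detect resets (for example, after a reboot).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = current - previous
    if delta < 0:
        delta += COUNTER_MODULUS
    return delta / interval_s


# Two in-discards samples taken 2 seconds apart (illustrative values).
print(counter_rate(1000, 1600, 2))  # 300.0
```

The sampling interval passed in should match the reporting interval of the subscription that delivers the counter.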
Known Limitations:
The following issue has been reported for 23.4X100-D41.2-EVO:
- Pre-FEC BER telemetry data is not supported on the QFX5240-64OD switch.