Contrail Insights Network Device Monitoring Common Issues
JTI Timestamp is Off in Contrail Insights Chart
The timestamp of JTI data might not be synchronized with the user's current time. As a result, JTI data appears ahead of or behind the current time in the Contrail Insights charts.
To synchronize the JTI timestamp with the user's time:
Use Network Time Protocol (NTP) to sync the Junos device time. Verify that the time reported by the
show system uptime
command matches the time on the AppFormix-VM.
The JTI stream comes directly from the virtual Forwarding Engine (vFPC) and carries the vFPC timestamp. The vFPC and the vCP each run a separate NTP service, so force NTP sync between the vFPC and the vCP and remove the local time failover:
[root@vfpc]# vi /etc/ntp.conf
server 128.0.0.1 iburst    ####### 128.0.0.1 is vCP internal IP
#server 127.127.0.1        #### comment out LOCAL HARDWARE CLOCK
Then run:
[root@vfpc]# service ntpd stop && service ntpd start
Run the following command to check that offset is back to normal:
run ntpq -p
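The offset check above can be scripted. The following is a minimal Python sketch, assuming the usual `ntpq -p` column layout in which the last three columns are delay, offset, and jitter in milliseconds. The sample output and the 100 ms threshold are illustrative assumptions, not values from Contrail Insights.

```python
# Hypothetical sample of `ntpq -p` output, for illustration only.
SAMPLE_NTPQ_OUTPUT = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*128.0.0.1       10.0.0.5         3 u   12   64  377    0.215    0.034   0.021
 127.127.1.0     .LOCL.          10 l  35m   64    0    0.000  912.500   0.000
"""


def parse_ntpq_peers(output):
    """Return a list of (remote, offset_ms, selected) tuples from `ntpq -p` text."""
    peers = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # The first character of each peer row is the tally code
        # ('*' marks the peer currently selected for synchronization).
        tally, fields = line[0], line[1:].split()
        if len(fields) < 10:
            continue  # separator row or truncated line
        try:
            offset_ms = float(fields[-2])  # columns end with delay, offset, jitter
        except ValueError:
            continue  # header row
        peers.append((fields[0], offset_ms, tally == "*"))
    return peers


def peers_exceeding(peers, max_offset_ms=100.0):
    """Return the peers whose absolute offset is above the (assumed) threshold."""
    return [peer for peer in peers if abs(peer[1]) > max_offset_ms]
```

For example, `peers_exceeding(parse_ntpq_peers(text))` returns an empty list when every listed peer is within the chosen threshold.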
JTI Device Not Showing Data in the Chart
For troubleshooting information about a JTI device not showing data in the chart, see Contrail Insights JTI (UDP) Monitoring.
SNMP Device Not Reporting Data
An SNMP device might not report data for several reasons, including:
Device reachability.
MIBs not getting installed.
Contrail Insights plug-ins not distributing the device data to the correct Contrail Insights collector.
To correct device reachability or MIBs not getting installed:
Log in to your appformix_network_agents nodes. If you have multiple hosts in this aggregate, verify on all of these hosts.
Run:
cd /opt/appformix/manager/tailwind_manager/
Run the plug-in files directly from this folder. If a specific MIB is not working (for example, the plugin_config_file for that MIB is config_file.py), run the following command:
python check_snmp_network_device_template.py -d {ip} -f config_file -c {snmp_community} -v 2c
Change the command to match the SNMP version that the device uses.
Run the following command to check the possible variables in the script:
python check_snmp_network_device_template.py -h
To check the configuration file name of a plug-in, get the information from the JSON file of that plug-in in the certified_plugins folder of the Ansible installer.
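When several devices need the same check, the command line shown above can be assembled programmatically. The following is a minimal Python sketch that uses only the `-d`, `-f`, `-c`, and `-v` options that appear in the command above. Dropping the `.py` suffix from the configuration file name is inferred from that one example, and the helper names and the `public` community string are hypothetical.

```python
import shlex
import subprocess

TAILWIND_DIR = "/opt/appformix/manager/tailwind_manager/"  # folder named in the steps above
CHECK_SCRIPT = "check_snmp_network_device_template.py"


def config_name(plugin_config_file):
    """Drop a trailing .py, mirroring the example (config_file.py -> config_file)."""
    if plugin_config_file.endswith(".py"):
        return plugin_config_file[:-3]
    return plugin_config_file


def build_snmp_check_command(device_ip, plugin_config_file, community, version="2c"):
    """Return the argument list for one run of the SNMP check script."""
    return [
        "python", CHECK_SCRIPT,
        "-d", device_ip,
        "-f", config_name(plugin_config_file),
        "-c", community,
        "-v", version,
    ]


def commands_for_devices(device_ips, plugin_config_file, community, version="2c"):
    """Return one shell-quoted command string per device."""
    return [
        shlex.join(build_snmp_check_command(ip, plugin_config_file, community, version))
        for ip in device_ips
    ]


def run_check(device_ip, plugin_config_file, community, version="2c"):
    """Run the check from the tailwind_manager folder and return the completed process."""
    command = build_snmp_check_command(device_ip, plugin_config_file, community, version)
    return subprocess.run(command, cwd=TAILWIND_DIR, capture_output=True, text=True)
```

Printing the result of `commands_for_devices(["10.1.1.1", "10.1.1.2"], "config_file.py", "public")` gives commands that can be pasted into a shell on the appformix_network_agents node.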
To correct Contrail Insights plug-ins not distributing the device data to the correct collector:
Use the Contrail Insights plug-in API to get the distribution map of any SNMP plug-ins. It is located in Plugin > Config > ObjectList. For more information, contact AppFormix-Support@juniper.net with your specific case. You can also view data from the Dashboard by selecting Settings > Plugins and then selecting a specific plug-in to view its enabled metrics.
gRPC Devices Not Reporting Data
A gRPC device might not report data for several reasons, including:
Device is not installed correctly with the openconfig/network-agent package.
Device is not configured correctly.
appformix_network_agents cannot receive data from devices.
To correct a device that does not have the openconfig or network-agent package installed correctly, or that is not configured correctly:
Log in to your device to verify that it has the correct packages and configuration.
Run
show version
on the device to check the device model and Junos version.
Run
show system software | grep na
to check that the network-agent package is correctly installed on the device.
Run
show system software | grep open
to check that the openconfig package is correctly installed on the device.
Run
show system services extension-service
to check the gRPC configuration on the device. Following is an example of the desired output:
request-response {
    grpc {
        clear-text {
            port 50051;
        }
        skip-authentication;
    }
}
notification {
    allow-clients {
        address 0.0.0.0/0;
    }
}
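Comparing the device output against the desired output can be automated. The following is a minimal Python sketch that flattens brace-structured configuration text into statement paths and reports which expected statements are absent. The list of required statements simply mirrors the example output above. Whether every deployment needs all three is an assumption, and the function names are hypothetical.

```python
import re

# Example output copied from the text above.
EXAMPLE_OUTPUT = """
request-response { grpc { clear-text { port 50051; } skip-authentication; } }
notification { allow-clients { address 0.0.0.0/0; } }
"""

# Statement prefixes taken from the example; values such as the port may differ.
REQUIRED_PREFIXES = (
    "request-response grpc clear-text port",
    "request-response grpc skip-authentication",
    "notification allow-clients address",
)


def config_paths(text):
    """Flatten brace-structured config text into a set of space-joined statement paths."""
    paths, stack, pending = set(), [], None
    for token in re.findall(r"[{};]|[^{};]+", text):
        token = " ".join(token.split())  # normalize whitespace
        if not token:
            continue
        if token == "{":
            if pending is not None:
                stack.append(pending)  # enter the block named by the pending text
            pending = None
        elif token == "}":
            if stack:
                stack.pop()  # leave the current block
            pending = None
        elif token == ";":
            if pending is not None:
                paths.add(" ".join(stack + [pending]))  # a complete statement
            pending = None
        else:
            pending = token
    return paths


def missing_statements(text, required=REQUIRED_PREFIXES):
    """Return the required statement prefixes that do not appear in the config text."""
    paths = config_paths(text)
    return [
        prefix for prefix in required
        if not any(p == prefix or p.startswith(prefix + " ") for p in paths)
    ]
```

An empty list from `missing_statements(device_output)` means every statement from the example is present. Any entries returned point to the part of the extension-service configuration to review.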
To correct appformix_network_agents not receiving data from devices:
Verify that no iptables firewall rules are blocking the connections. Run the following commands to flush the iptables rules and set the default policies to ACCEPT:
iptables -F
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
iptables -P INPUT ACCEPT
(Optional) Use the Contrail Insights plug-in API to get the distribution map of any SNMP plug-ins. It is located in Plugin > Config > ObjectList. For more information, contact AppFormix-Support@juniper.net with your specific case.
Run the gRPC test script from appformix_network_agents to check whether Contrail Insights can get gRPC data from devices. Contrail Insights supplies a test script named check_grpc_device_test.py in the /opt/appformix/manager/tailwind_manager folder. Run the following commands to debug:
cd /opt/appformix/manager/tailwind_manager
source ../venv/bin/activate
python check_grpc_device_test.py -ip {device_ip} -port {port} -sensor {sensor_path}
If you can get data from check_grpc_device_test.py, Contrail Insights is able to get data from the device.
If you cannot get data from check_grpc_device_test.py, enable the gRPC logs on the device by running the following commands:
set system services extension-service traceoptions file extension-service.log
set system services extension-service traceoptions file size 5m
set system services extension-service traceoptions file files 2
set system services extension-service traceoptions flag all
To get the gRPC logs, run the command:
show log extension-service.log
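Before digging into the gRPC logs, it can help to confirm that the device's gRPC port accepts TCP connections from the appformix_network_agents node. The following is a minimal Python sketch using only the standard library. It tests TCP reachability only and does not speak gRPC, so a successful connection does not prove that telemetry is flowing. The function names are hypothetical, and port 50051 is taken from the example configuration earlier in this section.

```python
import socket

GRPC_PORT = 50051  # clear-text port from the example configuration


def tcp_port_open(host, port=GRPC_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def unreachable_devices(device_ips, port=GRPC_PORT, timeout=3.0):
    """Return the device addresses whose gRPC port could not be reached."""
    return [ip for ip in device_ips if not tcp_port_open(ip, port, timeout)]
```

A device listed by `unreachable_devices` points toward a firewall, routing, or device configuration problem rather than a problem in the test script.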
JTI Data Not Delivered to Application Socket Due to rp_filter
In some cases, UDP packets from devices are received by the interface (based on tcpdump output) but are not delivered to the application socket. When you run socket.recvfrom in Python code, you cannot receive any data on port 42596.
To correct this issue, disable rp_filter on the eth1 interface (the interface that the device sends data to) by running the following commands:
echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter
echo 0 > /proc/sys/net/ipv4/conf/eth1/rp_filter
Now you should see data in the Contrail Insights Dashboard.
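The symptom and the fix can both be checked from Python. The following is a minimal sketch, assuming a Linux host. It reads the rp_filter settings and waits for a single UDP datagram with `socket.recvfrom`. The kernel applies the higher of the `all` and per-interface rp_filter values, which is why both files are set to 0 above. The function names are hypothetical, and binding to port 42596 fails with an `OSError` if another process already holds the port, so run the listener only while that port is free.

```python
import socket
from pathlib import Path

JTI_UDP_PORT = 42596  # port named in the text above
CONF_ROOT = "/proc/sys/net/ipv4/conf"


def read_rp_filter(interface, conf_root=CONF_ROOT):
    """Return the rp_filter value for an interface, or None if it cannot be read."""
    path = Path(conf_root) / interface / "rp_filter"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_rp_filter(interface, conf_root=CONF_ROOT):
    """Return the value the kernel applies: the maximum of 'all' and the interface."""
    values = [read_rp_filter(name, conf_root) for name in ("all", interface)]
    known = [value for value in values if value is not None]
    return max(known) if known else None


def open_udp_listener(port=JTI_UDP_PORT, bind_host="0.0.0.0"):
    """Bind and return a UDP socket on the given port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((bind_host, port))
    except OSError:
        sock.close()
        raise
    return sock


def receive_one(sock, timeout=5.0, bufsize=65535):
    """Return (payload_length, sender) for the next datagram, or None on timeout."""
    sock.settimeout(timeout)
    try:
        data, sender = sock.recvfrom(bufsize)
    except socket.timeout:
        return None
    return len(data), sender
```

If tcpdump shows packets arriving on eth1 but `receive_one` keeps returning `None` and `effective_rp_filter("eth1")` is not 0, the reverse-path filter is a likely cause.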
SNMP Traps Not Shown in Dashboard
To troubleshoot why SNMP traps are not showing in the Dashboard, perform the following steps to determine if anything is incorrect:
1. Check that port 42597 is open and listening on all appformix_controller nodes by running:
netstat -plan|grep 42597
2. Confirm that the snmp_trap_network_device plug-in is present in the cluster. Select Settings > Plugins.
3. Check that the alarm named network_device_snmp_trap is present in the cluster from the Dashboard Alarms page.
4. Verify that the SNMP trap configurations on the devices are correct. See SNMP Traps in Contrail Insights for complete configuration details.
5. Check that all Contrail Insights Platform nodes are reporting data. You can confirm this if you see data in the host charts for the Platform nodes. Select Dashboard > Hosts tab, then select the host node to view more detail.
If you identify issues with any of the above, there are a few things to try. Check if the problem is fixed after each step since all steps might not be needed:
Re-run the playbook to add the plug-in and the alarm again (Step 2 and Step 3).
Verify and update the SNMP trap configuration on the devices (Step 4).
Lastly, restart the Contrail Insights Agent on the Platform Nodes.
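The port check can also be done without netstat. The following is a minimal Python sketch, assuming a Linux host and assuming that the trap listener is a UDP socket, which the text above does not state. It parses the socket tables under /proc/net for the current network namespace, so a listener running in a different namespace (for example, inside a container with its own network) would not appear. The sample table and the function names are illustrative.

```python
SNMP_TRAP_PORT = 42597  # port named in the steps above

# Hypothetical excerpt of /proc/net/udp, for illustration only.
SAMPLE_PROC_NET_UDP = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
  101: 00000000:A665 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 23456
  102: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 23457
"""


def ports_in_proc_net(table_text):
    """Return the set of local ports listed in /proc/net/udp-style text."""
    ports = set()
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or ":" not in fields[1]:
            continue  # header row or malformed line
        try:
            # local_address is HEXADDR:HEXPORT; the port is the part after the last colon.
            ports.add(int(fields[1].rsplit(":", 1)[1], 16))
        except ValueError:
            continue
    return ports


def port_bound_locally(port=SNMP_TRAP_PORT, tables=("/proc/net/udp", "/proc/net/udp6")):
    """Return True if any readable socket table lists the port as a local port."""
    for table in tables:
        try:
            with open(table) as handle:
                if port in ports_in_proc_net(handle.read()):
                    return True
        except OSError:
            continue  # table not present on this host
    return False
```

Running `port_bound_locally()` on each appformix_controller node gives a quick yes or no answer for Step 1.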