Known Issues
This section lists the known issues in Juniper Paragon Automation.
Device Life-Cycle Management
-
When you use the Release Router option to release a device from being managed by Paragon Automation, the device might not be released as the service orchestration engine might still be referencing the device that you want to release.
Workaround: Before you use the Release Router option, update the network implementation plan and service instances so that the services no longer use the device that you are trying to release. Also, remove the device from the resource pool that the services are accessing.
-
If the device onboarding fails, the device onboarding status is displayed as Status is not available in the Devices section of the Network Implementation Plan page (Inventory > Device Onboarding > Network Implementation Plan).
Workaround: For such devices, initiate the outbound SSH connection on the router so that the onboarding workflow restarts.
-
Sometimes, the onboarding workflow might restart for a device that is already onboarded. This is harmless, as the workflow will observe that the node is already onboarded, report the observation, and exit.
Workaround: None.
-
The onboarding of a device fails if you use NETCONF EDIT or NETCONF RPC configuration formats in the configuration templates.
Workaround: NETCONF EDIT and NETCONF RPC configuration formats are not supported in Release 2.1.0. Instead, use the CLI configuration format.
-
Paragon Automation triggers the configuration templates included in device profiles and interface profiles only during the initial onboarding of the device. You cannot use the configuration templates included in the device profiles and interface profiles to apply additional configuration on a device after the device is onboarded.
Workaround: None.
-
On the Inventory page (Inventory > Devices > Network Inventory), the device status is displayed as Connected even if there is a network disconnection in outbound SSH. Ideally, the device status should be displayed as Disconnected within 10 minutes. Sometimes, there may be a delay for the changes to be reflected on the Inventory page.
Workaround: None.
-
A warning message is not displayed when you try to delete a configuration template that is used in the network implementation plan.
Workaround: None.
-
If you have enabled the Trust option in the device profile, the onboarding workflow may fail with the following error:
Onboarding workflow failed reason: task_failure, Trust score computation failed.
Workaround: Retry the onboarding process after a few minutes.
Observability
-
The Hardware accordion does not display the following information for the listed devices:
-
Chassis temperature (chassis-temperature) for MX204, MX240, MX304, MX10004, MX10008, and MX10016 devices.
-
Charts related to the fan speed (rpm-percent) for MX480, MX960, MX10004, MX10008, and MX10016 devices.
-
Power supply module temperature (psm-temperature) for MX204, MX480, and MX960 devices.
-
Line card charts for some ACX Series and MX Series devices as the flexible PIC concentrator (FPC) fields are not supported on these devices. See Table 1 for more information.
Table 1: Line Card Charts Support
Device Family | Device Series | FPC Fields Not Supported
ACX Series | ACX7100-32C, ACX7100-48L, ACX7024, ACX7024X, ACX7509, ACX7348 | fpc-temperature, fpc-cpu-utilization, fpc-buffer-memory-utilization
MX Series | MX204, MX240, MX304, MX480, MX960, MX10004, MX10008, MX10016 | fpc-temperature, fpc-cpu-utilization
-
On the Interfaces accordion, the forward error correction (FEC) corrected errors and FEC uncorrected errors charts are available only on interfaces that support speeds of 100 Gbps or higher.
-
After you apply a new configuration for a device, the Active Configuration for Device-Name page (Observability > Troubleshoot Devices > Device-Name > Configuration accordion > View active config link) does not display the latest configuration immediately. It takes several minutes for the latest changes to be reflected on the Active Configuration for Device-Name page.
Workaround: You can verify whether the new configurations are applied to the device by logging in to the device using CLI.
-
The graphs related to CRC errors display No Results Found as the CRC errors-related data is not streamed from the devices for the management ports.
Workaround: None.
-
If a device is discovered through a BGP-LS peering session even before you onboard the device, then duplicate LSPs are created when a PCEP session is established with the device. In some rare cases, the duplicate LSPs may remain.
Workaround: If you see duplicate LSPs, restart the EdgeAdapter pod.
-
For PTX10001, PTX10004, and PTX10016 devices, the Linecards graph on the Hardware details for Device-Name page (Observability > Troubleshoot Devices > Device-Name > Hardware accordion) does not display any data.
Workaround: None.
-
For PTX Series, MX Series, and ACX Series devices, the RSVP TE Global Errors graph on the RSVP Routing Details for Device-Name (Observability > Troubleshoot Devices > Device-Name > Routing and MPLS accordion) page does not display any data.
Workaround: None.
-
For PTX10001, PTX10004, and PTX10016 devices, the PSM Temperature graph on the Hardware details for Device-Name page (Observability > Troubleshoot Devices > Device-Name > Hardware accordion > PSUs) does not display any data.
Workaround: None.
-
After the primary node is switched off, the OC term and GNMI state on the Remote Management accordion (Observability > Troubleshoot Devices > Device-Name) are displayed as disconnected.
Workaround: You can do one of the following:
-
Offboard and onboard the devices, or
-
Restart the OC-term pod.
Service Orchestration
-
The "vpn_svc_type" service type is displayed as "pbb-evpn" instead of "evpn-mpls" on the Paragon Automation GUI and through the REST API.
Workaround: None.
-
The following limitations are seen when you use the service orchestration cMGD CLI to modify the placement-interface information of an L3VPN service:
-
The initial placement-interface options that were populated when the service order was created are not displayed.
-
You can select the interface for the site access from all the interfaces present on the CE or PE device.
-
When you modify the PE topology and the available ports in the topology, you must:
Delete the existing placement-interface and placement-options from the site network access by using either REST API or the service orchestration cMGD CLI.
Execute the request service order modify command to regenerate the service order with the modified values for the placement-options.
-
Sometimes, the apply insights configuration (appy_insights-config) fails if you try to provision a service without properly deleting a previously provisioned service or a device.
For example, if you release the router without off-boarding or deleting a service, then the apply insights configuration fails when the same service or device is used in another organization.
Workaround:
-
If there are stale services and devices, run the following REST APIs from the cMGD container of the foghorn namespace to delete stale services and devices, and rerun the workflow:
-
curl --request DELETE http://config-server.healthbot:9000/api/v2/config/services/device-group/<device-group-name>/
-
curl --request DELETE http://config-server.healthbot:9000/api/v2/config/device-group/<device-group-name>/
-
curl --request POST http://config-server.healthbot:9000/api/v2/config/configuration/
-
If there are stale network-groups, run the following REST APIs from the cMGD container of the foghorn namespace to delete the stale network-groups, and rerun the workflow:
-
curl --request DELETE http://config-server.healthbot:9000/api/v2/config/services/device-group/<network-group-name>/
-
curl --request DELETE http://config-server.healthbot:9000/api/v2/config/device-group/<network-group-name>/
-
curl --request POST http://config-server.healthbot:9000/api/v2/config/configuration/
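The cleanup calls in this workaround can be wrapped in a small script. The following is a minimal sketch, not a Juniper-provided tool: the helper names run and cleanup_stale_group and the APPLY variable are hypothetical, and the URLs are copied from the steps in the workaround. By default, the script only prints the curl commands so that you can review them before you run them from the cMGD container.

```shell
#!/bin/sh
# Print each command by default; run it only when APPLY=1 is set.
# (run and APPLY are illustrative helpers, not part of Paragon Automation.)
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Delete a stale group by name and then POST to the configuration endpoint,
# in the same order as the documented steps.
cleanup_stale_group() {
    group="$1"
    if [ -z "$group" ]; then
        # Refuse an empty name so that the DELETE URL is never incomplete.
        echo "usage: cleanup_stale_group <group-name>" >&2
        return 1
    fi
    base="http://config-server.healthbot:9000/api/v2/config"
    run curl --request DELETE "$base/services/device-group/$group/" || return 1
    run curl --request DELETE "$base/device-group/$group/" || return 1
    run curl --request POST "$base/configuration/"
}
```

Calling cleanup_stale_group with a group name prints the three commands. Set APPLY=1 in the environment only after you have confirmed that the group is stale.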
-
If you modify an L3VPN service, the historical data for Monitors related to the modified L3VPN service is deleted.
Workaround: None.
-
The history of service orders that are generated for a service instance is saved in the Order History Tab for auditing purposes. However, when you delete the service instance, the service order history gets deleted.
Workaround: None.
-
There is no synchronization between accessing and configuring devices. A workflow might fail if a device is configured by more than one service order.
Workaround: If two different service orders are likely to affect the same device, we recommend that you wait until the first service order is executed before you publish the next service order.
-
While configuring an EVPN service order, the GUI does not throw a validation error even if you specify a value that is equal to 1 Tbps for CBS and CIR fields.
Workaround: Based on your topology, ensure that you specify the right values for CBS and CIR fields.
-
The EVPN service order creation fails if you try to create an EVPN service order by importing an existing JavaScript Object Notation (JSON) file.
Workaround: If you are using a JSON file, ensure that you clear the placement section before you publish the service order.
-
Sometimes, the publishing of a service order fails due to existing placements.
Workaround: In such cases, you can export the failed service order to JSON format and then create a new service order or modify an existing service order by importing this JSON file. During the importing process, discard the placements and then publish the service order.
-
For some devices such as ACX7204, if you configure VLANs on unused ports, the following error occurs:
VLAN must be specified on tagged interfaces.
Workaround: This issue is caused by the default factory configuration on the port. Delete the default factory configuration on the ports that you plan to use.
-
For an MX240 device, the OSPF-related data is not populated on the Passive Assurance tab (Orchestration > Instances > Service-Order-Name Details).
Workaround: Configure OSPF on the customer edge (CE) device.
-
Although multiple VLAN IDs are available in the topology resources, the Placement section of the EVPN service order lists only one VLAN ID in the drop-down list.
Workaround: To fix this issue:
Edit the EVPN service order to add new VLAN IDs. You can add the VLAN IDs under the Tagged Interface section.
Clear the Placement section by deselecting the device name.
Save and publish the service order.
-
While modifying a service order, you cannot clear the existing placements.
Workaround: If publishing the service order fails due to existing placements, you can export the failed service order to JSON format and then create a new service order or modify an existing service order by importing this JSON file. During the importing process, delete the placements and then publish the service order.
-
While creating or modifying an EVPN service order, you cannot configure multiple VLAN IDs on the Aggregated Ethernet (AE) interface. The EVPN considers the AE port as a single resource and therefore an AE interface cannot be reused across service instances even when the VLAN IDs on the AE IFL differ.
Workaround: None.
-
When you edit a topology resource instance, the POP page may not list the latest sites or nodes.
Workaround: Refresh the Resource Instances page before you start the edit operation.
-
The publishing of an EVPN service order fails if you modify an existing EVPN service order to add a new site. This issue occurs only when you modify an existing EVPN using the GUI.
Workaround: If you want to add a new site to an existing EVPN service order:
Export the service order and save the service order in JSON format on your local system.
Edit the service order and import the service order from your local system.
Save and publish the service order.
-
All access circuits must have the same VLAN configured, failing which the service may not function as desired.
Workaround: None.
-
Scheduling provisioning of service orders is a Beta feature in Release 2.1.0. Except in fresh installations, scheduling may not work consistently.
Workaround: None.
-
The VPN Node Id field on the VPN Node page (Service Orchestration > Instances > Add L2 Circuit) may not list all the onboarded devices.
Workaround: Refresh the Instances page before you add the L2 circuit service instance.
-
Even though the EVPN service order is successfully created and the configurations are pushed to the devices, the following error message is displayed and the communication between the customer edge (CE) devices fails:
Status: Configuration Error
This issue occurs in a multi-homing scenario (multiple provider edge (PE) devices are connected to the same customer edge device) with single-active or all-active redundancy mode.
Workaround: Configure a unique LAG index for each site.
-
The VLAN drop-down list under the Placement section does not display all the available VLANs as per the topology service order. It shows only the selected VLAN.
Workaround: None.
-
While creating an EVPN site, if the value that you have specified for the Minimum number of links field is greater than the number of Member Links, then the EVPN service order fails.
Workaround: Ensure that the value you specify for the Minimum number of links field is less than or equal to the number of Member Links configured for the LAG interface.
-
While modifying a resource instance, if you update VLAN with a value higher than the current specified value then the Modify Resource Instance operation fails.
Workaround: None.
-
If you try to modify an existing resource instance that does not have a device or a site, then the Inventory section of the Modify Resource-Instance-Name page does not load.
Workaround: Ensure that you have added at least one device and a site to the resource instance.
-
The service order fails with the following error message:
Invalid XML document, namespace is missing
Workaround: On the device with the failed configuration, turn off the system services netconf rfc-compliant and system services netconf notification knobs.
-
During device onboarding, pings from the device to Microsoft Azure and Google Cloud Platform endpoints fail.
Workaround: Instead of Microsoft Azure and Google Cloud Platform, use Amazon Web Services (AWS) as the endpoint.
-
While creating an EVPN service order, if the MAC Address Limit that you have specified is out of the defined range then the service order fails.
Workaround: Specify a value that is within the defined range, and then republish the service order.
-
While creating or modifying an EVPN service order, the MAC Address Limit configuration is ignored if you specify Drop as the action to be taken when the upper limit for customer MAC addresses is exceeded.
Workaround: None.
-
When you modify a resource instance, Link Aggregation Control Protocol (LACP)-related information is not displayed.
Workaround: Use the REST API to upload the LACP-related resource. See Help > API Docs in the Paragon Automation GUI for information about Paragon Automation REST APIs.
Active Assurance
-
The REST API, api-aggregator, does not capture alerts that are related to Tests on the Connectivity accordion (Observability > Troubleshoot Devices > Device-Name).
Workaround: None.
-
On the Tests page (Observability > Active Assurance), the Test summary that is displayed on the info card is not based on the time range that you have selected. Instead, the Test summary is based on the number of Tests listed on a specific page.
Workaround: None.
-
If you modify an existing Monitor and then restart the Monitor, the events that are raised before modifying the Monitor are not cleared.
Workaround: None.
-
The status of a Test Agent is shown as offline after the device's Routing Engine switches over from the primary Routing Engine to the backup Routing Engine, or vice versa.
Workaround: Reinstall Test Agent after the Routing Engine switchover.
Administration
-
The latest audit log messages may not be displayed on the Audit Logs page.
Workaround: Restart the audits-delivery stream deployment. To restart the audits-delivery stream deployment:
Log in to a Paragon Automation cluster node.
Run the following commands:
kubectl -n streams scale --replicas=0 deployment audits-delivery
kubectl -n streams scale --replicas=1 deployment audits-delivery
Check whether the latest audit logs are listed on the Audit Logs page. If you don’t see the latest audit logs, you may have to repeat this procedure.
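The restart procedure above can be sketched as a short script. This is an illustration under stated assumptions: the helper names run and restart_audits_delivery and the APPLY variable are hypothetical, and the final rollout status wait is an optional extra that is not part of the documented steps. By default, the script only prints the commands.

```shell
#!/bin/sh
# Print each command by default; run it only when APPLY=1 is set.
# (run and APPLY are illustrative helpers, not part of Paragon Automation.)
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

restart_audits_delivery() {
    # Documented steps: scale the deployment down to zero, then back to one.
    run kubectl -n streams scale --replicas=0 deployment audits-delivery || return 1
    run kubectl -n streams scale --replicas=1 deployment audits-delivery || return 1
    # Assumption: wait for the new pod to become ready before checking the GUI.
    run kubectl -n streams rollout status deployment audits-delivery --timeout=120s
}
```

After the function returns, check the Audit Logs page as described in the last step. If the latest logs are still missing, run the function again.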
Installation
-
The backup and restore functionality has the following caveats:
-
You cannot restore backed up data from a Release 2.0.0 setup to a Release 2.1.0 setup.
-
You must restore data on the same setup from which the backup was taken, not on a fresh installation.
-
Paragon Automation backs up only application configurations such as devices, sites, service orders, and so on. Since a backup does not store the certificates and infrastructure services configurations, that information must be kept unchanged during restoration.
-
Resources allocated to the network won’t be preserved after a restore and you must ensure that you release the allocated resources during the window between taking a backup and performing a restore.
Workaround: None.
-
If the PCE Server VIP address is not configured, kube-proxy is set to a random port.
Workaround: Configure the PCE Server VIP address.
-
If a node in the cluster is not operational, the status of the vector pod from the node that is not operational is displayed as Running, even though the node status is reported as Not Ready. This is due to an existing Kubernetes issue. See https://github.com/kubernetes/kubernetes/issues/117769.
Workaround: You can do one of the following:
-
Monitor the metric, kube_daemonset_status_number_ready. When the value for this metric drops to three, you can manually check from which vector the data is missing.
-
Set a query and an alert for the kube_daemonset_status_number_ready metric in Grafana.
-
You might encounter RKE2-related issues if you change the hostname after you set up a cluster.
We recommend that you do not change the hostname after a cluster is set up.
Workaround: None.
-
When the worker node is down, there might be issues if you create an organization or onboard a device.
Workaround: Do not create an organization or onboard a device when a worker node is down. Wait until the cluster recovers, and then create the organization or onboard the device. The cluster is in a recovered state when all the pods are in either the Running or Pending state and none are in an intermediate state such as Terminating or CrashLoopBackOff.
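One way to check for the recovered state is to filter the pod list by status. The following is a minimal sketch; the function name unrecovered_pods is hypothetical, and it assumes the usual column layout of kubectl get pods -A --no-headers, where the fourth column is STATUS. It applies the definition above literally, so pods of finished jobs (status Completed) are also listed and need to be judged separately.

```shell
#!/bin/sh
# Read `kubectl get pods -A --no-headers` output on stdin and list every pod
# whose STATUS column (field 4) is neither Running nor Pending.
# The exit status is 0 when no such pod is found and 1 otherwise.
unrecovered_pods() {
    awk 'NF >= 4 && $4 != "Running" && $4 != "Pending" {
             print $1 "/" $2, $4
             bad = 1
         }
         END { exit bad + 0 }'
}
```

A typical call is kubectl get pods -A --no-headers | unrecovered_pods. Empty output with exit status 0 suggests that it is safe to create an organization or onboard a device.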
-
If there are multiple node failures, the OpenSearch database may fail to start and rejoin the cluster.
Workaround: Manually restart the OpenSearch database by running the kubectl rollout restart sts -n common opensearch-cluster-master command.
After all three OpenSearch instances are restarted, monitor the log of each OpenSearch instance to confirm that the pod logs do not contain any obvious error messages. In rare cases, you may need to restart OpenSearch multiple times.
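The restart and log check above can be sketched as follows. The helper names run and restart_opensearch and the APPLY variable are hypothetical. The script also assumes that the three OpenSearch instances are StatefulSet pods named opensearch-cluster-master-0 through opensearch-cluster-master-2 (the standard StatefulSet naming scheme), and the rollout status wait is an added convenience rather than a documented step. By default, the script only prints the commands.

```shell
#!/bin/sh
# Print each command by default; run it only when APPLY=1 is set.
# (run and APPLY are illustrative helpers, not part of Paragon Automation.)
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

restart_opensearch() {
    # Documented step: restart the OpenSearch StatefulSet.
    run kubectl rollout restart sts -n common opensearch-cluster-master || return 1
    # Assumption: wait until the restarted pods are ready.
    run kubectl rollout status sts -n common opensearch-cluster-master --timeout=600s || return 1
    # Assumption: pod ordinals 0-2 correspond to the three instances.
    for i in 0 1 2; do
        run kubectl logs -n common "opensearch-cluster-master-$i" --tail=50 || return 1
    done
}
```

Review the printed log tails for error messages. If the instances still fail to rejoin the cluster, the restart may need to be repeated.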
-
If you have powered off one of the primary nodes, you may not be able to log in to the Juniper Paragon Automation GUI.
Workaround: Restart papi-ws using the following Paragon Shell CLI command:
request paragon cluster pods reset service papi-ws namespace papi operation restart