NorthStar Controller Troubleshooting Guide
This document includes strategies for identifying whether an apparent problem stems from the NorthStar Controller or from the router, and provides troubleshooting techniques for those problems that are identified as stemming from the NorthStar Controller.
Before you begin any troubleshooting investigation, confirm that all system processes are up and running. A sample list of processes is shown below. Your actual list of processes could be different.
[root@user-PCS ~]# supervisorctl status
collector:es_publisher RUNNING pid 2557, uptime 0:02:18
collector:task_scheduler RUNNING pid 2558, uptime 0:02:18
collector:worker1 RUNNING pid 404, uptime 0:07:00
collector:worker2 RUNNING pid 406, uptime 0:07:00
collector:worker3 RUNNING pid 405, uptime 0:07:00
collector:worker4 RUNNING pid 407, uptime 0:07:00
infra:cassandra RUNNING pid 402, uptime 0:07:01
infra:ha_agent RUNNING pid 1437, uptime 0:05:44
infra:healthmonitor RUNNING pid 1806, uptime 0:04:26
infra:license_monitor RUNNING pid 399, uptime 0:07:01
infra:prunedb RUNNING pid 395, uptime 0:07:01
infra:rabbitmq RUNNING pid 397, uptime 0:07:01
infra:redis_server RUNNING pid 401, uptime 0:07:01
infra:web RUNNING pid 2556, uptime 0:02:18
infra:zookeeper RUNNING pid 396, uptime 0:07:01
listener1:listener1_00 RUNNING pid 1902, uptime 0:04:15
netconf:netconfd RUNNING pid 2555, uptime 0:02:18
northstar:mladapter RUNNING pid 2551, uptime 0:02:18
northstar:npat RUNNING pid 2552, uptime 0:02:18
northstar:pceserver RUNNING pid 1755, uptime 0:04:29
northstar:scheduler RUNNING pid 2553, uptime 0:02:18
northstar:toposerver RUNNING pid 2554, uptime 0:02:18
northstar_pcs:PCServer RUNNING pid 2549, uptime 0:02:18
northstar_pcs:PCViewer RUNNING pid 2548, uptime 0:02:18
northstar_pcs:configServer RUNNING pid 2550, uptime 0:02:18
Restart any processes that display as STOPPED instead of RUNNING.
To stop, start, or restart all processes, use the service northstar stop, service northstar start, and service northstar restart commands.
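With a long process list, scanning for anything that is not RUNNING can be scripted. The following is a minimal Python sketch, assuming that the text of supervisorctl status has already been captured into a string and that each line starts with the process name followed by its state, as in the sample above. The function name not_running and the sample text (including the STOPPED line) are illustrative assumptions, not part of NorthStar Controller.

```python
def not_running(status_text):
    """Return (process, state) pairs for every process whose state is not RUNNING."""
    problems = []
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, state = fields[0], fields[1]
        if state != "RUNNING":
            problems.append((name, state))
    return problems


# Illustrative input only; the STOPPED line is made up for this example.
sample = """\
infra:cassandra RUNNING pid 402, uptime 0:07:01
northstar:toposerver STOPPED Apr 22 04:21 PM
northstar_pcs:PCServer RUNNING pid 2549, uptime 0:02:18
"""

for name, state in not_running(sample):
    print(f"{name} is {state}")
```

Any process reported by this check is a candidate for a restart before you continue troubleshooting.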
To access system process status information from the NorthStar Controller Web UI, navigate to More Options > Administration and select System Health.
The current CPU %, memory usage, virtual memory usage, and other statistics for each system process are displayed. Figure 1 shows an example.
Only processes that are running are included in this display.

Table 1 describes each field displayed in the Process Status table.
Table 1: Descriptions of Process Status Fields
Field | Description |
---|---|
Process | The name of the NorthStar Controller process. |
PID | The Process ID number. |
User | The NorthStar Controller user permissions required to access information about this process. |
Group | The NorthStar Controller user group permissions required to access information about this process. |
CPU% | The percentage of CPU currently in use by this process. |
Memory | The percentage of memory currently in use by this process. |
Virtual Memory | The virtual memory currently in use by this process. |
CPU Time | The amount of time the CPU was used for processing instructions for the process. |
CMD | Displays the specific command options for the system process. |
The troubleshooting information is presented in the following sections:
NorthStar Controller Log Files
Throughout your troubleshooting efforts, it can be helpful to view various NorthStar Controller log files. To access log files:
- Log in to the NorthStar Controller Web UI.
- Navigate to More Options > Administration and select Logs.
A list of NorthStar system log and message files is displayed, a truncated example of which is shown in Figure 2.
Figure 2: Sample of System Log and Message Files
- Click the log file or message file that you want to view.
The log file contents are displayed in a pop-up window.
- To open the file in a separate browser window or tab, click View Raw Log in the pop-up window.
- To close the pop-up window and return to the list of log and message files, click X in the upper right corner of the pop-up window.
Table 2 lists the NorthStar Controller log files most commonly used to identify and troubleshoot issues with the PCS and PCE.
Table 2: Top NorthStar Controller Troubleshooting Log Files
Log File | Description | Location |
---|---|---|
| Log entries related to the PCEP server. The PCEP server maintains the PCEP session. The log contains information about communication between the PCC and the PCE in both directions. To configure verbose PCEP server logging:
|
|
| Log entries related to the PCS. The PCS is responsible for path computation. This log includes events received by the PCS from the Toposerver, including provisioning orders. It also contains notification of communication errors and issues that prevent the PCS from starting up properly. |
|
| Log entries related to the topology server. The topology server is responsible for maintaining the topology. These logs contain the record of the events between the PCS and the Toposerver, the Toposerver and NTAD, and the Toposerver and the PCE server. |
|
Table 3 lists additional log files that can also be helpful for troubleshooting. All of the log files in Table 3 are located under the /opt/northstar/logs directory.
Table 3: Additional Log Files for Troubleshooting NorthStar Controller
Log Files | Description |
| Log events related to the cassandra database. |
| HA coordinator log. |
| Interface to transport controller log. |
| Configuration script log. |
| Log events related to nodejs. |
| Log files related to communication between the PCC and the PCE in both directions. |
| Log files related to the PCS, which includes any event received by PCS from Toposerver and any event from Toposerver to PCS including provisioning orders. This log also contains any communication errors as well as any issues that prevent the PCS from starting up properly. |
| Log files of REST API requests. |
| Log files related to the topology server. Contains the record of the events between the PCS and topology server, the topology server and NTAD, and the topology server and the PCE server Note:
Any message forwarded to the |
To see logs related to the Junos VM, you must establish a telnet session to the router. The default IP address for the Junos VM is 172.16.16.2. The Junos VM is responsible for maintaining the necessary BGP, ISIS, or OSPF sessions.
Empty Topology
Figure 3 illustrates the flow of information from the router to the Toposerver that results in the topology display in the NorthStar Controller UI. When the topology display is empty, it is likely this flow has been interrupted. Finding out where the flow was interrupted can guide your problem resolution process.
The topology originates at the routers. For NorthStar Controller to receive the topology, there must be a BGP-LS, ISIS, or OSPF session from one of the routers in the network to the Junos VM. There must also be an established Network Topology Abstractor Daemon (NTAD) session between the Junos VM and the Toposerver.
To check these connections:
- Using the NorthStar Controller CLI, verify that the NTAD
connection between the Toposerver and the Junos VM was successfully
established as shown in this example:
[root@northstar ~]# netstat -na | grep :450
tcp 0 0 172.16.16.1:55752 172.16.16.2:450 ESTABLISHED
Note Port 450 is the port used for Junos VM to Toposerver connections.
In the following example, the NTAD connection has not been established:
[root@northstar ~]# netstat -na | grep :450
tcp 0 0 172.16.16.1:55752 172.16.16.2:450 LISTENING
- Log in to the Junos VM to confirm whether NTAD is configured
to enable topology export. The grep command below gives you the IP
address of the Junos VM.
[root@northstar ~]# grep "ntad_host" /opt/northstar/data/northstar.cfg
ntad_host=172.16.16.2
[root@northstar ~]# telnet 172.16.16.2
Trying 172.16.16.2...
Connected to 172.16.16.2.
Escape character is '^]'.
northstar_junosvm (ttyp0)
login: northstar
Password:
--- JUNOS 14.2R4.9 built 2015-08-25 21:01:39 UTC
This JunOS VM is running in non-persistent mode.
If you make any changes on this JunOS VM, Please make sure you save to the Host using net_setup.py utility, otherwise the config will be lost if this VM is restarted.
northstar@northstar_junosvm> show configuration protocols | display set
set protocols topology-export
If the topology-export statement is missing, the Junos VM cannot export data to the Toposerver.
- Use Junos OS show commands to confirm whether the BGP, ISIS, or OSPF relationship between the Junos VM and the router is ACTIVE. If the session is not ACTIVE, the topology information cannot be sent to the Junos VM.
- On the Junos VM, verify whether the lsdist.0 routing table has any entries:
northstar@northstar_junosvm> show route table lsdist.0 terse | match lsdist.0
lsdist.0: 54 destinations, 54 routes (54 active, 0 holddown, 0 hidden)
If you see only zeros in the lsdist.0 routing table, there is no topology that can be sent. Review the NorthStar Controller Getting Started Guide sections on configuring topology acquisition.
- Ensure that there is at least one link in the lsdist.0
routing table. The Toposerver can only generate an initial topology
if it receives at least one NTAD link event. A network that consists
of a single node with no IGP adjacency with other nodes (as is possible
in a lab environment, for example), will not enable the Toposerver
to generate a topology. Figure 4 illustrates the Toposerver’s logic process for creating the
initial topology.
Figure 4: Logic Process for Initial Topology Creation
If an initial topology cannot be created for this reason, the toposerver.log generates an entry similar to the following example:
Dec 9 16:03:57.788514 fe-cluster-03 TopoServer Did not send the topology because no links were found.
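Two of the checks above lend themselves to a small script: confirming that the NTAD connection on port 450 is ESTABLISHED, and reading the destination count from the lsdist.0 summary line. The following Python sketch assumes the command output has already been captured as text in the formats shown in the examples above. The function names ntad_established and lsdist_destinations are hypothetical helpers, not NorthStar Controller or Junos OS tools.

```python
import re


def ntad_established(netstat_text, port=450):
    """True if any TCP line to the given remote port is in the ESTABLISHED state."""
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local address, remote address, state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            if fields[4].endswith(f":{port}") and fields[5] == "ESTABLISHED":
                return True
    return False


def lsdist_destinations(summary_line):
    """Extract the destination count from the lsdist.0 summary line, or None."""
    match = re.search(r"lsdist\.0:\s+(\d+)\s+destinations", summary_line)
    return int(match.group(1)) if match else None


up = "tcp 0 0 172.16.16.1:55752 172.16.16.2:450 ESTABLISHED"
summary = "lsdist.0: 54 destinations, 54 routes (54 active, 0 holddown, 0 hidden)"
print(ntad_established(up), lsdist_destinations(summary))
```

A nonzero destination count does not by itself prove that the table contains a link entry, so the single-node case described above still needs to be checked by inspecting the routes or the toposerver.log.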
Incorrect Topology
One important function of the Toposerver is to correlate the unidirectional link (interface) information from the routers into bidirectional links by matching source and destination IPv4 Link_Identifiers from NTAD link events. When the topology displayed in the NorthStar UI does not appear to be correct, it can be helpful to understand how the Toposerver handles the generation and maintenance of the bidirectional links.
Generation and maintenance of bidirectional links is a complex process, but here are some key points:
- For the two nodes constituting each bidirectional link, the node whose Node ID was assigned first (and therefore has the lower Node ID number) is given the Node A designation, and the other node is given the Node Z designation.
Note The Node ID is assigned when the Toposerver first receives the Node event from NTAD.
- Whenever a Node ID is cleared and reassigned (such as during a Toposerver restart or network model reset), the Node IDs, and therefore the A and Z designations, can change.
- The Toposerver receives a Link Update message when a link in the network is added or modified.
- The Toposerver receives a Link Withdraw message when a link is removed from the network.
- The Link Update and Link Withdraw messages affect the operational status of the nodes.
- The node operational status, together with the protocol (IGP versus IGP plus MPLS), determines whether a link can be used to route LSPs. For a link to be used to route LSPs, it must have both an operational status of UP and the MPLS protocol active.
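The correlation described in this section can be pictured with a toy model. The following Python sketch is not the Toposerver implementation; it only illustrates pairing two unidirectional link records whose addresses mirror each other, assigning Node A to the lower Node ID, and applying the UP-plus-MPLS rule. The HalfLink type and all of its field names are hypothetical, and treating MPLS as active only when both sides report it is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HalfLink:
    """One unidirectional link record (hypothetical simplification of an NTAD link event)."""
    node_id: int     # Node ID of the advertising node, in order of assignment
    local_ip: str    # IPv4 address of the local interface
    remote_ip: str   # IPv4 address of the neighbor interface
    mpls: bool       # True if MPLS is active on this side


def correlate(half_links):
    """Pair unidirectional links whose local and remote addresses mirror each other."""
    by_endpoints = {(h.local_ip, h.remote_ip): h for h in half_links}
    links = []
    for h in half_links:
        reverse = by_endpoints.get((h.remote_ip, h.local_ip))
        # Emit each pair once, from the side with the lower Node ID (Node A).
        if reverse is not None and h.node_id < reverse.node_id:
            links.append({
                "node_a": h.node_id,
                "node_z": reverse.node_id,
                "a_ip": h.local_ip,
                "z_ip": reverse.local_ip,
                "mpls": h.mpls and reverse.mpls,  # assumption: both sides must run MPLS
            })
    return links


def can_route_lsps(link, status):
    """A link can carry LSPs only if it is UP and MPLS is active."""
    return status == "UP" and link["mpls"]


halves = [
    HalfLink(2, "11.105.107.2", "11.105.107.1", True),
    HalfLink(1, "11.105.107.1", "11.105.107.2", True),
    HalfLink(3, "10.0.0.1", "10.0.0.2", True),  # no matching reverse direction
]
print(correlate(halves))
```

In this model, a record with no matching reverse direction never becomes a bidirectional link, and a change in Node ID assignment order swaps the A and Z ends of a link.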
Missing LSPs
When your topology is displaying correctly, but you have missing LSPs, take a look at the flow of information from the PCC to the Toposerver that results in tunnels being added to the NorthStar Controller UI, as illustrated in Figure 5. The flow begins with the configuration at the PCC, from which an LSP Update message is passed to the PCEP server by way of a PCEP session and then to the Toposerver by way of an Advanced Message Queuing Protocol (AMQP) connection.
To check these connections:
- Look at the toposerver.log. The log prints a message every
15 seconds when it detects that its connection with the PCEP server
has been lost or was never successfully established. Note that in
the following example, the connection between the Toposerver and the
PCEP server is marked as down.
Toposerver log:
Apr 22 16:21:35.016721 user-PCS TopoServer Warning, did not receive the PCE beacon within 15 seconds, marking it as down. Last up: Fri Apr 22 16:21:05 2016
Apr 22 16:21:35.016901 user-PCS TopoServer [->PCS] PCE Down: Warning, did not receive the PCE beacon within 15 seconds, marking it as down. Last up: Fri Apr 22 16:21:05 2016
Apr 22 16:21:50.030592 user-PCS TopoServer Warning, did not receive the PCE beacon within 15 seconds, marking it as down. Last up: Fri Apr 22 16:21:05 2016
Apr 22 16:21:50.031268 user-PCS TopoServer [->PCS] PCE Down: Warning, did not receive the PCE beacon within 15 seconds, marking it as down. Last up: Fri Apr 22 16:21:05 2016
- Using the NorthStar Controller CLI, verify that the PCEP
session between the PCC and the PCEP server was successfully established
as shown in this example:
[root@northstar ~]# netstat -na | grep :4189
tcp 0 0 0.0.0.0:4189 0.0.0.0:* LISTEN
tcp 0 0 172.25.152.42:4189 172.25.155.50:59143 ESTABLISHED
tcp 0 0 172.25.152.42:4189 172.25.155.48:65083 ESTABLISHED
Note Port 4189 is the port used for PCC to PCEP server connections.
Knowing that the session has been established is useful, but it does not necessarily mean that any data was transferred.
- Verify whether the PCEP server learned about any LSPs
from the PCC.
[root@user-PCS ~]# pcep_cli
# show lsp all list
2016-04-22 17:09:39.696061(19661)[DEBUG]: pcc_lsp_table.begin:
2016-04-22 17:09:39.696101(19661)[DEBUG]: pcc-id:1033771436/172.25.158.61, state: 0
2016-04-22 17:09:39.696112(19661)[DEBUG]: START of LSP-NAME-TABLE
…
2016-04-22 17:09:39.705358(19661)[DEBUG]: Summary pcc_lsp_table:
2016-04-22 17:09:39.705366(19661)[DEBUG]: Summary LSP name tabl:
2016-04-22 17:09:39.705375(19661)[DEBUG]: client_id:1033771436/172.25.158.61, state:0,num LSPs:13
2016-04-22 17:09:39.705388(19661)[DEBUG]: client_id:1100880300/172.25.158.65, state:0,num LSPs:6
2016-04-22 17:09:39.705399(19661)[DEBUG]: client_id:1117657516/172.25.158.66, state:0,num LSPs:23
2016-04-22 17:09:39.705410(19661)[DEBUG]: client_id:1134434732/172.25.158.67, state:0,num LSPs:4
2016-04-22 17:09:39.705420(19661)[DEBUG]: Summary LSP id table:
2016-04-22 17:09:39.705429(19661)[DEBUG]: client_id:1033771436/172.25.158.61, state:0, num LSPs:13
2016-04-22 17:09:39.705440(19661)[DEBUG]: client_id:1100880300/172.25.158.65, state:0, num LSPs:6
2016-04-22 17:09:39.705451(19661)[DEBUG]: client_id:1117657516/172.25.158.66, state:0, num LSPs:23
2016-04-22 17:09:39.705461(19661)[DEBUG]: client_id:1134434732/172.25.158.67, state:0, num LSPs:4
In the far right column of the output, you see the number of LSPs that were learned. If this number is 0, no LSP information was sent to the PCEP server. In that case, check the configuration on the PCC side, as described in the NorthStar Controller Getting Started Guide.
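When many PCCs are connected, the per-PCC LSP counts can be pulled out of the summary lines with a short script. The following Python sketch assumes the output of show lsp all list has been saved as text and that the summary lines follow the client_id:.../IP, state:N, num LSPs:N pattern shown in the example above. The names lsp_counts and pccs_without_lsps are hypothetical helpers and not part of pcep_cli.

```python
import re

# Matches summary lines such as:
#   client_id:1033771436/172.25.158.61, state:0,num LSPs:13
SUMMARY = re.compile(
    r"client_id:\d+/(?P<ip>[\d.]+),\s*state:\d+,\s*num LSPs:(?P<n>\d+)"
)


def lsp_counts(output):
    """Map each PCC IP address to the number of LSPs reported in the summary."""
    counts = {}
    for match in SUMMARY.finditer(output):
        counts[match.group("ip")] = int(match.group("n"))
    return counts


def pccs_without_lsps(output):
    """Return the PCC IP addresses for which no LSPs were learned."""
    return sorted(ip for ip, n in lsp_counts(output).items() if n == 0)


# Illustrative input; the second line is altered to show a PCC with zero LSPs.
sample = (
    "2016-04-22 17:09:39.705375(19661)[DEBUG]: client_id:1033771436/172.25.158.61, state:0,num LSPs:13\n"
    "2016-04-22 17:09:39.705440(19661)[DEBUG]: client_id:1100880300/172.25.158.65, state:0, num LSPs:0\n"
)
print(pccs_without_lsps(sample))
```

Because the name table and the id table report the same PCCs, the sketch keeps one entry per IP address; any address returned by pccs_without_lsps is a PCC whose configuration should be reviewed.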
PCC That is Not PCEP-Enabled
The Toposerver associates the PCEP sessions with the nodes in the topology from the TED in order to make a node PCEP-enabled. This Toposerver function is hindered if the IP address used by the PCC to establish the PCEP session was not the one automatically learned by the Toposerver from the TED. For example, if a PCEP session is established using the management IP address, the Toposerver will not receive that IP address from the TED.
When the PCC successfully establishes a PCEP session, it sends a PCC_SYNC_COMPLETE message to the Toposerver. This message indicates to NorthStar that synchronization is complete. The following is a sample of the corresponding toposerver log entries, showing both the PCC_SYNC_COMPLETE message and the PCEP IP address that NorthStar might or might not recognize:
Dec 9 17:12:11.610225 fe-cluster-03 TopoServer NSTopo::updateNode (PCCNodeEvent) ip: 172.25.155.26 pcc_ip: 172.25.155.26 evt_type: PCC_SYNC_COMPLETE
Dec 9 17:12:11.610230 fe-cluster-03 TopoServer Adding PCEP flag to pcep_ip: 172.25.155.26 node_id: 0880.0000.0026 router_ID: 88.0.0.26 protocols: 4
Dec 9 17:12:11.610232 fe-cluster-03 TopoServer Setting live pcep_ip: 172.25.155.26 for router_ID: 88.0.0.26
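One way to spot a PCC whose address was not matched to a node is to compare the addresses that report PCC_SYNC_COMPLETE with the addresses for which an "Adding PCEP flag" entry follows. The following Python sketch rests on an assumption that the source does not state explicitly: that a missing "Adding PCEP flag" entry means the Toposerver did not associate the session with a node. The function name unassociated_pccs is hypothetical, and the second address in the sample is made up.

```python
import re

SYNC = re.compile(r"pcc_ip: (?P<pcc_ip>\S+) evt_type: PCC_SYNC_COMPLETE")
FLAG = re.compile(r"Adding PCEP flag to pcep_ip: (?P<pcep_ip>\S+)")


def unassociated_pccs(log_text):
    """PCC addresses that completed sync but have no 'Adding PCEP flag' entry.

    Assumption: the absence of that entry indicates an unrecognized address.
    """
    synced = {m.group("pcc_ip") for m in SYNC.finditer(log_text)}
    flagged = {m.group("pcep_ip") for m in FLAG.finditer(log_text)}
    return sorted(synced - flagged)


log = (
    "Dec 9 17:12:11.610225 fe-cluster-03 TopoServer NSTopo::updateNode (PCCNodeEvent) "
    "ip: 172.25.155.26 pcc_ip: 172.25.155.26 evt_type: PCC_SYNC_COMPLETE\n"
    "Dec 9 17:12:11.610230 fe-cluster-03 TopoServer Adding PCEP flag to pcep_ip: "
    "172.25.155.26 node_id: 0880.0000.0026 router_ID: 88.0.0.26 protocols: 4\n"
    # Made-up entry for a PCC that connected from a management address:
    "Dec 9 17:12:12.000000 fe-cluster-03 TopoServer NSTopo::updateNode (PCCNodeEvent) "
    "ip: 192.0.2.10 pcc_ip: 192.0.2.10 evt_type: PCC_SYNC_COMPLETE\n"
)
print(unassociated_pccs(log))
```

Treat any address this returns as a lead to verify against the device profile rather than as a confirmed fault.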
Some options for correcting the problem of an unrecognized IP address are:
- Manually input the unrecognized IP address in the device profile in the NorthStar Web UI by navigating to More Options > Administration > Device Profile.
- Ensure there is at least one LSP originating on the router, which will allow Toposerver to associate the PCEP session with the node in the TED database.
Once the IP address problem is resolved, and the Toposerver is able to successfully associate the PCEP session with the node in the topology, it adds the PCEP IP address to the node attributes as can be seen in the PCS log:
Dec 9 17:12:11.611392 fe-cluster-03 PCServer [<-TopoServer] routing_key = ns_node_update_key
Dec 9 17:12:11.611394 fe-cluster-03 PCServer [<-TopoServer] NODE UPDATE(Live): ID=0880.0000.0026 protocols=(20)ISIS2,PCEP status=UNKNOWN hostname=skynet_26 router_ID=88.0.0.26 iso=0880.0000.0026 isis_area=490001 AS=41 mgmt_ip=172.25.155.26 source=NTAD Hostname=skynet_26 pcep_ip=172.25.155.26
LSP Stuck in PENDING or PCC_PENDING State
Once nodes are correctly established as PCEP-enabled, you can start provisioning LSPs. It is possible for the LSP controller status to indicate PENDING or PCC_PENDING as seen in the Tunnels tab of the Web UI network information table (Controller Status column). This section explains how to interpret those statuses.
When an LSP is being provisioned, the PCS computes a path that satisfies all the requirements for the LSP, and then sends a provisioning order to the PCEP server. Log messages similar to the following example appear in the PCS log while this process is taking place:
Apr 25 10:06:44.798336 user-PCS PCServer [->TopoServer] push lsp configlet, action=ADD
Apr 25 10:06:44.798341 user-PCS PCServer {#012"lsps":[#012{"request-id":928380025,"name":"JTAC","from":"11.0.0.102", "to":"11.0.0.104","pcc":"172.25.158.66","bandwidth":"100000","metric":0,"local-protection":false,"type":"primary","association-group-id":0,"path-attributes":{"admin-group":{"exclude":0,"include-all":0, "include-any":0},"setup-priority":7,"reservation-priority":7,"ero":[{"ipv4-address":"11.102.105.2"},{"ipv4-address":"11.105.107.2"}, {"ipv4-address":"11.114.117.1"}]}}#012]#012}
Apr 25 10:06:44.802500 user-PCS PCServer provisioning order sent, status = SUCCESS
Apr 25 10:06:44.802519 user-PCS PCServer [->TopoServer] Save LSP action, id=928380025 event=Provisioning Order(ADD) sent request_id=928380025
Apr 25 10:06:44.802534 user-PCS PCServer lsp action=ADD JTAC@11.0.0.102 path= controller_state=PENDING
The LSP controller status is PENDING at this point, meaning that the provisioning order has been sent to the PCEP server, but an acknowledgement has not yet been received. If an LSP is stuck at PENDING, it suggests that the problem lies with the PCEP server. You can log in to the PCEP server and configure verbose log messages, which can provide additional troubleshooting information:
pcep_cli
set log-level all
There are also a variety of show commands on the PCEP server that can display useful information. Just as with Junos OS syntax, you can enter show ? to see the show command options.
If the PCEP server successfully receives the provisioning order, it performs two actions:
- It forwards the order to the PCC.
- It sends an acknowledgement back to the PCS.
The PCEP server log would show an entry similar to the following example:
2016-04-25 10:06:45.196263(27897)[EVENT]: 172.25.158.66:JTAC UPD RCVD FROM PCC, ack 928380025
2016-04-25 10:06:45.196517(27897)[EVENT]: 172.25.158.66:JTAC ADD SENT TO PCS 928380025, UP
The LSP controller status changes to PCC_PENDING, indicating that the PCEP server received the provisioning order and forwarded it on to the PCC, but the PCC has not yet responded. If an LSP is stuck at PCC_PENDING, it suggests that the problem lies with the PCC.
If the PCC receives the provisioning order successfully, it sends a response to the PCEP server, which, in turn, forwards the response to the PCS. When the PCS receives this response, it clears the LSP controller status completely, indicating that the LSP is fully provisioned and is not waiting for action from the PCEP server or PCC. The operational status (Op Status column) then becomes the indicator for the condition of the tunnel.
The PCS log would show an entry similar to the following example:
Apr 25 10:06:45.203909 user-PCS PCServer [<-TopoServer] JTAC@11.0.0.102, LSP event=(0)CREATE request_id=928380025 tunnel_id=9513 lsp_id=1 report_type=ACK
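The status progression described in this section amounts to a small decision table. The following Python sketch restates it as code for quick reference; the function name next_step and the wording of the returned hints are illustrative assumptions, and an empty string stands for a cleared controller status.

```python
def next_step(controller_status, op_status=None):
    """Suggest where to look, based on the Controller Status and Op Status columns."""
    status = (controller_status or "").strip().upper()
    if status == "PENDING":
        # Order sent to the PCEP server, acknowledgement not yet received.
        return "PCEP server: enable verbose logging in pcep_cli and check its log"
    if status == "PCC_PENDING":
        # PCEP server forwarded the order, but the PCC has not responded.
        return "PCC: check the router that should signal the LSP"
    if status == "":
        if (op_status or "").strip().upper() == "DOWN":
            # Provisioning completed, but the PCC cannot signal the LSP.
            return "Op Status: LSP is provisioned but DOWN, check signaling on the PCC"
        return "none: LSP is fully provisioned, use Op Status to monitor the tunnel"
    return "unknown: status not covered by this sketch"


for cs, op in [("PENDING", None), ("PCC_PENDING", None), ("", "DOWN"), ("", "UP")]:
    print(repr(cs), op, "->", next_step(cs, op))
```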
LSP That is Not Active
If an LSP provisioning order is successfully sent and acknowledged, and the controller status is cleared, it is still possible that the LSP is not up and running. If the operational status of the LSP is DOWN, the PCC cannot signal the LSP. This section explores some of the possible reasons for the LSP operational status to be DOWN.
Utilization is a key concept related to LSPs that are stuck in DOWN. There are two types of utilization, and they can be different from each other at any specific time:
Live utilization—This type is used by the routers in the network to signal an LSP path. This type of utilization is learned from the TED by way of NTAD. You might see PCS log entries such as those in the following example. In particular, note the reservable bandwidth (reservable_bw) entries that advertise the RSVP utilization on the link:
Apr 25 10:10:11.475686 user-PCS PCServer [<-TopoServer] LINK UPDATE: ID=L11.105.107.1_11.105.107.2 status=UP nodeA=0110.0000.0105 nodeZ=0110.0000.0107 protocols=(260)ISIS2,MPLS
Apr 25 10:10:11.475690 user-PCS PCServer [A->Z] ID=L11.105.107.1_11.105.107.2 IP address=11.105.107.1 bw=10000000000 max_rsvp_bw=10000000000 te_metric=10 color=0 reservable_bw={9599699968 8599699456 7599699456 7599699456 7599699456 7599699456 7599699456 7099599360 }
Apr 25 10:10:11.475694 user-PCS PCServer [Z->A] ID=L11.105.107.1_11.105.107.2 IP address=11.105.107.2 bw=10000000000 max_rsvp_bw=10000000000 te_metric=10 color=0 reservable_bw={10000000000 10000000000 10000000000 8999999488 7899999232 7899999232 7899999232 7899999232 }
Planned utilization—This type is used within NorthStar Controller for path computation. This utilization is learned from PCEP when the router advertises the LSP and communicates to NorthStar the LSP bandwidth and the path the LSP is to use. You might see PCS log entries such as those in the following example. In particular, note the bandwidth (bw) and record route object (RRO) entries that advertise the RSVP utilization on the link:
Apr 25 10:06:45.208021 feffendy-PCS PCServer [<-TopoServer] routing_key = ns_lsp_link_key
Apr 25 10:06:45.208034 feffendy-PCS PCServer [<-TopoServer] JTAC@11.0.0.102, LSP event=(2)UPDATE request_id=0 tunnel_id=9513 lsp_id=1 report_type=STATE_CHANGE
Apr 25 10:06:45.208039 feffendy-PCS PCServer JTAC@11.0.0.102, lsp add/update event lsp_state=ACTIVE admin_state=UP, delegated=true
Apr 25 10:06:45.208042 feffendy-PCS PCServer from=11.0.0.102 to=11.0.0.104
Apr 25 10:06:45.208046 feffendy-PCS PCServer primary path
Apr 25 10:06:45.208049 feffendy-PCS PCServer association.group_id=128 association_type=1
Apr 25 10:06:45.208052 feffendy-PCS PCServer priority=7/7 bw=100000 metric=30
Apr 25 10:06:45.208056 feffendy-PCS PCServer admin group bits exclude=0 include_any=0 include_all=0
Apr 25 10:06:45.208059 feffendy-PCS PCServer PCE initiated
Apr 25 10:06:45.208062 feffendy-PCS PCServer ERO=0110.0000.0102--11.102.105.2--11.105.107.2--11.114.117.1
Apr 25 10:06:45.208065 feffendy-PCS PCServer RRO=0110.0000.0102--11.102.105.2--11.105.107.2--11.114.117.1
Apr 25 10:06:45.208068 feffendy-PCS PCServer samepath, state changed
The two utilizations can differ enough to interfere with successful computation or signaling of the path. For example, if the planned utilization is higher than the live utilization, a path computation issue could arise in which the PCS cannot compute the path because it thinks there is no room for it. But because the planned utilization is higher than the actual live utilization, there may very well be room.
It’s also possible for the planned utilization to be lower than the live utilization. In that case, the PCS computes a path, but the PCC does not signal it because the PCC thinks there is no room for it.
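The two mismatch cases can be worked through with simple arithmetic on a single link. The following Python sketch is a deliberately simplified model, assuming one link, one priority level, and all values in the same bandwidth unit; the function name diagnose and its parameters are hypothetical and do not correspond to NorthStar Controller internals.

```python
def diagnose(lsp_bw, capacity, planned_used, live_used):
    """Classify a bandwidth request against planned and live utilization on one link."""
    fits_planned = planned_used + lsp_bw <= capacity   # view used by the PCS
    fits_live = live_used + lsp_bw <= capacity         # view used by the routers
    if fits_planned and fits_live:
        return "ok"
    if not fits_planned and fits_live:
        return "PCS cannot compute a path although the live network has room"
    if fits_planned and not fits_live:
        return "PCS computes a path but the PCC cannot signal it"
    return "no room on this link"


# Planned utilization higher than live: path computation fails unnecessarily.
print(diagnose(lsp_bw=200, capacity=1000, planned_used=900, live_used=500))
# Planned utilization lower than live: the path is computed but not signaled.
print(diagnose(lsp_bw=200, capacity=1000, planned_used=500, live_used=900))
```

In both mismatch cases, bringing the planned view back in line with the live network is the goal of the synchronization options described next.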
To view utilization in the Web UI topology map, navigate to Options in the left pane of the Topology view. If you select RSVP Live Utilization, the topology map reflects the live utilization that comes from the routers. If you select RSVP Utilization, the topology map reflects the planned utilization which is computed by the NorthStar Controller based on planned properties.
A better troubleshooting tool in the Web UI is the Network Model Audit widget in the Dashboard view. The Link RSVP Utilization line item reflects whether there are any mismatches between the live and the planned utilizations. If there are, you can try executing Sync Network Model from the Web UI by navigating to Administration > System Settings, and then clicking Advanced Settings in the upper right corner of the resulting window.
The upper right corner button toggles between General Settings and Advanced Settings.
Disappearing Changes
Two options are available in the Web UI for synchronizing the topology with the live network. These options are only available to the system administrator, and can be accessed by first navigating to Administration > System Settings, and then clicking Advanced Settings in the upper right corner of the resulting window.
The upper right corner button toggles between General Settings and Advanced Settings.
Figure 6 shows the two options that are displayed.
It is important to be aware that if you execute Reset Network Model in the Web UI, you will lose changes that you’ve made to the database. In a multi-user environment, one user might reset the network model without the knowledge of the other users. When a reset is requested, the request goes from the PCS server to the Toposerver, and the PCS log reflects:
Apr 25 10:54:50.385008 user-PCS PCServer [->TopoServer] Request topology reset
The Toposerver log then reflects that database elements are being removed:
Apr 25 10:54:50.386912 user-PCS TopoServer Truncating pcs.links...
Apr 25 10:54:50.469722 user-PCS TopoServer Truncating pcs.nodes...
Apr 25 10:54:50.517501 user-PCS TopoServer Truncating pcs.lsps...
Apr 25 10:54:50.753705 user-PCS TopoServer Truncating pcs.interfaces...
Apr 25 10:54:50.806737 user-PCS TopoServer Truncating pcs.facilities...
The Toposerver then requests a synchronization with both the Junos VM, to retrieve the topology nodes and links, and the PCEP server, to retrieve the LSPs. In this way, the Toposerver relearns the topology, but any user updates are missing. Figure 7 illustrates the flow from the topology reset request to the request for synchronization with the Junos VM and the PCEP Server.
Upon receipt of the synchronization requests, Junos VM and the PCEP server return topology updates that reflect the current live network. The PCS log shows this information being added to the database:
Apr 25 10:54:52.237882 user-PCS PCServer [<-TopoServer] Update Topology
Apr 25 10:54:52.237894 user-PCS PCServer [<-TopoServer] Update Topology Persisted Nodes (0)
Apr 25 10:54:52.238957 user-PCS PCServer [<-TopoServer] Update Topology Live Nodes (7)
Apr 25 10:54:52.242336 user-PCS PCServer [<-TopoServer] Update Topology Persisted Links (0)
Apr 25 10:54:52.242372 user-PCS PCServer [<-TopoServer] Update Topology live Links (10)
Apr 25 10:54:52.242556 user-PCS PCServer [<-TopoServer] Update Topology Persisted Facilities (1)
Apr 25 10:54:52.242674 user-PCS PCServer [<-TopoServer] Update Topology Persisted LSPs (0)
Apr 25 10:54:52.279716 user-PCS PCServer [<-TopoServer] Update Topology Live LSPs (47)
Apr 25 10:54:52.279765 user-PCS PCServer [<-TopoServer] Update Topology Finished
Figure 8 illustrates the return of topology updates from the Junos VM and the PCEP Server to the Toposerver and the PCS.
You should use the Reset Network Model when you want to start over from scratch with your topology, but if you don’t want to lose user planning data when synchronizing with the live network, execute the Sync Network Model operation instead. With this operation, the PCS still requests a topology synchronization, but the Toposerver does not delete the existing elements. Figure 9 illustrates the flow from the PCS to the Junos VM and PCEP server, and the updates coming back to the Toposerver.
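The difference between the two operations can be illustrated with a toy in-memory model. The following Python sketch is not how the Toposerver stores data; it only mirrors the behavior described above, in which a reset deletes everything before relearning the live network and a sync relearns without deleting existing elements. The class NetworkModel, its methods, and the element names are all hypothetical.

```python
class NetworkModel:
    """Toy model: element name -> attributes, tagged by where the element came from."""

    def __init__(self):
        self.elements = {}

    def add_user_change(self, name, attrs):
        """Record an element created by a user (for example, planning data)."""
        self.elements[name] = {"source": "user", **attrs}

    def _apply_live(self, live):
        for name, attrs in live.items():
            self.elements[name] = {"source": "live", **attrs}

    def reset(self, live):
        """Reset Network Model: delete everything, then relearn the live network."""
        self.elements.clear()
        self._apply_live(live)

    def sync(self, live):
        """Sync Network Model: relearn the live network, keep existing elements."""
        self._apply_live(live)


live_network = {"node-1": {"kind": "node"}, "link-1": {"kind": "link"}}

model = NetworkModel()
model.add_user_change("planned-lsp", {"kind": "lsp"})
model.sync(live_network)
print(sorted(model.elements))   # user change is still present after a sync
model.reset(live_network)
print(sorted(model.elements))   # user change is gone after a reset
```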
Investigating Client Side Issues
If you are looking for the source of a problem, and you cannot find it on the server side of the system, there is a debugging flag that can help you find it on the client side. The flag enables detailed messages on the web browser console about what has been exchanged between the server and the client. For example, you might notice that an update is not reflected in the Web UI. Using these detailed messages, you can identify possible miscommunication between the server and the client, such as the server not actually sending the update.
To enable this debug flag, modify the URL you use to launch the Web UI as follows:
https://server_address:8443/client/app.html?debug=true
If you are already in the Web UI, it is not necessary to log out; simply add ?debug=true to the URL and press Enter. The UI reloads.
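If you build the Web UI URL in a script (for example, to open it from a support tool), the flag can be appended without disturbing any query parameters that are already present. The following Python sketch uses only the standard library; the function name with_debug is a hypothetical helper, and server_address is the same placeholder used in the URL above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_debug(url):
    """Return the URL with debug=true set in its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["debug"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_debug("https://server_address:8443/client/app.html"))
```

Calling the function on a URL that already carries the flag returns the URL unchanged.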
Figure 10 shows an example of the web browser console with detailed debugging messages.
Accessing the console varies by browser. Figure 11 shows an example: accessing the console on Google Chrome.
Collecting NorthStar Controller Debug Files
If you are unable to resolve a problem with the NorthStar Controller, we recommend that you forward the debug files generated by the NorthStar Controller debugging utility to JTAC for evaluation. Currently all debug files are located in subdirectories under the u/wandl/tmp directory.
To collect debug files, log in to the NorthStar Controller CLI, and execute the command u/wandl/bin/system-diagnostic.sh filename.
The output is generated and is available from the /tmp directory in the filename.tbz2 debug file.