
Installing Data Collectors for Analytics

Overview

The Analytics functionality streams data from the network devices, via data collectors, to the NorthStar Controller where it is processed, stored, and made available for viewing in the web UI.

Note

See the NorthStar Controller User Guide for information about collecting and viewing telemetry data.

Note

Junos OS Release 15.1F6 or later is required to use Analytics. For hardware requirements for analytics nodes, see NorthStar Controller System Requirements. For supported deployment scenarios, see Platform and Software Compatibility.

If you are not using NorthStar application high availability (HA), you can install a data collector either in the same node where the NorthStar Controller application is installed (single-server deployment) or in one or more external nodes that are dedicated to log collection and storage. In both cases, the supplied install scripts take care of installing the required packages and dependencies.

In a NorthStar application HA environment, you have three options:

  • Configure an external analytics node.

  • Configure an external analytics cluster. An analytics cluster provides backup nodes in the event of an analytics node failure. This cluster can be local (within the same data center) or geo-diverse (analytics geo-HA).

  • Install data collectors in the same nodes that make up the NorthStar cluster. In this scenario, the NorthStar application cluster nodes are also analytics cluster nodes.

The analytics processes read their configuration options from the /opt/northstar/data/northstar.cfg file. In a single-server deployment, no special changes are required because the parameters needed to start up the collector are part of the default configuration. For your reference, Table 1 lists some of the settings that the analytics processes read from the file.

Table 1: Some of the Settings Read by Collector Processes

Setting

Description

mq_host

Points to the IP address, or the virtual IP (VIP) in multiple-NorthStar-node deployments, of the host running the messaging bus service (the NorthStar application node). Defaults to localhost if not present.

mq_username

Username used to connect to the messaging bus. Defaults to northstar.

mq_password_enc

Password used to connect to the messaging bus. There is no default; the service fails to start if this is not configured. On single-server deployments, the password is set during the normal application install process.

mq_port

TCP port number used by the messaging bus. Defaults to 5672.

es_port

TCP port used by Elasticsearch. Defaults to 9200.

es_cluster_name

Used by Elasticsearch in HA scenarios to form a cluster. Nodes in the same cluster must be configured with the same cluster name. Defaults to NorthStar.
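For illustration, these settings might appear in /opt/northstar/data/northstar.cfg as simple key=value lines. This is a sketch only: the values below are examples, and mq_password_enc holds an encrypted string written by the installer, shown here as a placeholder.

```
mq_host=192.168.10.100
mq_username=northstar
mq_password_enc=<encrypted-password>
mq_port=5672
es_port=9200
es_cluster_name=NorthStar
```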

Two additional settings are relevant to the collector processes, but are not part of northstar.cfg. These parameters set analytics port numbers and can be changed using the NorthStar CLI:

Note

If you make port number changes, you must restart logstash using supervisorctl restart analytics:logstash for those changes to take effect.

  • rpm-statistics-port

    Port used to read the syslog messages generated by the devices, containing the RPM statistics results. The default is 1514. To modify the port, use the NorthStar CLI command set northstar analytics log-collector rpm-statistics-port port-number.

  • jti-port

    UDP port number to which the collector listens for telemetry packets from the devices. The default is 3000. To modify the port, use the NorthStar CLI command set northstar analytics log-collector jti-port port-number.
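As an illustration, changing both ports and applying the change might look like the following. The port values 2514 and 4000 are hypothetical examples; the commands themselves are the ones given above.

```
set northstar analytics log-collector rpm-statistics-port 2514
set northstar analytics log-collector jti-port 4000
```

After committing the change, restart logstash from the Linux shell with supervisorctl restart analytics:logstash so the new ports take effect.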

Note

If you are upgrading NorthStar from a release earlier than NorthStar 4.3.0, and you are using NorthStar analytics, you must upgrade NorthStar manually using the procedure described in Upgrading from Pre-4.3 NorthStar with Analytics.

Note

If you are upgrading NorthStar from a release earlier than NorthStar 6.0.0, you must redeploy the analytics settings after you upgrade the NorthStar application nodes. This is done from the Analytics Data Collector Configuration Settings menu described later in this topic. This is to ensure that netflowd can communicate with cMGD (necessary for the NorthStar CLI).

Analytics Geo-HA

NorthStar Controller supports analytics geo-HA as of release 5.1.0. While original analytics HA was designed for local clusters (same data center), geo-HA makes all data available on all nodes, to better serve networks where the nodes are geographically remote from one another. To achieve this, a local RabbitMQ (messaging bus) is installed on each analytics (ElasticSearch) node. This improves the tolerance for latency and helps compensate for the tendency of remote nodes to become out of sync.

The remote ElasticSearch nodes use JTI logstash to retrieve and process the data from the other ElasticSearch nodes. The replication pipeline creates a named queue on each remote server. The queues are persistent so that if any ElasticSearch node goes down, it can recover by resuming the processing of data pushed onto the remote queue. Figure 1 shows the interactions within a node and between nodes.

Figure 1: Analytics Geo-HA Interactions

The Analytics Collector Configuration Settings menu within the net_setup.py script has an option to prepare and deploy Geo-HA.

Single-Server Deployment–No NorthStar HA

To install the data collector together with the NorthStar application in a single-server deployment (without NorthStar HA), use the following procedure:

  1. On the NorthStar application node, install the NorthStar Controller bundle, using the install.sh script. See Installing the NorthStar Controller.
  2. On the same node, run the install-analytics.sh script.
    [root@ns ~]# cd /opt/northstar/northstar_bundle_x.x.x/
    [root@ns northstar_bundle_x.x.x]# ./install-analytics.sh
  3. Verify that the three analytics processes are installed and running by executing supervisorctl status on the PC Server:
    [root@ns ~]# supervisorctl status
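The check in step 3 can also be scripted. The following is a minimal sketch: it assumes the analytics processes appear in supervisorctl output under the analytics: group prefix (the prefix used elsewhere in this topic, as in analytics:logstash); verify the exact process names on your system.

```shell
# Succeed only if every "analytics:" process line reports RUNNING.
# Reads supervisorctl output on stdin.
analytics_ok() {
  ! grep '^analytics:' | grep -v RUNNING | grep -q .
}
# Usage:
#   supervisorctl status | analytics_ok && echo "all analytics processes running"
```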

External Analytics Node(s)–No NorthStar HA

Figure 2 shows a sample configuration with a single NorthStar application node and three analytics nodes comprising an analytics cluster. All the nodes connect to the same Ethernet network, through the eth1 interface. Optionally, you could have a single analytics node rather than creating an analytics cluster. The instructions in this section cover both a single external analytics node and an external analytics cluster.

Figure 2: Analytics Cluster Deployment (No NorthStar HA)

To install one or a cluster of external analytics nodes, use the following procedure:

  1. On the NorthStar application node, install the NorthStar Controller application, using the install.sh script. See Installing the NorthStar Controller.
  2. On each analytics node, install northstar_bundle.rpm, but do not run the install.sh script. Instead, run the install-analytics.sh script. The script installs all required dependencies, such as NorthStar-JDK, NorthStar-Python, and so on. For NorthStarAnalytics1, it would look like this:
    [root@NorthStarAnalytics1]# rpm -Uvh <rpm-filename>
    [root@NorthStarAnalytics1]# cd /opt/northstar/northstar_bundle_x.x.x/
    [root@NorthStarAnalytics1 northstar_bundle_x.x.x]# ./install-analytics.sh
  3. The next configuration steps require you to run the net_setup.py script to configure the NorthStar node and the analytics node(s) so they can connect to each other. You can run the net_setup.py script on either the NorthStar application node or one of the analytics nodes to configure all the nodes. Before you run it, we recommend that you copy the public SSH key of the node where the script is to be executed to all of the other nodes. This step is not required, but it saves typing the passwords of all the systems later, when the script deploys the configurations or tests connectivity to the different nodes.
    [root@NorthStarAnalytics1 network-scripts]# ssh-copy-id root@192.168.10.200

    Try logging in to the machine using ssh root@192.168.10.200, and check .ssh/authorized_keys to verify that the key was added.

    Repeat this process for all nodes (192.168.10.100, 192.168.10.200, 192.168.10.201, and 192.168.10.202 in our example).
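    Repeating ssh-copy-id for each node can be scripted. The following sketch generates the commands for the example addresses used in this topic; adjust the node list for your deployment.

```shell
# Generate one ssh-copy-id command per node in the example topology.
# Run on the node where net_setup.py will be executed.
nodes="192.168.10.100 192.168.10.200 192.168.10.201 192.168.10.202"
copy_keys_cmds() {
  for host in $nodes; do
    echo "ssh-copy-id root@${host}"
  done
}
# Usage (actually runs the commands):
#   copy_keys_cmds | sh
```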

  4. Run net_setup.py on the NorthStar application node or on one of the analytics nodes. The Main Menu is displayed:
  5. Select G Analytics Data Collector Setting. The Data Collector Configuration Settings menu is displayed.
  6. Select options from the Data Collector Configuration Settings menu to make the following configuration changes:
    • Select 3 to modify the NorthStar application node settings, and configure the NorthStar server name and IP address. For example:

      Please select a number to modify.
      [CR=return to main menu]:
      3
    • Select 4 to modify the analytics node IP address. For example:

      Please select a number to modify.
      [CR=return to main menu]:
      4
    • Select 2 to add additional analytics nodes as needed. In our analytics cluster example, two additional analytics nodes would be added:

      Please select a number to modify.
      [CR=return to main menu]:
      2
      Please select a number to modify.
      [CR=return to main menu]:
      2
    • Select 8A to configure a VIP address for the cluster of analytics nodes. This is required if you have an analytics cluster. If you have a single external analytics node only (not a cluster), you can skip this step.

      Please select a number to modify.
      [CR=return to main menu]:
      8A

      This VIP serves two purposes:

      • It allows the NorthStar server to send queries to a single endpoint. The VIP will be active on one of the analytics nodes, and will switch over in the event of a failure (a full node failure or failure of any of the processes running on the analytics node).

      • Devices can send telemetry data to the VIP, ensuring that if an analytics node fails, the telemetry data can still be processed by whichever non-failing node takes ownership of the VIP.

    The configuration for our analytics cluster example should now look like this:

  7. Select 9 to test connectivity between nodes. This is applicable whenever you have external analytics nodes, whether just one or a cluster of them. For example:
    Please select a number to modify.
    [CR=return to main menu]:
    9
  8. Select A (for a single analytics node), B (for an analytics cluster), or C (for analytics geo-HA) to configure the node(s) for deployment.

    Note

    This option restarts the web process in the NorthStar application node.

    For our example, select B:

    Please select a number to modify.
    [CR=return to main menu]:
    B
    YES

    This completes the installation, and telemetry data can now be sent to the analytics nodes via the analytics VIP.

    Note

    If you opt to send telemetry data to an individual node instead of using the VIP of the analytics cluster, and that node goes down, the streams to the node are lost. If you opt to install only one analytics node instead of an analytics cluster that uses a VIP, you run the same risk.

External Analytics Node(s)–With NorthStar HA

Figure 3 shows a sample configuration with a NorthStar HA cluster of three nodes and three analytics nodes comprising an analytics cluster, for a total of six nodes. All the nodes connect to the same Ethernet network, through the eth1 interface. In a NorthStar HA environment, you could also opt to have a single analytics node, for a total of four nodes, but analytics collection would not be protected in the event of analytics node failure.

Figure 3: Analytics Cluster Deployment (With NorthStar HA)

For this scenario, you first configure the NorthStar application HA cluster according to the instructions in Configuring a NorthStar Cluster for High Availability.

Once the NorthStar HA cluster is configured, set up the external analytics cluster. The setup steps for the external analytics cluster are exactly the same as in the previous section, External Analytics Node(s)–No NorthStar HA. Once you complete them, the configuration should look like this:

Test connectivity between nodes by selecting 9 from the menu.

Configure the nodes for deployment by selecting B for HA analytics or C for Geo-HA analytics. This restarts the web process in the NorthStar application node.

Verifying Data Collection When You Have External Analytics Nodes

Verify that data collection is working by checking that all services are running. Only the relevant processes are shown below.

[root@NorthStarAnalytics1 ~]# supervisorctl status

The analytics node(s) should start processing all records from the network, and pushing statistics to the NorthStar node through RabbitMQ. Check the pcs.log in the NorthStar node to see the statistics being pushed to the PC server. For example:

You can also use the REST APIs to get some aggregated statistics. This tests the path from client to nodejs to Elasticsearch.

Replacing a Failed Node in an External Analytics Cluster

On the Data Collector Configuration Settings menu, options D and E can be used when physically replacing a failed node. They allow you to replace a node without having to redeploy the entire cluster.

Caution

While a node is being replaced in a three-node cluster, HA for analytics data is not guaranteed.

  1. Replace the physical node in the network and install northstar_bundle.rpm on the replacement node. In our example, the replacement node is NorthStarAnalytics3.
  2. Run the install-analytics.sh script to install all required dependencies, such as NorthStar-JDK, NorthStar-Python, and so on. For NorthStarAnalytics3, it would look like this:
    [root@NorthStarAnalytics3]# rpm -Uvh <rpm-filename>
    [root@NorthStarAnalytics3]# cd /opt/northstar/northstar_bundle_x.x.x/
    [root@NorthStarAnalytics3 northstar_bundle_x.x.x]# ./install-analytics.sh
  3. Set up SSH keys between the replacement node and the other nodes. Select an anchor node, which can be a NorthStar application node or one of the analytics cluster nodes (other than the replacement node). Copy the public SSH key from the anchor node to the replacement node, from the replacement node to the other nodes (NorthStar application nodes and analytics cluster nodes), and from those other nodes back to the replacement node.

    For example:

    [root@NorthStarAnalytics1 network-scripts]# ssh-copy-id root@192.168.10.202

    Try logging in to the machine using ssh root@192.168.10.202, and check .ssh/authorized_keys to verify that the key was added.

  4. Run net_setup.py on the node you selected. The Main Menu is displayed:
  5. Select G Data Collector Setting. The Data Collector Configuration Settings menu is displayed.
  6. Select option 9 to test connectivity to all NorthStar application nodes and analytics cluster nodes.
  7. Select option D to copy the analytics settings to the other nodes.
  8. Select option E to add the replacement node to the cluster. Specify the node ID of the replacement node.
  9. On any analytics cluster node, use the following command to check Elasticsearch cluster status. Verify that the status is “green” and the number of nodes is correct.
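One common way to check cluster status, assuming the default es_port of 9200 from northstar.cfg, is the standard Elasticsearch _cluster/health REST endpoint. The helper names below are illustrative:

```shell
# Query the local Elasticsearch instance for cluster health
# (es_port defaults to 9200 per northstar.cfg; pass a host to override).
es_health() {
  curl -s "http://${1:-localhost}:9200/_cluster/health?pretty"
}
# Extract the "status" value (green/yellow/red) from the JSON reply.
es_status() {
  sed -n 's/.*"status" *: *"\([a-z]*\)".*/\1/p'
}
# Usage: es_health | es_status    (expect "green" on a healthy cluster)
```

The same check applies anywhere this topic asks you to verify that the Elasticsearch cluster status is green and the node count is correct.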

Collectors Installed on the NorthStar HA Cluster Nodes

In a NorthStar HA environment, you can achieve failover protection simultaneously for the NorthStar application and for analytics by setting up each node in the NorthStar cluster to also serve as an analytics node. Because nothing is external to the NorthStar cluster, your total number of nodes is the number in the NorthStar cluster (minimum of three). Figure 4 shows this installation scenario.

Figure 4: NorthStar HA Cluster Nodes with Analytics

To set up this scenario, you first install both the NorthStar application and analytics on each of the standalone nodes, configure the nodes to be an HA cluster, and finally, configure the nodes to be an analytics cluster. Follow these steps:

  1. On each NorthStar application node, install the NorthStar Controller application, using the install.sh script. See Installing the NorthStar Controller.
  2. On each node, install northstar_bundle.rpm, and run the install-analytics.sh script. The script installs all required dependencies, such as NorthStar-JDK, NorthStar-Python, and so on. For node ns03 in the example, it would look like this:
    [root@ns03]# rpm -Uvh <rpm-filename>
    [root@ns03]# cd /opt/northstar/northstar_bundle_x.x.x/
    [root@ns03 northstar_bundle_x.x.x]# ./install-analytics.sh
  3. Use the following command on each node to ensure that the three analytics processes are installed and running:
  4. Follow the instructions in Configuring a NorthStar Cluster for High Availability to configure the nodes for NorthStar HA. This involves running the net_setup.py utility, selecting E to access the HA Setup menu, and completing the HA setup steps using that menu.
  5. From the HA Setup menu, press Enter to return to the main net_setup.py menu. The Main Menu is displayed:
  6. Select I to proceed. This menu option applies the settings you have already configured for your NorthStar HA cluster, so you do not need to make any changes.
    Note

    Depending on the geographical location of the nodes, you might want to use analytics geo-HA instead of setting up internal analytics. In that case, instead of selecting I, you would select G to access the Analytics Data Collector Configuration Settings. After updating those settings, select C (Prepare and Deploy GEO-HA Analytics Data Collector Setting). Step 7 below would not apply.

  7. Select 1 to set up the NorthStar HA cluster for analytics.
  8. On any analytics node, use the following command to check Elasticsearch cluster status. Verify that the status is “green” and the number of nodes is correct.

Troubleshooting Logs

The following logs are available to help with troubleshooting:

  • /opt/northstar/logs/elasticsearch.msg

  • /opt/northstar/logs/logstash.msg

  • /opt/northstar/logs/logstash.log

See Logs in the NorthStar Controller User Guide for more information.