Ceph is a unified, distributed storage system that provides object storage and block storage. AppFormix monitors Ceph performance, availability, and usage, with both charts and alarms.
In addition, AppFormix Agent can be installed on the Ceph object storage daemon (OSD) and monitor hosts, for real-time health and performance monitoring of the storage hosts that power a Ceph storage cluster.
Ceph Service Monitoring
From the context menu, select Services > Ceph. The Ceph service monitoring page displays a summary of the current usage of a Ceph cluster, including total cluster capacity, used capacity, and the number of OSDs, pools, and objects. The Health Status table displays errors and warnings for your Ceph cluster. Details about the usage of each storage pool are shown in table and chart views.
Figure 1 shows the Ceph service monitoring page and storage pool usage details in a table.
Figure 2 shows the Ceph service monitoring page and storage pool usage details in a chart.
Monitoring Ceph OSD and Monitor Nodes
With AppFormix Agent installed on the Ceph storage hosts, details are available about each OSD and Monitor node in the cluster. Using the context menu, select Services > Ceph > Nodes. Each host in the list has a tag of ceph-osd or ceph-monitor. When a host with a ceph-osd tag is selected, a summary of host performance metrics is shown, as well as the health and status of each OSD on the host. See Figure 3 for an example summary.
Alarms can be configured to monitor the Ceph cluster metrics at the cluster, pool, or host level.
To configure an alarm for cluster-wide and per-pool metrics, select Alarms in the left menu. Choose the Service Alarms module, and select ceph from the Service drop-down list. Ceph service alarms can be created to monitor a cluster or a pool. With cluster scope, an alarm can be configured for cluster-wide metrics, such as the cluster storage usage. With pool scope, an alarm can be configured to monitor per-pool metrics for one or multiple pools.
To configure an alarm for a Ceph storage host, select the Alarms module in the Alarms pane. An alarm can be configured for one or multiple Ceph storage hosts. See Configuring Alarms in Alarms for details.
See Service Monitoring for steps to configure AppFormix using Ansible to monitor a Ceph cluster.
Contrail Networking is a software-defined networking (SDN) platform based on the open-source network virtualization project, OpenContrail. The Juniper platform automates and orchestrates the creation of highly scalable virtual networks.
AppFormix provides monitoring and orchestration for the OpenContrail Service. See the Service Monitoring instructions for how to configure Contrail monitoring.
The AppFormix service monitoring Dashboard for a Contrail cluster displays the overall state of the cluster and its components.
AppFormix provides real-time liveness for each Contrail service in the four Contrail service groups (Analytics Nodes, Config Nodes, Controller Nodes, and DB Nodes) running on all the hosts configured during the Contrail installation. Figure 6 shows real-time liveness for each Contrail service.
AppFormix also provides a historical liveness view of each Contrail service. Figure 7 shows a historical liveness view.
In addition, any alarm generated by the Contrail Service can also be accessed on the AppFormix Dashboard. Figure 8 shows examples of Contrail service alarms.
AppFormix monitors the real-time status of every element of the Contrail cluster. You can select an element from the Group drop-down list for the Contrail Service. For example, if the Analytics Nodes service group is selected, the Dashboard displays each service on every host that is configured for that particular service group. Liveness statistics and basic metrics are also available for each service in this view. Figure 9 shows statistics and metrics for the Contrail analytics nodes.
For Contrail Config Nodes, AppFormix enables a Peer view for XMPP and BGP peers. The view provides receive (rx) and transmit (tx) reachability statistics, as shown in Figure 10.
An alarm can be configured for any of the Contrail metrics collected. In the Alarm pane, select the Alarms module. Then select Contrail from the Scope drop-down list. Notifications can also be configured for Contrail alarms. Figure 11 shows the Alarm pane for configuring Contrail alarms. See Alarms and Notifications.
Entity Type and Entity Names are mandatory fields.
Health/Risk for Contrail BGP Peers and XMPP Peers
In addition to Health/Risk on Contrail services (preconfigured by AppFormix), you can set Health or Risk rules for two additional modules with the following steps:
- Select Settings in the upper left corner. Then select SLA Settings > Health or Risk > Contrail tab.
- Click Delete Profile.
- Select Add New Rule.
- In the Add New Rule pane, for Entity Type select BGP Peers or XMPP Peers.
- Select both of the rules, then select Create Profile.
- Select Infrastructure > Service > Contrail. Then select Config Nodes > XMPP Peer.
Flow Monitoring with Contrail vRouter
When the Contrail vRouter is installed on a compute node, AppFormix provides debug mode functionality in the Network Topology panel. In this mode, the top flows on each compute node are available for visualization with details on flow tuples, packets, and bytes. Figure 15 shows the flow monitoring details and visualization.
In debug mode, you can analyze details on the top-n flows on any compute node that is part of the Network Topology view. Figure 16 shows the Contrail flow details.
Contrail service monitoring is supported by the following AppFormix adapters:
Network Device Adapter
The Network Device Adapter can be used to monitor the Contrail service only when the Contrail Analytics endpoints are not authenticated.
If more than one adapter is deployed, an internal precedence determines which adapter monitors Contrail. The precedence ranking is as follows: OpenStack, Kubernetes, Network Device Adapter.
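The precedence rule can be pictured as a first-match lookup over an ordered list. The following is only a sketch of that idea, not AppFormix's internal logic; the adapter identifier strings and the function name are hypothetical.

```python
# Precedence order as stated in the text. The identifier strings are
# illustrative only and are not AppFormix configuration values.
ADAPTER_PRECEDENCE = ["openstack", "kubernetes", "network_device"]


def select_contrail_adapter(deployed):
    """Return the highest-precedence deployed adapter, or None if none match."""
    deployed_set = set(deployed)
    for name in ADAPTER_PRECEDENCE:
        if name in deployed_set:
            return name
    return None


# With Kubernetes and Network Device adapters deployed, Kubernetes wins.
print(select_contrail_adapter(["network_device", "kubernetes"]))
```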
In order for AppFormix to monitor Contrail metrics, the AppFormix Platform host must be able to open connections to the Analytics API and Config API. For example, ports 8081 and 8082 on the Contrail controller.
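Before configuring the cluster, it can help to confirm reachability from the AppFormix Platform host. The sketch below attempts a plain TCP connection to each port. It assumes the Analytics API listens on 8081 and the Config API on 8082, as in the example URLs in this topic; the host name is a placeholder and the function names are hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_contrail_ports(host, ports=(8081, 8082)):
    """Map each port to whether it is reachable. Default ports are assumptions."""
    return {port: can_connect(host, port) for port in ports}


if __name__ == "__main__":
    # Placeholder host name; replace with the Contrail controller address.
    for port, ok in check_contrail_ports("contrail.example.com").items():
        print(port, "reachable" if ok else "unreachable")
```

A successful TCP connection only shows that the port is open; it does not confirm that the API behind it is healthy.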
Contrail cluster connection details can be configured in the AppFormix Dashboard or in the Ansible playbooks.
To configure Contrail cluster connection details from the Dashboard:
- Select Settings > Service Settings. Then select the Contrail tab, as shown in Figure 17.
- Click Add Cluster. Enter the cluster name, analytics URL, and configuration URL. The URLs should specify only the protocol, address, and optionally port. For example, use http://contrail.example.com:8081 for the analytics URL and http://contrail.example.com:8082 for the configuration URL.
- Click Setup. On success, a Submission Successful message appears in the Dashboard.
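The URL rule in the Add Cluster step (protocol, address, and optional port only) can be checked before the values are entered. This is a minimal sketch using Python's standard library, not the validation that the Dashboard itself performs. It assumes that only http and https are acceptable protocols and that a trailing path, even a bare slash, should be rejected; the function name is hypothetical.

```python
from urllib.parse import urlsplit


def is_base_url(url):
    """True if url has only a scheme, a host, and an optional port."""
    try:
        parts = urlsplit(url)
        parts.port  # accessing .port raises ValueError for an invalid port
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):  # assumption: HTTP(S) only
        return False
    if not parts.hostname:
        return False
    if parts.username or parts.password:
        return False
    # Reject any path, query string, or fragment.
    return parts.path == "" and parts.query == "" and parts.fragment == ""


for candidate in (
    "http://contrail.example.com:8081",
    "http://contrail.example.com:8082/",
    "contrail.example.com:8081",
):
    print(candidate, is_base_url(candidate))
```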
For configuration using Ansible playbooks, see Service Monitoring for steps to configure AppFormix to monitor a Contrail cluster.
Configure Dynamic Alarms Data Purge Rate
To configure the dynamic alarms data purge rate, select Settings in the upper right corner, then select AppFormix Settings > Storage. Make sure there is a nonzero value for Dynamic Alarm Training Data and Service Availability Data.
A MySQL database is integral to the operation of OpenStack infrastructure services. Metrics for MySQL performance are available in real-time charts and alarms. Multiple MySQL clusters can be configured for monitoring.
The availability of MySQL nodes for each of the configured MySQL clusters is recorded periodically. You can view both the current status and the historical status over a specified period of time by selecting All Services > MySQL from the context menu at the top, and then selecting Dashboard from the left pane. Figure 19 shows the historical resource availability for the MySQL nodes.
Figure 20 shows the real-time resource availability for the MySQL nodes.
Each MySQL cluster has a dashboard displaying real-time usage metrics for each of its nodes, as shown in Figure 21.
From the context menu, select All Services > MySQL. Click the Charts icon from the left navigation pane. Figure 22 shows MySQL performance metric charts.
An alarm can be configured for any of the MySQL metrics collected. In the Alarm pane, select the Service Alarms module. Then select mysql from the Service drop-down list. MySQL alarms can be created for one or more MySQL nodes. Notifications can also be configured for MySQL alarms. Figure 23 shows the Alarm Input pane for MySQL alarm configuration.
For AppFormix to monitor MySQL metrics, a MySQL user with remote, read-only permission must exist. In this topic, we create a new user with read-only access to the database. Alternatively, an existing user account can be used.
To configure MySQL monitoring:
- Create a read-only user account 'appformix' that can access the MySQL database from any host:
$ mysql -u root -p
mysql> grant SELECT on *.* to 'appformix'@'%' identified by 'mypassword';
mysql> flush privileges;
Change 'mypassword' to a strong password. Optionally, you may restrict the 'appformix' account to only connect from a specific IP address or hostname by replacing '%' with the host on which AppFormix Controller runs.
- Next, configure the MySQL connection details in AppFormix. From the Settings menu, select Service Settings. Then, select the MySQL tab.
- Enter the host and port on which MySQL runs. The default port for MySQL is 3306.
- Enter the username and password from Step 1. Finally, click the Setup button. On success, the button changes to Submitted. Figure 24 shows MySQL connection and credential settings.
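When the account is restricted to a single host, the statements in Step 1 change only in the host part of the account name. The sketch below builds those statements as strings. It emits a separate CREATE USER statement because MySQL 8.0 no longer accepts IDENTIFIED BY inside GRANT. The function name, the sample address 10.0.0.5, and the character restrictions are assumptions made for illustration, and the sketch does not connect to a database.

```python
import re

# Conservative allow-list for user and host values in this sketch.
_SAFE_NAME = re.compile(r"^[A-Za-z0-9_.%:-]+$")


def read_only_user_statements(user, host, password):
    """Build CREATE USER and GRANT SELECT statements for a read-only account."""
    for value in (user, host):
        if not _SAFE_NAME.match(value):
            raise ValueError(f"unsupported characters in {value!r}")
    if "'" in password or "\\" in password:
        # Kept simple on purpose: this sketch does not attempt SQL escaping.
        raise ValueError("password must not contain quotes or backslashes")
    account = f"'{user}'@'{host}'"
    return [
        f"CREATE USER {account} IDENTIFIED BY '{password}';",
        f"GRANT SELECT ON *.* TO {account};",
    ]


# 10.0.0.5 stands in for the host on which AppFormix Controller runs.
for statement in read_only_user_statements("appformix", "10.0.0.5", "mypassword"):
    print(statement)
```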
OpenStack Services Monitoring
AppFormix monitors Keystone, Nova, and Neutron services that power the OpenStack cloud management system. AppFormix performs status checks for processes that implement the services on both controller and compute hosts.
AppFormix monitors the overall connectivity to each API and the status of components that comprise the service.
Overall connectivity is monitored by issuing an API call to get the component service list in the case of Nova and Keystone, or the agent list in the case of Neutron. The status of this check is reflected in default_openstack_cluster_status for each of Keystone, Nova, and Neutron. If the API call is successful, the Latency of the API is recorded. An alarm can be configured for the API latency metric.
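The connectivity check can be thought of as a timed call whose latency is kept only on success. The following sketch illustrates that pattern with a generic callable; it is not AppFormix code, and the function name and return shape are hypothetical.

```python
import time


def timed_call(api_call):
    """Run api_call(); return (ok, latency_seconds, result_or_exception).

    Latency is recorded only when the call succeeds, mirroring the text.
    """
    start = time.perf_counter()
    try:
        result = api_call()
    except Exception as exc:
        return False, None, exc
    return True, time.perf_counter() - start, result


# Stand-in for a real "list services" API request.
ok, latency, services = timed_call(lambda: ["nova-api", "nova-scheduler"])
print(ok, services)
```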
Each of the above API calls returns a list of sub-services. AppFormix examines the statuses of these individual sub-services. AppFormix displays the health of each sub-service in the list.
For example, if the nova-api sub-service is up and responds to the API call successfully, then the Health of the default_openstack_cluster_status for Nova will be Good, even if an individual sub-service of Nova has failed. As an alternative example, consider the case where nova-scheduler is not running. If the API call to list the status of Nova sub-services succeeds, then the default_openstack_cluster_status will be Good, but the Health of nova-scheduler will not be Good.
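The distinction between the overall status and per-sub-service health can be sketched as two independent evaluations. The code below is an illustration only: the function name and the dictionary layout are hypothetical, and the label "Bad" for the non-Good state is an assumption, since the text names only Good.

```python
def evaluate_service(api_call_ok, sub_services):
    """Evaluate overall status and per-sub-service health separately.

    sub_services maps a sub-service name to True if it reports as up.
    "Bad" is an assumed label for any state that is not Good.
    """
    return {
        # Depends only on whether the API call itself succeeded.
        "default_openstack_cluster_status": "Good" if api_call_ok else "Bad",
        # Each sub-service is judged on its own reported state.
        "sub_services": {
            name: ("Good" if up else "Bad") for name, up in sub_services.items()
        },
    }


# nova-scheduler is down, but the API call to list sub-services succeeded.
print(evaluate_service(True, {"nova-api": True, "nova-scheduler": False}))
```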
You can view both the current status and the historical status over a specified period of time in the Dashboard. Select the name of a service from Services in the context menu at the top, and select Dashboard from the left pane.
Figure 25 shows the OpenStack Keystone nodes real-time availability.
Figure 26 shows the OpenStack Keystone nodes historical availability.
Figure 27 shows the OpenStack Nova nodes real-time availability.
Figure 28 shows the OpenStack Nova nodes historical availability.
Figure 29 shows the OpenStack Neutron nodes real-time availability.
Figure 30 shows the OpenStack Neutron nodes historical availability.
An alarm can be configured for any of the OpenStack services. In the Alarm pane, select the Service Alarms module. Then, select openstack from the Service drop-down list. The metrics for which alarms can be configured are broadly categorized into three scopes:
As with other alarms, notifications can also be configured for any OpenStack service alarm, as shown in Figure 31.
SLA profiles can be configured for Nova, Neutron, and Keystone by navigating to Settings > SLA Settings. Then select the appropriate tab for the service. A list of rules can be defined for both Health and Risk.
The OpenStack configuration parameters provided during AppFormix installation are sufficient for monitoring OpenStack services. No additional configuration is required. To modify the current values, from the Settings menu, select Service Settings. Then select the OpenStack Services tab. Figure 32 shows the OpenStack services settings and configuration parameters.
OpenStack depends on RabbitMQ to deliver messages between services. AppFormix Service Monitoring can be used to monitor RabbitMQ metrics through real-time charts. Service alarms can also be configured for these metrics.
The connectivity of nodes for each of the configured Rabbit clusters is recorded periodically. You can view both the current status and the historical status over a specified period of time by selecting Services > RabbitMQ from the context menu at the top, and then selecting Dashboard in the left pane.
The Dashboard also provides detailed metrics for a single RabbitMQ cluster, as shown in Figure 33. Select Dashboard in the left pane, then Services > RabbitMQ in the top context menu, and then select a Rabbit Cluster by name.
The counters in the top pane display the number of active channels, connections, consumers, exchanges, and queues. Below, tables display statistics about message rates across the cluster, and per-node resource consumption.
For a real-time view of RabbitMQ metrics, select All Services > RabbitMQ from the context menu. Next, click the Charts icon in the left pane. Figure 34 shows RabbitMQ real-time metric charts.
To configure an alarm to monitor RabbitMQ metrics, select Alarms to open the Alarm pane. See Alarms. Select Service Alarms for the module and rabbit for the service. An alarm can be configured for a metric on a per-cluster, per-node, or per-queue basis. Select the appropriate metric scope, and then choose a metric to monitor. As with other alarms, you can optionally configure Notifications in the Advanced settings. Figure 35 shows the RabbitMQ alarm configuration pane.
For AppFormix to be able to collect metrics from RabbitMQ, the RabbitMQ management plug-in must be enabled, and AppFormix must be configured with user credentials to collect RabbitMQ metrics.
To configure RabbitMQ monitoring:
- Enable the RabbitMQ management plug-in by issuing the following commands on the host that runs RabbitMQ:
$ rabbitmq-plugins enable rabbitmq_management
$ service rabbitmq-server restart
- AppFormix requires RabbitMQ user credentials with privileges to read the metrics. You can use an existing RabbitMQ user with an administrator or monitoring role, or create a new user account. To create a user account with “monitoring” privileges, issue the following commands on the host that runs RabbitMQ:
$ rabbitmqctl add_user appformix mypassword
$ rabbitmqctl set_user_tags appformix monitoring
$ rabbitmqctl set_permissions -p / appformix "" "" ".*"
Replace the sample mypassword with a strong password.
- Verify the settings by opening http://<rabbit-host>:15672/ in a Web browser, and log in with the RabbitMQ user credentials.
- Configure AppFormix with the details of the RabbitMQ cluster. Click Settings from the Dashboard. In the Service Settings page, select the RabbitMQ tab. Enter the Rabbit Cluster URL from Step 1. Enter the username and password from Step 2. Click Setup. On success, the button changes to Submitted. Figure 36 shows the RabbitMQ URL and credential settings.
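The credentials can also be verified from a script instead of a browser. The sketch below queries the management plug-in's /api/overview endpoint with HTTP basic authentication and extracts the object counts (channels, connections, consumers, exchanges, and queues). It illustrates how to check the account and is not a description of how AppFormix collects metrics. The function names are hypothetical, and the base URL must be replaced with the real RabbitMQ host on port 15672.

```python
import base64
import json
import urllib.request


def fetch_overview(base_url, username, password, timeout=5.0):
    """GET <base_url>/api/overview from the RabbitMQ management plug-in."""
    request = urllib.request.Request(base_url.rstrip("/") + "/api/overview")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def object_totals(overview):
    """Extract the object counts from an overview document, defaulting to 0."""
    totals = overview.get("object_totals", {})
    keys = ("channels", "connections", "consumers", "exchanges", "queues")
    return {key: totals.get(key, 0) for key in keys}


if __name__ == "__main__":
    # Placeholder URL and the sample credentials from Step 2.
    overview = fetch_overview("http://rabbit.example.com:15672", "appformix", "mypassword")
    print(object_totals(overview))
```

An HTTP 401 error from fetch_overview suggests that the user name, password, or monitoring tag needs to be rechecked.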
ScaleIO provides software-defined block storage. AppFormix metrics for ScaleIO performance and availability are available in real-time charts and alarms.
The AppFormix service monitoring dashboard for a ScaleIO cluster displays the overall state of the cluster and its components. It also displays real-time storage capacity and read/write bandwidths of the cluster, as shown in Figure 37.
To view cluster-wide metrics in the charts, select Services > ScaleIO from the top context menu. Select the Charts icon from the left pane. Figure 38 shows the ScaleIO service summary of cluster metrics in a chart view.
Real-Time Status of ScaleIO Components
AppFormix monitors the real-time status of every element of the ScaleIO cluster. You can select an element from the Resource drop-down list.
Figure 39 shows the real-time status of SDS elements of the ScaleIO cluster.
Figure 40 shows the real-time status of SDC elements of the ScaleIO cluster.
Figure 41 shows the real-time status of the protection domains of the ScaleIO cluster.
Figure 42 shows the real-time status of the storage pools of the ScaleIO cluster.
Figure 43 shows the real-time status of the devices of the ScaleIO cluster.
Figure 44 shows the real-time status of the volumes of the ScaleIO cluster.
An alarm can be configured for any of the ScaleIO metrics collected. In the Alarm pane, select the Service Alarms module. Then select scaleio from the Service drop-down list. Notifications can also be configured for ScaleIO alarms, as shown in Figure 45.
Per-Instance Storage Volume Metrics
When a virtual machine mounts a storage volume, AppFormix Agent monitors the disk latency and throughput to the network-attached storage volume. Instance metrics for storage I/O and latency (such as disk.* metrics) are available on a per-volume basis in the charts. An alarm on such a metric indicates the volume for which the alarm triggered.
For AppFormix to monitor ScaleIO metrics, a ScaleIO user with admin authorization for the cluster must exist. ScaleIO cluster connection details can be configured in AppFormix. From the Settings menu, select Service Settings. Then, select the ScaleIO tab.
Enter the cluster name and host on which ScaleIO runs. Enter the username and password, then click Setup. On success, the button changes to Submitted. Figure 46 shows the ScaleIO services and credentials settings.
The OpenStack Object Store project, known as Swift, offers cloud storage software so that you can store and retrieve lots of data with a simple API. It's built for scale and optimized for durability, availability, and concurrency across the entire data set. Swift is ideal for storing unstructured data that can grow without bound.
OpenStack Swift Service Hierarchy
The Object Storage system organizes data in a three-level hierarchy: an account holds containers, and each container holds objects.
AppFormix provides an easy way for you to examine the object storage usage of your OpenStack cluster. AppFormix automatically discovers all of the Swift Containers in your OpenStack cluster and shows you the details of these discovered Swift Containers. AppFormix syncs with OpenStack every minute and updates the Swift Containers information.
Select Dashboard > Services > Swift to view all of the Swift Containers in your OpenStack cluster in the AppFormix Dashboard, as shown in Figure 47.
AppFormix provides the following information for a Swift Container: Project Name, Container Name, Container Id, Container Size, and Object Count. Figure 48 shows an example of a Swift Container displayed in the AppFormix Dashboard.
There is no explicit configuration for the Swift service. The Swift service is discovered as part of an OpenStack cluster, using the OpenStack credentials provided during installation.
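The five fields listed for a Swift Container map naturally onto a small record type, which makes per-project totals easy to compute. The sketch below is illustrative only: the class, field, and function names are hypothetical, the sample values are made up, and treating Container Size as a byte count is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SwiftContainer:
    project_name: str
    container_name: str
    container_id: str
    size_bytes: int  # assumption: Container Size expressed in bytes
    object_count: int


def usage_by_project(containers):
    """Sum container size and object count for each project."""
    totals = defaultdict(lambda: {"size_bytes": 0, "object_count": 0})
    for container in containers:
        entry = totals[container.project_name]
        entry["size_bytes"] += container.size_bytes
        entry["object_count"] += container.object_count
    return dict(totals)


# Made-up sample records for illustration.
sample = [
    SwiftContainer("demo", "logs", "c-1", 1024, 3),
    SwiftContainer("demo", "images", "c-2", 4096, 2),
    SwiftContainer("admin", "backups", "c-3", 2048, 1),
]
print(usage_by_project(sample))
```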