
    Metrics

    A metric is a measured value for an element in the infrastructure. AppFormix Agent collects and calculates metrics for hosts and instances. AppFormix metrics are organized into hierarchical categories based on the type of metric.

    Some metrics are percentages of total capacity. In such cases, the category of the metric determines the total capacity by which the percentage is computed. For instance, host.cpu.usage indicates the percentage of CPU consumed relative to the total CPU available on a host. In contrast, instance.cpu.usage is the percentage of CPU consumed relative to the total CPU available to an instance. As an example, consider an instance that is using 50% of one core on a host with 20 cores. The instance's host.cpu.usage will be 2.5%. If the instance has been allocated two cores, then its instance.cpu.usage will be 25%.
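    The arithmetic in the example above can be sketched as follows. This is an illustrative calculation only, not AppFormix code; the function name cpu_usage_percent and its parameters are hypothetical.

```python
def cpu_usage_percent(cores_consumed: float, total_cores: float) -> float:
    """Return CPU consumed as a percentage of the given total capacity.

    Hypothetical helper: the metric category decides which total is used
    (all cores on the host, or only the cores allocated to the instance).
    """
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return 100.0 * cores_consumed / total_cores


# The instance is using 50% of one core, that is, 0.5 cores.
consumed = 0.5

# host.cpu.usage: relative to the 20 cores available on the host.
host_view = cpu_usage_percent(consumed, total_cores=20)

# instance.cpu.usage: relative to the 2 cores allocated to the instance.
instance_view = cpu_usage_percent(consumed, total_cores=2)

print(f"host.cpu.usage = {host_view}%, instance.cpu.usage = {instance_view}%")
```

    The same consumption yields two different percentages because only the denominator changes with the metric category.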

    Alarms can be configured for any metric. Many metrics can also be displayed in charts. When an alarm triggers for a metric, the alarm is plotted on charts at the time of the event. In this way, metrics that cannot be plotted directly as a chart are still visually correlated in time with other metrics.

    AppFormix Agent collects both raw metrics and calculated metrics. Raw metrics are values read directly from the underlying infrastructure. Calculated metrics are metrics that AppFormix Agent derives from raw metrics.

    Hosts

    Table 1 lists the raw metrics available for hosts.

    Table 1: Raw Metrics for Hosts

    Metric                                              Chart   Alarm
    host.cpu.io_wait                                    x       x
    host.cpu.ipc **                                     x       x
    host.cpu.l3_cache.miss **                           x       x
    host.cpu.l3_cache.usage **                          x       x
    host.cpu.mem_bw.local **                            x       x
    host.cpu.mem_bw.remote **                           x       x
    host.cpu.mem_bw.total **                            x       x
    host.cpu.usage                                      x       x

    host.disk.io.read                                   x       x
    host.disk.io.write                                  x       x
    host.disk.response_time                             x       x
    host.disk.read_response_time                        x       x
    host.disk.write_response_time                       x       x
    host.disk.smart.hdd.command_timeout                         x
    host.disk.smart.hdd.current_pending_sector_count            x
    host.disk.smart.hdd.offline_uncorrectable                   x
    host.disk.smart.hdd.reallocated_sector_count                x
    host.disk.smart.hdd.reported_uncorrectable_errors           x
    host.disk.smart.ssd.available_reserved_space                x
    host.disk.smart.ssd.media_wearout_indicator                 x
    host.disk.smart.ssd.reallocated_sector_count                x
    host.disk.smart.ssd.wear_leveling_count                     x
    host.disk.usage.bytes                               x       x
    host.disk.usage.percent                             x       x

    host.memory.usage                                   x       x
    host.memory.swap.usage                              x       x
    host.memory.dirty.rate                              x       x
    host.memory.page_fault.rate                         x       x
    host.memory.page_in_out.rate                        x       x

    host.network.egress.bit_rate                        x       x
    host.network.egress.drops                           x       x
    host.network.egress.errors                          x       x
    host.network.egress.packet_rate                     x       x
    host.network.ingress.bit_rate                       x       x
    host.network.ingress.drops                          x       x
    host.network.ingress.errors                         x       x
    host.network.ingress.packet_rate                    x       x
    host.network.ipv4tables.rule_count                  x       x
    host.network.ipv6tables.rule_count                  x       x

    openstack.host.disk_allocated                       x       x
    openstack.host.memory_allocated                     x       x
    openstack.host.vcpus_allocated                      x       x

    Note: ** CPU cache and memory bandwidth metrics are available for the Intel® Xeon® processor family with Intel® Resource Director Technology. The AppFormix software automatically detects the processor family and makes the additional metrics available for display and analysis.

    Table 2 lists the calculated metrics available for hosts.

    Table 2: Calculated Metrics for Hosts

    Metric                              Chart   Alarm
    host.cpu.normalized_load_1m         x       x
    host.cpu.normalized_load_5m         x       x
    host.cpu.normalized_load_15m        x       x
    host.cpu.temperature                        x

    host.disk.smart.predict_failure             x

    host.heartbeat                              x

    host.cpu.normalized_load
        Normalized load is calculated as the ratio of the number of running and ready-to-run threads to the number of CPU cores. This family of metrics indicates the level of demand for CPU. If the value exceeds 1, then more threads are ready to run than there are CPU cores to execute them. Normalized load is provided as an average over 1-minute, 5-minute, and 15-minute intervals.

    host.cpu.temperature
        CPU temperature is derived from multiple temperature sensors in the processor(s) and chassis. This temperature provides a general indicator of temperature in degrees Celsius inside a physical host.

    host.disk.smart.predict_failure
        AppFormix Agent calculates predict_failure using multiple S.M.A.R.T. counters provided by disk hardware. The agent sets predict_failure to true (value=1) when it determines from a combination of S.M.A.R.T. counters that a disk is likely to fail. An alarm triggered for this metric contains the disk identifier in the metadata.

    host.heartbeat
        The host.heartbeat metric indicates whether AppFormix Agent is functioning on a host. AppFormix Controller periodically checks the status of each host by making a status request to AppFormix Agent. The host.heartbeat metric is incremented for each successful response. Alarms can be configured to detect missed heartbeats over a given interval.
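
    As a rough illustration of the normalized load definition above, the ratio can be sketched as below. The function names and the sample numbers are hypothetical; how AppFormix Agent samples and averages the thread counts is not described here, so the sketch simply takes an already averaged thread count as input.

```python
def normalized_load(avg_runnable_threads: float, cpu_cores: int) -> float:
    """Ratio of running plus ready-to-run threads to CPU cores.

    avg_runnable_threads is assumed to be an average over the chosen
    interval (1, 5, or 15 minutes); this helper does not do the sampling.
    """
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return avg_runnable_threads / cpu_cores


def cpu_demand_exceeds_capacity(load: float) -> bool:
    """True when more threads are ready to run than there are cores."""
    return load > 1.0


# Hypothetical host with 20 cores.
busy = normalized_load(avg_runnable_threads=30, cpu_cores=20)
idle = normalized_load(avg_runnable_threads=10, cpu_cores=20)

print(busy, cpu_demand_exceeds_capacity(busy))
print(idle, cpu_demand_exceeds_capacity(idle))
```

    Dividing by the core count makes the value comparable across hosts of different sizes, which is why a single threshold of 1 can be used for every host.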

    Instances

    Table 3 lists the raw metrics available for instances.

    Table 3: Raw Metrics for Instances

    Metric                                    Chart   Alarm
    instance.cpu.usage                        x       x
    instance.cpu.ipc **                       x       x
    instance.cpu.l3_cache.miss **             x       x
    instance.cpu.l3_cache.usage **            x       x
    instance.cpu.mem_bw.local **              x       x
    instance.cpu.mem_bw.remote **             x       x
    instance.cpu.mem_bw.total **              x       x

    instance.disk.io.read_bandwidth           x       x
    instance.disk.io.read_iops                x       x
    instance.disk.io.read_iosize              x       x
    instance.disk.io.read_response_time       x       x
    instance.disk.io.write_bandwidth          x       x
    instance.disk.io.write_iops               x       x
    instance.disk.io.write_iosize             x       x
    instance.disk.io.write_response_time      x       x
    instance.disk.usage.bytes                 x       x
    instance.disk.usage.percentage            x       x

    instance.memory.usage                     x       x

    instance.network.egress.bit_rate          x       x
    instance.network.egress.drops             x       x
    instance.network.egress.errors            x       x
    instance.network.egress.packet_rate       x       x
    instance.network.egress.total_bytes       x       x
    instance.network.egress.total_packets     x       x
    instance.network.ingress.bit_rate         x       x
    instance.network.ingress.drops            x       x
    instance.network.ingress.errors           x       x
    instance.network.ingress.packet_rate      x       x
    instance.network.ingress.total_bytes      x       x
    instance.network.ingress.total_packets    x       x

    Note: ** CPU cache and memory bandwidth metrics are available for the Intel® Xeon® processor family with Intel® Resource Director Technology. The AppFormix software automatically detects the processor family and makes the additional metrics available for display and analysis.

    Table 4 lists the calculated metric available for instances.

    Table 4: Calculated Metrics for Instances

    Metric                  Chart   Alarm
    instance.heartbeat              x

    instance.heartbeat
        The instance.heartbeat metric indicates whether an instance is running. AppFormix Agent periodically checks the state of host processes associated with each instance. The instance.heartbeat metric is incremented for each successful status check. Alarms can be configured to detect missed heartbeats over a given interval.
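
    One way to reason about missed heartbeats is to compare how much the counter grew over an interval with how many checks were expected in that interval. The sketch below assumes the number of expected checks is known; the function names, parameters, and threshold are hypothetical and do not describe how AppFormix evaluates its alarms.

```python
def missed_heartbeats(count_start: int, count_end: int, expected_checks: int) -> int:
    """Number of expected checks that did not increment the heartbeat counter.

    count_start and count_end are heartbeat counter values read at the
    start and end of the interval (assumed, for this sketch, not to reset).
    """
    observed = count_end - count_start
    return max(0, expected_checks - observed)


def heartbeat_alarm(count_start: int, count_end: int,
                    expected_checks: int, max_missed: int = 0) -> bool:
    """True when more heartbeats were missed than the threshold allows."""
    return missed_heartbeats(count_start, count_end, expected_checks) > max_missed


# Hypothetical interval in which 5 checks were expected.
print(missed_heartbeats(100, 104, expected_checks=5))   # one check did not succeed
print(heartbeat_alarm(100, 104, expected_checks=5))     # alarm with zero tolerance
print(heartbeat_alarm(100, 105, expected_checks=5))     # all checks succeeded
```

    The same reasoning applies to host.heartbeat, where the counter is incremented for each successful response from AppFormix Agent.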

    Network Devices

    AppFormix can collect network device metrics using SNMP or Juniper Telemetry Interface (JTI). See Network Devices for details.

    Table 5 lists the metrics available per interface with SNMP network device monitoring.

    Table 5: Metrics Available per Interface with SNMP Network Device Monitoring

    Metric                              Unit        Chart   Alarm
    interface.out_discards              discards/s  x       x
    interface.in_discards               discards/s  x       x
    interface.in_errors                 errors/s    x       x
    interface.out_unicast_packets       packets/s   x       x
    interface.in_octets                 octets/s    x       x
    interface.in_unicast_packets        packets/s   x       x
    interface.out_packet_queue_length   count       x       x
    interface.speed                     bits/s      x       x
    interface.out_octets                octets/s    x       x
    interface.in_unknown_protocol       packets/s   x       x
    interface.in_non_unicast_packets    packets/s   x       x
    interface.out_errors                errors/s    x       x
    interface.out_non_unicast_packets   packets/s   x       x
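
    Because the octet metrics are reported in octets/s while interface.speed is reported in bits/s, combining them requires a unit conversion. The sketch below shows one possible derived value, link utilization as a percentage; it is not an AppFormix metric, and the function name and sample values are hypothetical.

```python
BITS_PER_OCTET = 8


def link_utilization_percent(octets_per_s: float, speed_bits_per_s: float) -> float:
    """Share of the interface speed consumed by traffic in one direction.

    octets_per_s would come from interface.in_octets or interface.out_octets,
    speed_bits_per_s from interface.speed (units as listed in Table 5).
    """
    if speed_bits_per_s <= 0:
        raise ValueError("speed_bits_per_s must be positive")
    bits_per_s = octets_per_s * BITS_PER_OCTET
    return 100.0 * bits_per_s / speed_bits_per_s


# Hypothetical sample: 12,500,000 octets/s received on a 1 Gbit/s interface.
ingress = link_utilization_percent(12_500_000, 1_000_000_000)
print(f"ingress utilization: {ingress:.1f}%")
```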

    Table 6 lists the metrics available per interface with JTI network device monitoring.

    Table 6: Metrics Available per Interface with JTI Network Device Monitoring

    Metric                                                Unit        Chart   Alarm
    interface.egress_errors.if_errors                     errors/s    x       x
    interface.egress_errors.if_discard                    discards/s  x       x
    interface.egress_stats.if_1sec_pkts                   packets/s   x       x
    interface.egress_stats.if_octets                      octets/s    x       x
    interface.egress_stats.if_mc_pkts                     packets/s   x       x
    interface.egress_stats.if_bc_pkts                     packets/s   x       x
    interface.egress_stats.if_1sec_octets                 octets/s    x       x
    interface.egress_stats.if_pkts                        packets/s   x       x
    interface.egress_stats.if_uc_pkts                     packets/s   x       x
    interface.egress_stats.if_pause_pkts                  packets/s   x       x
    interface.ingress_errors.if_in_fifo_errors            errors/s    x       x
    interface.ingress_errors.if_in_frame_errors           errors/s    x       x
    interface.ingress_errors.if_in_l3_incompletes         packets/s   x       x
    interface.ingress_errors.if_in_runts                  packets/s   x       x
    interface.ingress_errors.if_errors                    errors/s    x       x
    interface.ingress_errors.if_in_l2chan_errors          errors/s    x       x
    interface.ingress_errors.if_in_resource_errors        errors/s    x       x
    interface.ingress_errors.if_in_qdrops                 drops/s     x       x
    interface.ingress_errors.if_in_l2_mismatch_timeouts   packets/s   x       x
    interface.ingress_stats.if_1sec_pkts                  packets/s   x       x
    interface.ingress_stats.if_octets                     octets/s    x       x
    interface.ingress_stats.if_mc_pkts                    packets/s   x       x
    interface.ingress_stats.if_bc_pkts                    packets/s   x       x
    interface.ingress_stats.if_1sec_octets                octets/s    x       x
    interface.ingress_stats.if_error                      errors/s    x       x
    interface.ingress_stats.if_pkts                       packets/s   x       x
    interface.ingress_stats.if_uc_pkts                    packets/s   x       x
    interface.ingress_stats.if_pause_pkts                 packets/s   x       x

    Table 7 lists the metrics available per interface queue with JTI network device monitoring.

    Table 7: Metrics Available per Interface Queue with JTI Network Device Monitoring

    Metric                                              Unit        Chart   Alarm
    interface.egress_queue_info.peak_buffer_occupancy   count/s     x       x
    interface.egress_queue_info.rl_drop_bytes           drops/s     x       x
    interface.egress_queue_info.packets                 packets/s   x       x
    interface.egress_queue_info.rl_drop_packets         drops/s     x       x
    interface.egress_queue_info.bytes                   bytes/s     x       x
    interface.egress_queue_info.allocated_buffer_size   count/s     x       x
    interface.egress_queue_info.tail_drop_packets       drops/s     x       x
    interface.egress_queue_info.red_drop_packets        drops/s     x       x
    interface.egress_queue_info.red_drop_bytes          drops/s     x       x
    interface.egress_queue_info.cur_buffer_occupancy    count/s     x       x
    interface.egress_queue_info.avg_buffer_occupancy    count/s     x       x

    OpenContrail vRouter on a Host

    Table 8 lists raw metrics available for an OpenContrail vRouter on a host.

    Table 8: Raw Metrics for OpenContrail vRouter

    Metric                                                          Chart   Alarm
    plugin.contrail.vrouter.aged_flows                              x       x
    plugin.contrail.vrouter.total_flows                             x       x
    plugin.contrail.vrouter.exception_packets                       x       x
    plugin.contrail.vrouter.drop_stats_flow_queue_limit_exceeded    x       x
    plugin.contrail.vrouter.drop_stats_flow_table_full              x       x
    plugin.contrail.vrouter.drop_stats_vlan_fwd_enq                 x       x
    plugin.contrail.vrouter.drop_stats_vlan_fwd_tx                  x       x
    plugin.contrail.vrouter.flow_export_drops                       x       x
    plugin.contrail.vrouter.flow_export_sampling_drops              x       x
    plugin.contrail.vrouter.flow_rate_active_flows                  x       x
    plugin.contrail.vrouter.flow_rate_added_flows                   x       x
    plugin.contrail.vrouter.flow_rate_deleted_flows                 x       x

    OpenStack Project in Chart View

    Table 9 lists the raw metrics available in the OpenStack Project Chart View.

    Table 9: Raw Metrics for OpenStack Project

    Metric                                        Chart   Alarm
    openstack.project.active_instances            x       x
    openstack.project.vcpus_allocated             x       x
    openstack.project.volume_storage_allocated    x       x
    openstack.project.memory_allocated            x       x
    openstack.project.floating_ip_count           x
    openstack.project.security_group_count        x       x
    openstack.project.volume_count                x       x

    Modified: 2017-11-12