Error Notifications for JSA Appliances

 

Error notifications in JSA products require a response by the user or the administrator.

Out Of Memory Error

38750004 - Application ran out of memory

Explanation

When JSA components attempt to use more memory than is allocated to them, the application or service can stop working. Out-of-memory issues are caused by software, or by user-defined queries and operations, that exhaust the available memory.

User Response

Review the following resolutions:

  • Review the error message that is written to the /var/log/qradar.log file to determine which component failed.

  • If the Ariel proxy server is searching through large amounts of data or is using a grouping option that generates unique values in the search results, reduce the number of unique values or reduce the time frame of the search.

  • If the accumulator is generating a time series graph with many aggregated unique values, reduce the size of the query.

  • If a protocol-based log source is recently enabled, decrease the polling period to reduce the data queried. If multiple protocol-based log sources are running at the same time, stagger the start times.

  • If a rule recently changed to track unique properties over long periods of time, reduce the time frame by half or reduce the number of matching events by adding another filter.
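
The first resolution, reviewing the /var/log/qradar.log file, can be sketched in Python. This is a minimal sketch and not a JSA tool. The marker string OutOfMemoryError is an assumption about how the failure is logged, and the helper names find_marker_lines and scan_log are hypothetical. Change the marker to match the message that you see in your own log.

```python
from typing import Iterable, List

# Assumption: out-of-memory failures are logged with this text.
# Change the marker if your log uses a different message.
DEFAULT_MARKER = "OutOfMemoryError"


def find_marker_lines(lines: Iterable[str], marker: str = DEFAULT_MARKER) -> List[str]:
    """Return the lines that contain the marker, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if marker in line]


def scan_log(path: str, marker: str = DEFAULT_MARKER) -> List[str]:
    """Read a log file and return the lines that contain the marker."""
    # errors="replace" keeps the scan going if the log holds undecodable bytes.
    with open(path, errors="replace") as log:
        return find_marker_lines(log, marker)
```

On the appliance, calling scan_log("/var/log/qradar.log") returns the matching lines. The text around each match usually names the component that failed.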

Disk Usage Exceeded Threshold

38750038 - Disk Sentry: Disk Usage Exceeded Max Threshold.

Explanation

At least one disk on your system is 95% full.

To prevent data corruption, some processes are shut down. Event collection is suspended until the disk usage falls below 92%.

User Response

Identify which partition is full, such as the / or /store file system. Free disk space by deleting files that are not required. For example, remove debug output and patch files from the / file system. If the /store file system is near capacity, reduce your retention settings for events and flows.

You can also manually delete older data in the /store/ariel/ directories. The system automatically restarts processes after you free enough disk space to fall below a threshold of 92% capacity.
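
The two thresholds described above (shut down at 95%, restart below 92%) can be modeled as follows. This is a minimal sketch under stated assumptions, not the Disk Sentry implementation: the function names are hypothetical, and the percentage that shutil.disk_usage reports can differ slightly from the figure the appliance uses.

```python
import shutil

STOP_PERCENT = 95.0    # usage at which processes are shut down (from the text)
RESUME_PERCENT = 92.0  # usage below which processes restart (from the text)


def percent_used(path: str) -> float:
    """Return the used share of the file system that holds path, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_suspended(was_suspended: bool, used_percent: float) -> bool:
    """Model the two thresholds: stop at 95%, resume only below 92%."""
    if used_percent >= STOP_PERCENT:
        return True
    if used_percent < RESUME_PERCENT:
        return False
    # Between the two thresholds the previous state is kept.
    return was_suspended
```

For example, is_suspended(True, 93.0) returns True: a system that stopped collecting at 95% stays suspended at 93% and resumes only after usage falls below 92%. Call percent_used("/store") and percent_used("/") to see which partition is closest to the limit.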

Process Monitor Application Failed to Start Multiple Times

38750043 - Process Monitor: Application has failed to start up multiple times.

Explanation

The system is unable to start an application or process on your system.

User Response

Review which components are failing. For example, JSA Flow Processor fails to start when no flow sources are assigned. Use the deployment actions to remove that Flow component.

Process Monitor Must Lower Disk Usage

38750045 - Process Monitor: Disk usage must be lowered.

Explanation

The process monitor is unable to start processes because of a lack of system resources. The storage partition on the system is likely 95% full or greater.

User Response

Free some disk space by manually deleting files or by changing your event or flow data retention policies. The system automatically restarts system processes when the used disk space falls below a threshold of 92% capacity.

Event Pipeline Dropped Events

38750060 - Events/Flows were dropped by the event pipeline.

Explanation

If there is an issue with the event pipeline or you exceed your license limits, an event or flow might be dropped.

Dropped events and flows cannot be recovered.

User Response

Review the following options:

  • Verify the incoming event and flow rates on your system. If the license is exceeded and the event pipeline is dropping events, expand your license to handle more data.

  • Review the recent changes to rules or custom properties. Rule or custom property changes can cause changes to your event or flow rates and might affect system performance.

  • Determine whether the issue is related to SAR notifications. SAR notifications might indicate that queued events and flows are in the event pipeline. The system usually routes events to storage, instead of dropping the events.

  • Tune the system to reduce the volume of events and flows that enter the event pipeline.

Event Pipeline Dropped Connections

38750061 - Connections were dropped by the event pipeline.

Explanation

A TCP-based protocol dropped an established connection to the system.

The number of connections that can be established by TCP-based protocols is limited to ensure that connections are established and events are forwarded. The event collection service (ECS) allows a maximum of 15,000 file handles and each TCP connection uses three file handles.
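
The connection ceiling that follows from these two figures can be worked out directly. This is a minimal sketch, and it assumes that every file handle is available to TCP connections. If the service uses handles for other purposes, the practical limit is lower.

```python
MAX_FILE_HANDLES = 15_000    # ECS limit stated in the text
HANDLES_PER_CONNECTION = 3   # file handles per TCP connection, stated in the text


def max_tcp_connections(handles: int = MAX_FILE_HANDLES,
                        per_connection: int = HANDLES_PER_CONNECTION) -> int:
    """Return the upper bound on simultaneous TCP connections."""
    return handles // per_connection


print(max_tcp_connections())  # prints 5000
```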

TCP protocols that provide drop connection notifications include the following protocols:

  • TCP syslog protocol

  • TLS syslog protocol

  • TCP multi-line protocol

User Response

Review the following options:

  • Distribute events to more appliances. Connections to other event and flow processors distribute the workload from the console.

  • Configure low priority TCP log source events to use the UDP network protocol.

  • Tune the system to reduce the volume of events and flows that enter the event pipeline.

Automatic Update Error

38750066 - Automatic updates could not complete installation. See the Auto Update Log for details.

Explanation

The update process encountered an error or cannot connect to an update server. The system is not updated.

User Response

Select one of the following options:

  • Verify the automatic update history to determine the cause of the installation error.

    In the Admin tab, click the Auto Update icon and select View Log.

  • Verify that your console can connect to the update server.

    In the Updates window, select Change Settings, then click the Advanced tab to view your automatic update configuration. Verify the address in the Web Server field to ensure that the automatic update server is accessible.

Auto Update Installed with Errors

38750067 - Automatic updates installed with errors. See the Auto Update Log for details.

Explanation

The most common reason for automatic update errors is a missing software dependency for a DSM, protocol, or scanner update.

User Response

Select one of the following options:

  • In the Admin tab, click the Auto Update icon and select View Update History to determine the cause of the installation error. You can view, select, and then reinstall a failed RPM.

  • If an auto update is unable to reinstall through the user interface, manually download and install the missing dependency on your console. The console replicates the installed file to all managed hosts.

Standby High-availability (HA) System Failure

38750080 - Standby HA System Failure.

Explanation

The status of the secondary appliance switches to failed and the system has no HA protection.

User Response

Review the following resolutions:

  • Restore the secondary system.

    Click the Admin tab, click System and License Management, and then click Restore System.

  • Inspect the secondary HA appliance to determine whether it is powered down or experienced a hardware failure.

  • Use the ping command to check the communication between the primary and standby system.

  • Check the switch that connects the primary and secondary HA appliances.

    Verify the iptables rules on the primary and secondary appliances.

  • Review the /var/log/qradar.log file on the standby appliance to determine the cause of the failure.
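
The ping check in the list above can be scripted. This is a minimal sketch that assumes the Linux ping command with its -c (probe count) and -W (seconds to wait for each reply) options. The function names are hypothetical, and the address 192.0.2.10 in the usage note is a placeholder.

```python
import subprocess
from typing import List


def ping_command(host: str, count: int = 3, timeout_s: int = 2) -> List[str]:
    """Build a Linux ping command: count probes, timeout_s seconds per reply."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def is_reachable(host: str) -> bool:
    """Return True if ping exits with status 0, meaning a reply arrived."""
    result = subprocess.run(
        ping_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Run is_reachable("192.0.2.10") from the primary appliance with the address of the standby appliance, and then repeat the check in the other direction. A False result in only one direction points toward a firewall rule rather than a powered-down host.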

Active High-availability (HA) System Failure

38750081 - Active HA System Failure.

Explanation

The active system cannot communicate with the standby system because the active system is unresponsive or failed. The standby system takes over operations from the failed active system.

User Response

Review the following resolutions:

  • Inspect the active HA appliance to determine whether it is powered down or experienced a hardware failure.

  • If the active system is the primary HA, restore the active system.

    Click the Admin tab and click System and License Management. From the High Availability menu, select the Restore System option.

  • Review the /var/log/qradar.log file on the standby appliance to determine the cause of the failure.

  • Use the ping command to check the communication between the active and standby system.

  • Check the switch that connects the active and standby HA appliances.

    Verify the iptables rules on the active and standby appliances.

Failed to Install High Availability

38750086 - There was a problem installing High Availability on the cluster.

Explanation

When you install a high availability (HA) appliance, the installation process links the primary and secondary appliances. The configuration and installation process contains a time interval to determine when an installation requires attention. The high-availability installation exceeded the six-hour time limit.

No HA protection is available until the issue is resolved.

User Response

Contact Juniper Customer Support.

Failed to Uninstall a High-availability (HA) Appliance

38750087 - There was a problem while removing High Availability on the cluster.

Explanation

When you remove an HA appliance, the installation process removes connections and data replication processes between the primary and secondary appliances. If the installation process cannot remove the HA appliance from the cluster properly, the primary system continues to work normally.

User Response

Try to remove the HA appliance a second time.

Scanner Initialization Error

38750089 - A scanner failed to initialize.

Explanation

A scheduled vulnerability scan is unable to connect to an external scanner to begin the scan import process.

Scan initialization issues are typically caused by credential problems or connectivity issues to the remote scanner. Scanners that fail to initialize display detailed error messages in the hover text of a scheduled scan with a status of failed.

User Response

Follow these steps:

  1. Click the Admin tab.

  2. On the navigation menu, click Data Sources.

  3. Click the Schedule VA Scanners icon.

  4. From the scanner list, hover the cursor in the Status column of any scanner to display a detailed success or failure message.

Scan Failure Error

38750090 - A scanner has failed.

Explanation

A scheduled vulnerability scan failed to import vulnerability data. Scan failures are typically caused by configuration or performance issues that result from a large volume of data to import. Scan failures can also occur when a scan report that is downloaded by the system is in an unreadable format.

User Response

Follow these steps:

  1. Click the Admin tab.

  2. On the navigation menu, click Data Sources.

  3. Click Schedule VA Scanners.

  4. From the scanner list, hover the cursor in the Status column of any scanner to display a detailed success or failure message.

Filter Initialization Failed

38750091 - Traffic analysis filter failed to initialize.

Explanation

If a configuration is not saved correctly, or if a configuration file is corrupted, the event collection service (ECS) might fail to initialize. If the traffic analysis process is not started, new log sources are not automatically discovered.

User Response

Select one of the following options:

  • Manually create log sources for any new appliances or event sources until the traffic analysis process is working.

    All new event sources are classified as SIM Generic until they are mapped to a log source.

  • If you get an automatic update error, review the automatic update log to determine whether an error occurred when a DSM or a protocol was installed.

Disk Storage Unavailable

38750092 - Disk Sentry has detected that one or more storage partitions are not accessible.

Explanation

The disk sentry did not receive a response within 30 seconds. A storage partition issue might exist, or the system might be under heavy load and not able to respond within the 30-second threshold.

User Response

Select one of the following options:

  • Verify the status of your /store partition by using the touch command.

    If the system responds to the touch command, the unavailability of the disk storage is likely due to system load.

  • Determine whether the notification corresponds to dropped events.

    JSA drops events when it cannot write events to disk. Investigate the status of storage partitions.
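
The touch check can be timed against the 30-second threshold that is described in the explanation. This is a minimal sketch, not the disk sentry itself. The function name touch_check and the probe file name are hypothetical. If the partition hangs, the call blocks until the file system answers, and if the partition is not accessible the call raises OSError.

```python
import time
from pathlib import Path
from typing import Tuple

RESPONSE_LIMIT_S = 30.0  # disk sentry threshold stated in the text


def touch_check(directory: str, name: str = ".touch_probe") -> Tuple[bool, float]:
    """Touch a probe file in directory; return (responded_in_time, elapsed_seconds)."""
    probe = Path(directory) / name
    start = time.monotonic()
    probe.touch()    # same effect as the touch command
    elapsed = time.monotonic() - start
    probe.unlink()   # remove the probe so that no file is left behind
    return elapsed < RESPONSE_LIMIT_S, elapsed
```

Calling touch_check("/store") on the appliance returns a quick True result when the partition is healthy. A slow but successful result suggests system load rather than a storage fault.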

Insufficient Disk Space to Export Data

38750096 - Insufficient disk space to complete data export request.

Explanation

If the export directory does not contain enough space, the export of event, flow, and offense data is canceled.

User Response

Select one of the following options:

  • Free some disk space in the /store/exports directory.

  • Configure the Export Directory property in the System Settings window to use a partition that has sufficient disk space.

  • Configure an offboard storage device.

Accumulator is Falling Behind

38750099 - The accumulator was unable to aggregate all events/flows for this interval.

Explanation

This message appears when the system is unable to accumulate data aggregations within a 60-second interval.

Every minute, JSA creates data aggregations for each aggregated search. The data aggregations are used in time-series graphs and reports and must be completed within a 60-second interval. If the count of searches and unique values in the searches are too large, the time that is required to process the aggregations might exceed 60 seconds. Time-series graphs and reports might be missing columns for the time period when the problem occurred.

You do not lose data when this problem occurs. All raw data, events, and flows are still written to disk. Only the accumulations, which are data sets that are generated from stored data, are incomplete.

User Response

The following factors might contribute to the increased workload that is causing the accumulator to fall behind:

  • Frequency of the incomplete accumulations: If the accumulation fails only once or twice a day, the drops might be caused by increased system load due to large searches, data compression cycles, or data backup.

    Infrequent failures can be ignored. If the failures occur multiple times per day, during all hours, you might want to investigate further.

  • High system load: If other processes use many system resources, the increased system load can cause the aggregations to be slow. Review the cause of the increased system load and address the cause, if possible.

    For example, if the failed accumulations occur during a large data search that takes a long time to complete, you might prevent the accumulator drops by reducing the size of the saved search.

  • Large accumulator demands: If the accumulator intervals are dropped regularly, you might need to reduce the workload.

    The workload of the accumulator is driven by the number of aggregations and the number of unique objects in those aggregations. The number of unique objects in an aggregation depends on the group-by parameters and the filters that are applied to the search.

    For example, a search that aggregates for services filters the data by using a local network hierarchy item, such as a DMZ area. Grouping by IP address might result in a search that contains up to 200 unique objects. If you add destination ports to the search, and each server hosts 5 to 10 services on different ports, the new aggregate of destination.ip + destination.port can increase the number of unique objects to 2000. If you add the source IP address to the aggregate and you have thousands of remote IP addresses that hit each service, the aggregated view might have hundreds of thousands of unique values. This search creates a heavy demand on the accumulator.

    To review the aggregated views that put the highest demand on the accumulator:

    1. On the Admin tab, click Aggregated Data Management.

    2. Click the Data Written column to sort in descending order and show the largest views.

    3. Review the business case for each of the largest aggregations to see whether they are still required.
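
The growth in unique objects in the DMZ example above follows from multiplying the number of distinct values in each group-by field. This is a minimal sketch with a hypothetical function name. The product is an upper bound, because only the combinations that actually occur in the data are counted.

```python
from math import prod


def max_unique_objects(*distinct_values_per_field: int) -> int:
    """Return the upper bound on unique objects for the given group-by fields."""
    return prod(distinct_values_per_field)


ip_only = max_unique_objects(200)          # 200 destination IP addresses
ip_and_port = max_unique_objects(200, 10)  # up to 10 ports per address

print(ip_only, ip_and_port)  # prints 200 2000
```

Each additional group-by field multiplies the bound again, which is why removing one high-cardinality field from an aggregated search can reduce the accumulator workload sharply.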

CRE Failed to Read Rules

38750107 - The last attempt to read in rules (usually due to a rule change) has failed. Please see the message details and error log for information on how to resolve this.

Explanation

The custom rules engine (CRE) on an event processor is unable to read a rule to correlate an incoming event. The notification might contain one of the following messages:

  • If the CRE was unable to read a single rule, in most cases, a recent rule change is the cause. The payload of the notification message identifies the rule, or the rule in the rule chain, that is responsible.

  • In rare circumstances, data corruption can cause a complete failure of the rule set. An application error is displayed and the rule editor interface might become unresponsive or generate more errors.

User Response

For a single rule read error, review the following options:

  • To locate the rule that is causing the notification, temporarily disable the rule.

  • Edit the rule to revert any recent changes.

  • Delete and re-create the rule that is causing the error.

For application errors where the CRE failed to read rules, contact Juniper Customer Support.

Accumulator Cannot Read the View Definition for Aggregate Data

38750108 - Accumulator: Cannot read the aggregated data view definition in order to prevent an out of sync problem. Aggregated data views can no longer be created or loaded. Time series graphs will no longer work as well as reporting.

Explanation

A synchronization issue occurred. The aggregate data view configuration that is in memory wrote erroneous data to the database.

To prevent data corruption, the system disables aggregate data views. When aggregate data views are disabled, time series graphs, saved searches, and scheduled reports display empty graphs.

User Response

Contact Juniper Customer Support.

Store and Forward Schedule Did Not Forward All Events

38750109 - A store and forward schedule finished while events were left on disk. These events will be stored on the local event collector until the next forwarding sessions begins.

Explanation

If the schedule contains a short start and end time or many events to forward, the event collector might not have sufficient time to transfer the queued events. Events are stored until the next opportunity to forward events. When the next store and forward interval occurs, the events are forwarded to the event processor.

User Response

Increase the event forwarding rate from your event collector or increase the time interval that is configured for forwarding events.
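
Whether a schedule can forward everything that is queued can be estimated from the queue size, the forwarding rate, and the length of the forwarding window. This is a minimal sketch with hypothetical function names and made-up figures. It ignores events that arrive while forwarding is in progress.

```python
def seconds_to_forward(queued_events: int, rate_eps: float) -> float:
    """Return the time needed to forward the queue at rate_eps events per second."""
    if rate_eps <= 0:
        raise ValueError("rate_eps must be positive")
    return queued_events / rate_eps


def fits_in_window(queued_events: int, rate_eps: float, window_s: float) -> bool:
    """Return True if the queue can be forwarded within the window."""
    return seconds_to_forward(queued_events, rate_eps) <= window_s
```

For example, 7,200,000 queued events at 1000 events per second need 7200 seconds, so fits_in_window(7_200_000, 1000, 3600) returns False for a one-hour window. Doubling the rate to 2000 events per second, or doubling the window to two hours, makes the check return True.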

Disk Failure

38750110 - Disk Failure: Hardware Monitoring has determined that a disk is in failed state.

Explanation

On-board system tools detected that a disk failed. The notification message provides information about the failed disk and the slot or bay location of the failure.

User Response

If the notification persists, contact Juniper Customer Support or replace the parts.

Predictive Disk Failure

38750111 - Predictive Disk Failure: Hardware Monitoring has determined that a disk is in predictive failed state.

Explanation

The system monitors the status of the hardware on an hourly basis to determine when hardware support is required on the appliance.

The on-board system tools detected that a disk is approaching failure or end of life. The slot or bay location of the failure is identified.

User Response

Schedule maintenance for the disk that is in a predictive failed state.

Scan Tool Failure

38750118 - A scan has been stopped unexpectedly, in some cases this may cause the scan to be stopped.

Explanation

The system cannot initialize a vulnerability scan and asset scan results cannot be imported from external scanners. If the scan tools stop unexpectedly, the system cannot communicate with an external scanner. The system tries the connection to the external scanner five times in 30-second intervals.

In rare cases, the discovery tools encounter an untested host or network configuration.
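
The retry behavior described above, five attempts at 30-second intervals, can be pictured as a simple loop. This is a minimal sketch and not the product's implementation. The function name and the connect callable are hypothetical, and the sketch assumes that no wait follows the final attempt.

```python
import time
from typing import Callable


def connect_with_retries(connect: Callable[[], bool],
                         attempts: int = 5,
                         interval_s: float = 30.0,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call connect up to attempts times, waiting interval_s between failures."""
    for attempt in range(1, attempts + 1):
        if connect():
            return True
        if attempt < attempts:
            sleep(interval_s)  # wait before the next attempt
    return False
```

With the default values, a scanner that never answers is abandoned after four waits of 30 seconds, which is about two minutes after the first attempt.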

User Response

Select one of the following options:

  • Review the configuration for external scanners in the deployment editor to ensure that the gateway IP address is correct.

  • Ensure that the external scanner can communicate through the configured IP address.

  • Ensure that the firewall rules for your DMZ are not blocking communication between your appliance and the assets you want to scan.

External Scan Gateway Failure

38750119 - An invalid/unknown gateway IP address has been supplied to the external hosted scanner, the scan has been stopped.

Explanation

When an external scanner is added, a gateway IP address is required. If the address that is configured for the scanner in the deployment editor is incorrect, the scanner cannot access your external network.

User Response

Select one of the following options:

  • Review the configuration for any external scanners that are configured in the deployment editor to ensure that the gateway IP address is correct.

  • Ensure that the external scanner can communicate through the configured IP address.

  • Ensure that the firewall rules for your DMZ are not blocking communication between your appliance and the assets you want to scan.

User Authentication Failed for Automatic Updates

38750127 - Automatic updates user authentication failed. A valid individual Juniper ID is required.

Explanation

Valid credentials are required to authorize automatic downloads from the update server.

User Response

To view the automatic update settings, on the Admin tab, click the Auto Update icon and select Change Settings > Advanced. Administrators can confirm that the user name and password in the Settings window are correct.

Aggregated Data Limit was Reached

38750130 - The aggregated data view could not be created due to an aggregated limit.

Explanation

The accumulator is a JSA process that counts and prepares events and flows in data accumulations to assist with searches, chart display, and report performance. The accumulator process aggregates data in pre-defined time spans to create aggregate data views. An aggregate data view is a data set that is used to draw a time series graph and create scheduled reports.

The Console is limited to 130 active aggregate data views.

The following user actions can create a new aggregate data view:

  • New reports.

  • New saved searches that use time series data.

When the aggregate data view limit is reached, the notification is generated. When users attempt to create reports or saved searches, the user interface notifies them that the system is at the limit.

User Response

To resolve this issue, administrators can review the active aggregate data views on the Admin tab in the Aggregated Data Management window. The aggregated data management feature provides information on the reports and searches that use each aggregate data view. The administrator can review the list of aggregate data views to determine what data is most important to the users. Aggregate data views can be disabled to allow users to create a new rule, report, or saved search that requires an aggregate data view.

If the administrator decides to delete an aggregate data view, a summary provides an outline of the searches, rules, or reports affected. To re-create a deleted aggregate data view, the administrator needs only to re-enable or re-create the search or report. The system automatically creates the aggregate data view based on the data required.

Magistrate is Unable to Persist Offense Updates

38750147 - Magistrate encountered serious errors that may prevent offenses from being updated.

Explanation

The system detected an exception when it wrote offense updates to the database.

Events are processed and stored, but they might not contribute to offenses.

User Response

Conduct a soft clean of the SIM data model with Deactivate offenses unchecked.

  1. Click the Admin tab.

  2. On the toolbar, click Advanced > Clean SIM Model.

  3. Click Soft Clean to set the offenses to inactive.

  4. Ensure that Deactivate offenses is not checked.

  5. Click the Are you sure you want to reset the data model? check box and click Proceed.

When you clean the SIM model, all existing offenses are closed. Cleaning the SIM model does not affect existing events and flows.