Marvis Actions: An Insight into Backend Operations

Take a closer look at the factors that Marvis uses to identify key issues and to categorize these issues as Marvis actions.

Marvis uses data from statistics and events to identify user-impacting issues with wired, WAN, and wireless connectivity, both before and after a connection is established.

Glossary of Terms

Term: Definition
Model input feature: The inputs or features that the model consumes to determine whether the condition for generating the specific action is met.
Trigger conditions: The conditions that trigger the model to create Marvis actions.
Validation time: The time that Marvis takes to mark an open Marvis action as resolved, either because a user fixed the issue or because the symptoms that led to the Marvis action are no longer observed.
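
As a rough illustration of how a validation time could be applied, the following Python sketch marks an action as resolved once its symptoms have been absent for the full validation window. The function name `is_resolved` and the symptom-free-window logic are assumptions made for illustration, not a description of Marvis internals. The two durations are taken from the tables in this document.

```python
from datetime import datetime, timedelta

# Validation times copied from the tables in this document.
VALIDATION_TIME = {
    "Bad Cable": timedelta(days=7),
    "Authentication Failure": timedelta(days=1),
}

def is_resolved(action, last_symptom_seen, now):
    """Hypothetical check: True once no symptom has been seen for the
    action's full validation time."""
    return now - last_symptom_seen >= VALIDATION_TIME[action]

now = datetime(2024, 1, 8, 12, 0)
two_days_ago = now - timedelta(days=2)
print(is_resolved("Authentication Failure", two_days_ago, now))  # True
print(is_resolved("Bad Cable", two_days_ago, now))               # False
```

The same two-day quiet period resolves the one-day action but not the seven-day one, which is why actions with long validation times can stay open well after a fix.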

Layer 1 Actions

Marvis Action Model Input Feature Trigger Conditions Validation Time
Bad Cable AP, switch, or WAN Edge statistics and events

Speed changes, errors reported on ports, and frequent disconnections and restarts over the monitored period.

7 days
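
One of the Bad Cable triggers, speed changes on a port, can be pictured with a short Python sketch. It counts how often the negotiated link speed differs between consecutive samples. The function name, the sample format, and the idea of counting transitions are assumptions for illustration only; the document does not describe how Marvis derives this feature.

```python
def count_speed_changes(speeds_mbps):
    """Count transitions between consecutive link-speed samples.

    speeds_mbps is a hypothetical list of negotiated speeds, oldest first.
    """
    return sum(
        1 for prev, cur in zip(speeds_mbps, speeds_mbps[1:]) if prev != cur
    )

# A port that drops from 1000 Mbps to 100 Mbps and recovers changes twice.
print(count_speed_changes([1000, 1000, 100, 1000, 1000]))  # 2
```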

Connectivity Actions

Marvis Action Model Input Feature Trigger Conditions Validation Time
Authentication Failure Wired and Wireless clients

Deviations from the predicted baseline. The long short-term memory (LSTM)-based model baselines authentication success or failure events across the site.

The model weighs the severity of the issue when generating this Marvis action. The greater the severity and the deviation from the baseline, the more confident the model is in generating the action within the observed time duration.

1 day
DHCP Failure Wired and Wireless clients

Deviations from the predicted baseline. The LSTM-based model baselines Dynamic Host Configuration Protocol (DHCP) success or failure events across the site.

The model weighs the severity of the issue when generating this Marvis action. The greater the severity and the deviation from the baseline, the more confident the model is in generating the action within the observed time duration.

1 day
ARP Failure Wired and Wireless clients

Deviations from the predicted baseline. The LSTM-based model baselines Address Resolution Protocol (ARP) success or failure events across the site.

The model weighs the severity of the issue when generating this Marvis action. The greater the severity and the deviation from the baseline, the more confident the model is in generating the action within the observed time duration.

1 day
DNS Failure Wired and Wireless clients

Deviations from the predicted baseline. The LSTM-based model baselines Domain Name System (DNS) success or failure events across the site.

The model weighs the severity of the issue when generating this Marvis action. The greater the severity and the deviation from the baseline, the more confident the model is in generating the action within the observed time duration.

1 day
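
The connectivity actions above all follow the same pattern: compare observed failures with a predicted baseline, and become more confident as the deviation grows. The Python sketch below illustrates that pattern only. It replaces the LSTM prediction with a plain historical mean as a stand-in, and the function names, the z-score, and the `saturation` parameter are assumptions, not the model that Marvis uses.

```python
from statistics import mean, pstdev

def deviation_score(history, observed):
    """Deviation of the observed failure count from a baseline, in
    standard deviations. The mean is a stand-in for a model prediction."""
    baseline = mean(history)
    spread = pstdev(history) or 1.0  # avoid dividing by zero on flat history
    return (observed - baseline) / spread

def confidence(score, saturation=6.0):
    """Map a deviation score to a 0..1 confidence (hypothetical scaling)."""
    return max(0.0, min(1.0, score / saturation))

history = [10, 12, 11, 9, 10, 8]   # failures per interval, site-wide
minor = confidence(deviation_score(history, 12))
major = confidence(deviation_score(history, 30))
print(minor < major)  # True
```

A larger deviation yields a higher confidence, mirroring the statement that severe deviations lead the model to generate the action with more confidence.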

AP Actions

Marvis Action Model Input Feature Trigger Conditions Validation Time
Offline AP statistics

One AP or multiple APs are locally up or down (loss of cloud connectivity only).

The model correlates data to identify why the AP is down, that is, whether the issue is due to a switch, site, region, or ISP outage.

If you want to be notified immediately or within a few minutes of the device going down, configure infrastructure alerts for device up or down events and specify a threshold.

15 minutes
Health Check Failed AP statistics

AP or radios remain repeatedly inoperable after autorecovery.

30 days
Non-Compliant AP statistics

The firmware version on one or more APs differs from the version specified in the version compliance settings configured under site settings.

30 minutes
Coverage Hole AP and client statistics

Anomaly in the SLE baseline caused by repeated low RSSI reported by all clients associated with an AP or multiple APs in a high-impact area.

The model considers the recurrence of the issue and fringe pattern awareness in the case of outdoor APs or APs located at the building entry or exit.

The model considers the strength of the anomaly to generate the Marvis action to indicate a user-impacting coverage-hole issue. If the anomaly index is strong, the model generates the action faster than when the anomaly index is weak. The model examines multiple batches of data to identify APs for coverage-hole issues.

7 days
Insufficient Capacity AP and client statistics

Anomaly in the baseline caused by APs with repeated and prolonged capacity constraints that are not seasonal in nature.

The model factors in the anomaly strength when generating the Marvis action that indicates a user-impacting capacity issue. If the anomaly index is strong, the model generates the action faster than when the anomaly index is weak. The model examines multiple batches of data to identify APs with capacity issues.

7 days
AP Loop Detected AP events

Reflection events on an AP triggered by network loops that result from misconfiguration.

A reflection event occurs when an AP receives a packet that it sent, on the same or a different VLAN.

Reflection events are generated almost immediately under site events, enabling you to monitor these events for raw statistics-based tracking.

30 minutes
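
The reflection check described for AP Loop Detected can be sketched in Python as follows. The class remembers recently sent frames and reports whether a received frame is one of them, and on which VLAN. The class name, the frame identifier, and the fixed-size window are assumptions for illustration; the document does not specify how APs detect reflections.

```python
from collections import deque

class ReflectionDetector:
    """Hypothetical tracker of recently sent frames on one AP."""

    def __init__(self, window=1024):
        self._window = window
        self._order = deque()   # frame ids, oldest first
        self._sent = {}         # frame id -> VLAN it was sent on

    def record_sent(self, frame_id, vlan):
        if frame_id not in self._sent:
            self._order.append(frame_id)
        self._sent[frame_id] = vlan
        if len(self._order) > self._window:
            del self._sent[self._order.popleft()]  # forget the oldest frame

    def check_received(self, frame_id, vlan):
        """Return None for normal traffic, else the kind of reflection."""
        sent_vlan = self._sent.get(frame_id)
        if sent_vlan is None:
            return None
        return "same-vlan" if sent_vlan == vlan else "different-vlan"

detector = ReflectionDetector()
detector.record_sent("frame-1", vlan=10)
print(detector.check_received("frame-1", vlan=20))  # different-vlan
```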

Switch Actions

Marvis Action Model Input Feature Trigger Conditions Validation Time
Missing VLAN AP port statistics

Uplink port statistics reported by an AP indicate that a VLAN is missing.

This action correlates data from two or more APs to determine whether an active VLAN used by clients is missing on the AP port. This correlation helps prevent generation of the Missing VLAN action if a VLAN is unused by any client across the entire site.

30 minutes
Negotiation Incomplete Individual switch port statistics

Autonegotiation failure reported on the switch ports.

Up to 60 minutes
MTU Mismatch Individual switch port statistics

MTU mismatch between any switch port and connected devices. The reported statistics indicate errors on the port.

The model considers the severity and time to generate the Marvis action. The greater the MTU mismatch, the greater the severity, resulting in faster generation of the Marvis action.

1 day
Loop Detected Switch port events

An intentionally or unintentionally introduced loop in the topology resulting in rapid and repeated Spanning Tree Protocol (STP) topology changes.

The model uses STP topology change events as an input feature and considers the severity and time. The higher the frequency of STP topology changes in each period, the faster the detection.

Alternatively, a loop causing events at a slower pace for a longer duration also triggers the Marvis action.

30 minutes
Network Port Flap Switch port events (trunk ports only)

Consistent port bounce on a port configured as a trunk port.

The model considers the frequency and time. The higher the frequency of port flaps, the higher the severity of the issue. For slow port flaps that occur for a longer duration, the model detects the port flaps within a couple of hours or a few days.

30 minutes
High CPU Switch chassis statistics

Average CPU utilization consistently greater than 90% for the monitored duration.

The model considers the frequency and duration of the issue. Statistics that show high average CPU utilization for every sample in the monitored dataset indicate a severe user-impacting issue. The model generates the Marvis action quickly for such an issue.

30 minutes
Port Stuck Switch port statistics

Sudden deviation in traffic patterns for end devices on access ports.

The model does not generate false positives for recurring seasonal traffic patterns. It also considers traffic patterns across similar endpoints for inference.

This Marvis action is self-driving. When a port stuck issue is detected, the port is automatically bounced to restore the endpoint. The model generates the action when the automatic port bounce fails to bring the endpoint back into operation or if the model detects the port stuck issue multiple times.

30 minutes
Traffic Anomaly Switch port statistics

Any deviation in broadcast and multicast frame counters from the predicted traffic patterns.

The model baselines traffic patterns on each switch or switch port every couple of days. This action uses the long short-term memory (LSTM)-based model.

The model generates this Marvis action based on the severity of the issue. For strong deviations that last for the entire monitored duration, the model generates the action quickly. The model might take longer to generate actions for minor, longer-lasting deviations.

1 day
Misconfigured Port Uplink switch port statistics

MTU, VLAN, mode, or duplex mismatches between identified uplink ports.

The model identifies discrepancies on the switch-to-switch connections at the edge.

60 minutes
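
The Misconfigured Port comparison of MTU, VLAN, mode, and duplex settings between two uplink ports can be sketched in Python as a field-by-field check. The `PortConfig` type, its field names, and the sample values are hypothetical; the sketch only shows the comparison idea, not how Marvis collects or evaluates port data.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class PortConfig:
    """Hypothetical view of the settings compared across an uplink."""
    mtu: int
    vlans: frozenset
    mode: str      # for example "trunk" or "access"
    duplex: str    # for example "full" or "half"

def mismatches(local, remote):
    """Return the names of settings that differ between the two ends."""
    return [
        f.name
        for f in fields(PortConfig)
        if getattr(local, f.name) != getattr(remote, f.name)
    ]

local = PortConfig(1500, frozenset({10, 20}), "trunk", "full")
remote = PortConfig(9000, frozenset({10, 20}), "trunk", "half")
print(mismatches(local, remote))  # ['mtu', 'duplex']
```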

WAN Edge Actions

Marvis Action Model Input Feature Trigger Conditions Validation Time
MTU Mismatch WAN Edge statistics

MTU mismatch between a WAN Edge port and connected devices. The model examines the reported statistics that indicate certain errors on the port.

The model considers the severity and time to generate this Marvis action. The greater the MTU mismatch, the greater the severity, and the action is generated within a specific time duration.

30 minutes
Bad WAN Uplink Uplink ports on WAN Edges

High latency, packet drops, congestion, and network service failures such as ARP or DHCP reported in the WAN port statistics, indicating a change in the baseline behavior.

High-severity issues are listed sooner than low-severity issues.

1 day
VPN Path Down VPN tunnels or peer paths

Peer-path down issue in either of the following paths:

  • Paths originating from a spoke toward a specific hub

  • Paths terminating at a hub

If you need an alert for every port up or port down event, subscribe to the critical port monitoring alert for raw alerting.

High-severity issues are listed sooner than low-severity issues.

1 hour
Non-Compliant SRX Series Firewall

Difference in Junos OS version on the primary and backup partitions.

30 minutes

Other Marvis Actions

Marvis Action Model Input Feature Trigger Conditions Validation Time
Persistently Failing Clients Wired and Wireless clients

Clients continuously failing to authenticate and connect to the network. Persistent failures are observed continuously during the monitored time frame.

The trigger time depends on the site, specifically the number of clients and the number of correlated simultaneous failures.

60 minutes
Access Port Flap Access ports on a switch

Consistent port up or port down events for a port configured as an access port.

The model considers the frequency and duration of the issue. The higher the frequency of port flaps, the higher the severity of the issue. For slow port flaps that occur for a longer duration, the model detects the port flaps within a couple of hours or a few days.

30 minutes
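
The relationship between flap frequency and detection time described above can be illustrated with a small Python sketch. It accumulates flap counts per monitoring interval and reports the interval at which a threshold is crossed, so fast flapping triggers early while slow flapping triggers only after a longer period. The function name, the accumulation rule, and the threshold value are assumptions for illustration, not the Marvis model.

```python
def intervals_until_trigger(flaps_per_interval, threshold=20):
    """Return the 1-based interval at which accumulated flaps reach the
    hypothetical threshold, or None if they never do."""
    total = 0
    for index, flaps in enumerate(flaps_per_interval, start=1):
        total += flaps
        if total >= threshold:
            return index
    return None

fast = [10] * 10   # frequent flaps in every interval
slow = [1] * 50    # occasional flaps over a long period
print(intervals_until_trigger(fast))  # 2
print(intervals_until_trigger(slow))  # 20
```

Both patterns eventually trigger, but the frequent one does so ten times sooner in this example.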