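Wired Actions
Use the Actions dashboard to resolve issues affecting switches.
When you click the Wired button on the Actions dashboard, you see a list of all available actions. You can then click an action to investigate further. The available actions are described later in this topic.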

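Your subscriptions determine the actions that you can see on the Actions dashboard. For more information, see Subscription Requirements for Marvis Actions.
Missing VLAN
The Missing VLAN action indicates that a VLAN is configured on an access point (AP) but not on the switch port. As a result, clients cannot communicate on that VLAN and cannot get an IP address from the DHCP server. Marvis compares the VLANs on the AP traffic with the VLANs on the switch port traffic and determines which device is missing the VLAN configuration.
The switch can be a Juniper Networks EX Series or QFX Series switch, or a third-party switch.
In the following example, Marvis identifies two APs that do not see any incoming traffic because of a missing VLAN configuration. Marvis also identifies the specific switch that is missing the VLAN configuration and provides the port information, so that you can easily mitigate the issue.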

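When you see a Missing VLAN action, you can go to the Client Events section on the AP Insights page and check whether there are failures on the VLAN reported in the Missing VLAN action. You can verify whether all clients connecting on that VLAN experience DHCP failures.
If you need more information, you can also use the left menu to go to the Switches page. There, click a switch to view information for each port, including VLANs.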
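The comparison behind the Missing VLAN action can be pictured as a set difference between the VLANs an AP uses and the VLANs its switch port carries. The following is a minimal sketch under that assumption, not Marvis's actual implementation; the function name, device names, port names, and VLAN IDs are all hypothetical.

```python
from typing import Dict, Set


def find_missing_vlans(
    ap_vlans: Dict[str, Set[int]],
    switch_port_vlans: Dict[str, Set[int]],
    ap_to_port: Dict[str, str],
) -> Dict[str, Set[int]]:
    """Return, per AP, the VLANs the AP uses that its switch port does not carry."""
    missing: Dict[str, Set[int]] = {}
    for ap, vlans in ap_vlans.items():
        port = ap_to_port.get(ap)
        if port is None:
            continue  # uplink port unknown, so there is nothing to compare
        gap = vlans - switch_port_vlans.get(port, set())
        if gap:
            missing[ap] = gap
    return missing


# Hypothetical inventory: every name and VLAN ID below is made up for illustration.
ap_vlans = {"ap-lobby": {10, 20, 30}, "ap-lab": {10, 20}}
switch_port_vlans = {"sw1:ge-0/0/5": {10, 20}, "sw1:ge-0/0/6": {10, 20}}
ap_to_port = {"ap-lobby": "sw1:ge-0/0/5", "ap-lab": "sw1:ge-0/0/6"}

missing = find_missing_vlans(ap_vlans, switch_port_vlans, ap_to_port)
```

In this made-up inventory, only ap-lobby is reported, because VLAN 30 is used on the AP but absent from its switch port.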

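After you fix the issue in your network, Mist AI monitors the switch for a period of time to ensure that the missing VLAN issue is indeed resolved. As a result, the Missing VLAN action can take up to 30 minutes to auto-resolve.
For more information about the Missing VLAN action, watch the following video: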
Missing VLANs is a two-decade-old networking problem. It sounds so simple, but in a large enterprise it can become the ghost in the machine, as users complain their calls always drop in a certain area and conventional wisdom is, well, there must be interference or Wi-Fi issues over there. In many cases when Mist support helped troubleshoot, we found a user VLAN was indeed not provisioned on the network switch.
Hence, the user had no place to roam and the call dropped. For customers with tens of thousands of APs, this truly becomes the needle in the haystack problem. At Mist, we wanted to use AI to solve this problem, but first let's take a look at how you might start out today.
You can manually take a look, but I only have two VLANs. Or you can programmatically take a look, but this makes my brain hurt. If an AP is connected to a switch port, but the user can't get an IP address or pass any traffic, then the VLAN probably isn't configured on the port or it's black holed.
The traditional way to measure a missing VLAN is to monitor traffic on the VLAN and if one VLAN continuously lacks traffic, then there's a high chance that the VLAN is missing on the switch port. The problem of this approach is false positives. Here you can see during a 24-hour window, we detected more than 33,000 APs missing one or more VLANs because they had little or no traffic, but this was not accurate as we learned that every VLAN is not created equal.
There are at least two types of special purpose VLANs that can cause detection problems. One is the black hole VLAN. Folks can create a black hole VLAN on all unconfigured ports or as a quarantine VLAN for users until they are fully authorized. This VLAN is supposed to be provisioned on the switch in case a quarantined user shows up on the AP. The second example is the over-provisioned VLAN. Larger customers use special VLANs for special sites.
For example, legacy devices might only be present at certain sites, so special VLAN should only be applicable to those sites, but because people do use automation, they want to keep their configurations consistent so they provision that VLAN across all the sites. In this case, you would expect low traffic or no traffic. Those VLANs shouldn't be flagged as missing because they were intentionally over-provisioned.
So the key for reducing false positives is to really identify the purpose of each VLAN. We could ask the customer for their own internal list, perhaps in the form of a spreadsheet, but that's very error prone. Mist developed an unsupervised machine learning model to automatically discover the purpose of each VLAN by learning from the traffic patterns on the VLANs.
In this graph, each dot represents all of the VLANs across the Mist customer base. So for each VLAN, we collect several features. How many APs lack traffic on that VLAN? How many sites lack traffic? How busy is that VLAN minute by minute from all the APs? Then we use another technique called principal component analysis to combine all of these features and map them into this two-dimensional space.
The interesting thing here is the different VLAN types, high traffic, low traffic, black hole, and over-provisioned are separated really well, even across different customers, because it turns out VLAN behavior is very similar across different customers. The beauty of this is instead of developing per customer anomaly detection tools, we actually built one model for everybody. So for any new customers, we don't have to ask them anything.
We can determine the purpose of their VLANs very quickly after they deploy. This is really the power of this multi-tenant infrastructure design. Every customer can benefit from the knowledge learned from our extended customer base.
By precisely identifying each VLAN's purpose, we reduced our initial detection rate from 33,000 plus to specifically 607 VLANs, which we believed were actually missing from the AP switch ports. For Mist, this was the moment of truth. When we were confident in the model, we contacted the customers with these 607 detected missing VLANs, and when we finally heard back, we had an astonishing 100% hit rate, no false positives.
For Mist, this was simply awesome, as there are so many mundane problems we can apply this technique to going forward. So right now, this is shown in Marvis Actions, and with a supported Juniper switch, we can provide the user specific CLI commands that we suggest they add to their config to get these missing VLANs going, with a goal of automatically doing this from the cloud as we gain their trust. And for non-Juniper switches, we give detailed info like which switch, which port, and which VLAN ID to guide them how to solve the problem that they probably didn't even know they had.
This is all built on open protocols like OpenConfig and NETCONF. And a lesson learned by the Mist data science team: AI solutions should first start by solving real problems, rather than deploying models and hoping for the best. Some AI vendors treat AI as a hammer in search of a nail, and this isn't going to work.
The Marvis AI engine was designed starting with human expertise and then learning over time. At Mist, each support ticket is first run through Marvis to both measure its efficacy and continue to train the model to solve the most important customer issues.
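Negotiation Incomplete
The Negotiation Incomplete action detects instances of autonegotiation failure on switch ports. This issue can occur when Marvis detects a duplex mismatch between devices because autonegotiation failed to set the correct duplex mode. Marvis provides details about the affected port. You can check the configuration on the port and on the connected device to resolve the issue.
The following example shows the details of the Negotiation Incomplete action. Notice that Marvis lists the switch and the port on which autonegotiation failed.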

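After you fix the issue in your network, the Negotiation Incomplete action auto-resolves within an hour.
MTU Mismatch
Marvis detects MTU mismatches between a port on a switch and the port on the device connected directly to that switch port. All devices on the same Layer 2 (L2) network must have the same MTU size. When an MTU mismatch occurs, devices might fragment packets, resulting in network overhead.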
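To make the fragmentation overhead concrete, the following sketch counts how many IPv4 fragments a packet needs when it must cross a link with a smaller MTU. It assumes a 20-byte IPv4 header with no options and relies on the IPv4 rule that every fragment except the last carries a payload that is a multiple of 8 bytes. The function names are hypothetical and the packet sizes are chosen only for illustration.

```python
import math

IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options


def ipv4_fragment_count(packet_bytes: int, mtu: int) -> int:
    """Number of IPv4 packets on the wire after fragmenting to fit the MTU."""
    if packet_bytes <= mtu:
        return 1  # fits as-is, no fragmentation needed
    payload = packet_bytes - IPV4_HEADER_BYTES
    # Payload of each non-final fragment must be a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER_BYTES) // 8 * 8
    return math.ceil(payload / per_fragment)


def fragmentation_overhead_bytes(packet_bytes: int, mtu: int) -> int:
    """Extra header bytes added because each additional fragment repeats the header."""
    return (ipv4_fragment_count(packet_bytes, mtu) - 1) * IPV4_HEADER_BYTES


fragments = ipv4_fragment_count(9000, 1500)
overhead = fragmentation_overhead_bytes(9000, 1500)
```

Under these assumptions, a 9000-byte packet sent toward a 1500-byte MTU becomes 7 fragments and carries 120 extra header bytes, in addition to the processing cost of fragmenting and reassembling.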
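To resolve the issue, review the port configuration on the switch and on the connected device. Here's an example of an MTU mismatch that Marvis identified. The Details column lists the port on which the mismatch occurred.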

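Loop Detected
The Loop Detected action indicates a loop in your network that causes a switch to receive the same packet that it sent out. A loop occurs when multiple links exist between devices. Redundant links are a common cause of L2 loops. A redundant link serves as a backup for the primary link. If both links are active at the same time and a protocol such as Spanning Tree Protocol (STP) is not deployed correctly, a switching loop occurs.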
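The idea that a redundant link closes a loop can be sketched with a union-find pass over the list of active links: any link whose two ends are already connected through earlier links would create a loop unless a protocol such as STP blocks it. This is an illustrative sketch only, not how Marvis locates loops; the function name and switch names are hypothetical, and links bundled in a port channel would count as one logical link.

```python
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str]


def find_loop_links(links: Iterable[Link]) -> List[Link]:
    """Return the links that close a loop, given active L2 links between devices."""
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving keeps trees shallow
            node = parent[node]
        return node

    redundant: List[Link] = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            redundant.append((a, b))  # both ends already connected: this link forms a loop
        else:
            parent[root_a] = root_b
    return redundant


# Hypothetical topology: three switches cabled in a triangle.
triangle = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw1")]
loop_links = find_loop_links(triangle)
```

In the triangle, the third link is the one reported, because sw3 and sw1 are already reachable through sw2.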
Marvis identifies the exact location in your site where the traffic loop occurs and shows you the affected switches. Here's an example:

Switching loops are listed under Switch Events on the Switch Insights page. In the following example, you can see the STP topology changes listed.

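Network Port Flap
The Network Port Flap action identifies trunk ports that bounce persistently for at least an hour (for example, three flaps per minute for an hour). Ports configured as trunk ports connect to other switches, gateways, or APs, either as a single trunk port or as part of a port channel. Port flaps can occur because of unidirectional traffic or LACPDU exchange caused by a faulty cable or transceiver, or because an end device connected to the port restarts continuously. The following example shows the details that Marvis provides for the Network Port Flap action:
You can view the port up and port down events under Switch Events on the Switch Insights page. Marvis does not list slow port flaps as actions unless the flap frequency increases. Marvis continuously monitors slow port flaps to determine the severity of the issue. If the flapping becomes excessive, Marvis lists it as an action after considering the frequency and severity. You can use the conversational assistant to view details about slow port flaps.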
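The "three flaps per minute for an hour" example can be expressed as a check over flap timestamps. The sketch below treats those numbers as thresholds purely for illustration; Marvis weighs frequency and severity in ways this topic does not spell out, and the function name and parameters are hypothetical.

```python
from collections import Counter
from typing import Iterable


def is_persistent_flap(
    flap_times_s: Iterable[float],
    window_start_s: float,
    window_minutes: int = 60,
    min_flaps_per_minute: int = 3,
) -> bool:
    """True if every minute in the window saw at least the minimum number of flaps."""
    window_end_s = window_start_s + window_minutes * 60
    per_minute: Counter = Counter()
    for t in flap_times_s:
        if window_start_s <= t < window_end_s:
            per_minute[int((t - window_start_s) // 60)] += 1
    # Counter returns 0 for minutes with no recorded flaps.
    return all(per_minute[m] >= min_flaps_per_minute for m in range(window_minutes))


# Hypothetical event streams, in seconds since the start of the window.
every_20_seconds = range(0, 3600, 20)      # 3 flaps in every minute of the hour
every_5_minutes = range(0, 3600, 300)      # a slow flap

persistent = is_persistent_flap(every_20_seconds, 0)
slow = is_persistent_flap(every_5_minutes, 0)
```

With these inputs, the port that flaps every 20 seconds meets the example criterion, while the port that flaps once every 5 minutes does not.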
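For information about access port flaps, see Access Port Flap.
You can disable a persistently flapping port directly from the Marvis Actions page. In the Network Port Flap action section, select the switch on which you want to disable the port, and then click the Disable Port button.
The Disable Port page appears, listing the ports that you can disable. You cannot select a port that is already disabled (either previously through the Actions page or manually from the Switch Details page).
When you disable a port, the port configuration on the selected port changes to disabled and the port goes down. After you resolve the issue, you can re-enable these ports by editing the port configuration on the Switch Details page. After you re-enable a port, you can reconnect devices to it.
After you fix the issue in your network, the Port Flap action auto-resolves within an hour.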
Looking at the switch, in this case, specifically the Juniper switch, we've introduced the action of a port flapping continuously. In this case, we do take into account a simple port down and up, which usually happens when a device connects, and this is currently reflecting a case where the port is continuously flapping, thereby not only causing a poor experience for the device which is connected on the other end, but also having high resource consumption for the switch which can be detrimental to other devices connected on the switch. Here too, we show all the required information in terms of the port, the client which is connected, and the VLAN, if in case it did communicate and we know the VLAN ID.
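High CPU
Marvis detects switches that have consistently high CPU utilization (> 90%). Several factors can cause high CPU utilization: multicast traffic, network loops, hardware issues, device temperature, and so on. The High CPU action lists the switch, the processes running on the switch along with their CPU utilization, and the reason for the high utilization. In the following example, you see that the fxpc process has high CPU utilization and that the reason is the use of uncertified optics on the switch: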

If you see a High CPU action, you can go to the switch's Insights page and analyze the CPU utilization chart under Switch Charts. Here's an example:

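Port Stuck
The Port Stuck action detects differences in traffic patterns on a switch access port (such as no packets transmitted or received), which indicate that the client connected to the port is not operating normally. In the following example, you see that Marvis Actions recommends that you bounce the port and verify that the client starts operating normally. Notice that in addition to the port number, Marvis lists the client connected to the port (a camera in this case) and the associated VLAN.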

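When Marvis detects a port stuck issue, it initiates an automatic port bounce to fix the issue. If the automatic port bounce does not resolve the issue, Marvis lists it as an action. You can view the automatic bounce action under Switch Events on the Switch Insights page, as shown in the following example. The graph on the right shows the traffic before and after the port bounce. Before the port bounce, you see only Tx traffic (shown in green). After the port bounce, you also see Rx traffic.
By default, the self-driving capability for the Port Stuck action is enabled. For information about the self-driving capability, see Self-Driving Marvis Actions.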
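The flow described above (detect a one-directional traffic pattern, bounce the port, and list an action only if the bounce does not help) can be sketched as follows. The stuck-port rule used here, traffic moving in exactly one direction between two counter samples, is an assumption made for illustration and is not Marvis's documented detection logic. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PortSample:
    """Cumulative packet counters read from an access port (hypothetical structure)."""
    tx_packets: int
    rx_packets: int


def looks_stuck(previous: PortSample, current: PortSample) -> bool:
    """Assumed rule: traffic flows in exactly one direction between two samples."""
    tx_active = current.tx_packets > previous.tx_packets
    rx_active = current.rx_packets > previous.rx_packets
    return tx_active != rx_active


def next_step(stuck_before_bounce: bool, stuck_after_bounce: bool) -> str:
    """Mirror the documented flow: bounce first, list an action only if that fails."""
    if not stuck_before_bounce:
        return "no action"
    if not stuck_after_bounce:
        return "resolved by automatic bounce"
    return "list Port Stuck action"


# Hypothetical counters: the switch keeps transmitting, but the client sends nothing back.
before = looks_stuck(PortSample(1000, 500), PortSample(1400, 500))
after = looks_stuck(PortSample(1400, 500), PortSample(1800, 900))
outcome = next_step(before, after)
```

In this made-up case the port shows Tx-only traffic before the bounce and traffic in both directions afterward, so no action is listed.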

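Traffic Anomaly
Marvis detects an unusual drop or increase in broadcast and multicast traffic on a switch. It also detects unusually high transmit or receive errors. As with the Anomaly Detection view for connectivity failures, the Details view displays a timeline, a description of the anomaly, and details of the affected ports. If the issue affects an entire site, Marvis displays details of the affected switches along with the port details for each affected switch.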

Marvis, our AI-powered virtual network assistant, employs an actions framework to automatically identify network problems and anomalies that are likely impacting user experience. This helps you to significantly reduce mean time to resolution. Marvis can detect switched traffic anomalies, such as traffic storms or abnormally high Tx/Rx counts, with respect to broadcast, unknown unicast, or multicast traffic.
It uses our third generation of algorithms, including long short-term memory, or LSTM for short, to boost efficacy and eliminate false positives. Visit the link below to learn more.
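The video mentions LSTM-based models. As a much simpler stand-in that only illustrates the idea of flagging an unusual drop or increase, the sketch below scores each sample against a rolling baseline using a z-score. It is not the Marvis algorithm; the function name, window length, threshold, and traffic values are all assumptions.

```python
from statistics import mean, stdev
from typing import List, Sequence


def find_anomalies(
    samples: Sequence[float],
    baseline_len: int = 12,
    threshold: float = 3.0,
) -> List[int]:
    """Return indexes whose value deviates strongly from the preceding baseline window."""
    anomalies: List[int] = []
    for i in range(baseline_len, len(samples)):
        baseline = samples[i - baseline_len:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any change at all is treated as unusual.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical broadcast packets-per-second readings ending in a sudden storm.
broadcast_pps = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 100, 100, 900]
storm_indexes = find_anomalies(broadcast_pps)
```

A plain z-score reacts to both spikes and drops, but unlike a sequence model it has no notion of daily or weekly traffic patterns, which is one reason it would raise more false positives on real traffic.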
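Misconfigured Port
When one switch is connected to another switch, the ports need common attributes in order to communicate. To detect misconfigurations, Marvis compares the following attributes of the uplink ports:
- Speed
- Duplex
- Native VLAN
- Allowed VLANs
- MTU
- Port mode (both ports "access" or both ports "trunk")
- STP mode (both ports "forwarding")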
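The attribute comparison in the list above can be sketched as a field-by-field check between the two ends of an uplink. This is an illustrative sketch, not Marvis's implementation; the class, field names, and values are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class UplinkPort:
    """Attributes of one end of a switch-to-switch link (hypothetical structure)."""
    speed_mbps: int
    duplex: str
    native_vlan: int
    allowed_vlans: FrozenSet[int]
    mtu: int
    port_mode: str   # "access" or "trunk"
    stp_state: str   # for example "forwarding"


# Attributes that must be identical on both ends.
MATCHED_FIELDS = ("speed_mbps", "duplex", "native_vlan", "allowed_vlans", "mtu", "port_mode")


def find_mismatches(local: UplinkPort, remote: UplinkPort) -> List[str]:
    """Return the names of attributes that would be flagged for this link."""
    mismatches = [f for f in MATCHED_FIELDS if getattr(local, f) != getattr(remote, f)]
    # STP is checked differently: both ends are expected to be forwarding.
    if local.stp_state != "forwarding" or remote.stp_state != "forwarding":
        mismatches.append("stp_state")
    return mismatches


# Hypothetical link where the native VLAN and MTU differ between the two ends.
local = UplinkPort(10000, "full", 1, frozenset({10, 20}), 9216, "trunk", "forwarding")
remote = UplinkPort(10000, "full", 99, frozenset({10, 20}), 1514, "trunk", "forwarding")
issues = find_mismatches(local, remote)
```

For this made-up pair of ports, the check reports the native VLAN and the MTU and leaves the matching attributes out.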
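On the Actions dashboard, click Switch > Misconfigured Port to view the issue and the recommended action in the lower part of the screen.
Click the View More link to see the MAC address and port.
Switch Offline
Marvis detects switches that are disconnected from the Juniper Mist cloud. A switch can go offline for several reasons, including:
- Power issues
- Cable faults
- Required firewall ports not open
- Incorrect configuration
When a switch goes offline, Marvis monitors the switch to check how long it stays offline. If the switch is offline for more than 3 minutes, Marvis generates the Switch Offline action. Note that the Switch Offline infrastructure alert and the event on the Switch Insights page appear as soon as the switch goes offline.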
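The timing distinction above (the alert appears immediately, the action only after more than 3 minutes offline) can be sketched as a small classification function. The 3-minute threshold comes from this topic; the function name and return labels are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

OFFLINE_ACTION_THRESHOLD = timedelta(minutes=3)  # threshold stated in this topic


def classify_switch(offline_since: Optional[datetime], now: datetime) -> str:
    """Classify a switch by how long it has been disconnected from the cloud."""
    if offline_since is None:
        return "online"
    if now - offline_since > OFFLINE_ACTION_THRESHOLD:
        return "alert and Switch Offline action"
    return "alert only"


# Hypothetical timestamps for illustration.
now = datetime(2024, 1, 1, 12, 0, 0)
just_dropped = classify_switch(now - timedelta(minutes=1), now)
long_outage = classify_switch(now - timedelta(minutes=10), now)
```

A switch that dropped one minute ago produces only the alert, while one that has been offline for ten minutes also produces the action.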
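Here's an example that shows the Switch Offline Marvis action. Click the View More link to view the details of the offline switch. If you click the switch name, you go to the Insights page, where you can see the events listed under Switch Events.
To troubleshoot a switch that is offline, see Troubleshoot Switch Connectivity.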