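Switch Actions
Summary: Use the Actions dashboard to resolve issues that affect your switches.
When you click the Switch button on the Actions dashboard, you see a list of all available actions. You can then click an action to investigate further. The available actions are described later in this topic.
Your subscriptions determine which actions you can see on the Actions dashboard. For more information, see Subscription Requirements for Marvis Actions.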
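Missing VLAN
The Missing VLAN action indicates that a VLAN is configured on an access point (AP) but not on the switch port. As a result, clients cannot communicate on that VLAN and cannot obtain an IP address from the DHCP server. Marvis compares the VLANs seen in AP traffic with the VLANs seen in switch port traffic and determines which device is missing the VLAN configuration.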
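In the following example, Marvis identifies two APs that do not see any incoming traffic because of a missing VLAN configuration. Marvis also identifies the specific switch that is missing the VLAN configuration and provides the port information, so that you can easily mitigate the issue.
If you need more information, you can also go to the Switches page from the left menu. There, click a switch to view information for each port, including its VLANs.
After you fix the issue in your network, Mist AI monitors the switch for a period of time to ensure that the missing VLAN issue is indeed resolved. As a result, the Missing VLAN action can take up to 30 minutes to resolve automatically and appear in the Latest Updates section.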
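The comparison behind the Missing VLAN action can be pictured as a set difference between the VLANs an AP uses and the VLANs its switch port carries. The following is a minimal sketch under that assumption, not the Marvis implementation: Marvis infers VLANs from observed traffic, whereas this sketch takes plain lists of VLAN IDs, and the function name `find_missing_vlans` is hypothetical.

```python
def find_missing_vlans(ap_vlans, switch_port_vlans):
    """Return VLAN IDs the AP uses that the switch port does not carry."""
    return sorted(set(ap_vlans) - set(switch_port_vlans))


# Hypothetical data: the AP serves clients on VLANs 10, 20, and 30,
# but the uplink switch port only carries VLANs 10 and 20.
missing = find_missing_vlans(ap_vlans=[10, 20, 30], switch_port_vlans=[10, 20])
print(missing)  # [30]
```

An empty result means every VLAN the AP uses is also present on the switch port.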
For more information about the Missing VLAN action, watch the following video:
Missing VLANs is a two-decade-old networking problem. It sounds so simple, but in a large enterprise it can become the ghost in the machine, as users complain their calls always drop in a certain area and conventional wisdom is, well, there must be interference or Wi-Fi issues over there. In many cases when Mist support helped troubleshoot, we found a user VLAN was indeed not provisioned on the network switch.
Hence, the user had no place to roam and the call dropped. For customers with tens of thousands of APs, this truly becomes the needle in the haystack problem. At Mist, we wanted to use AI to solve this problem, but first let's take a look at how you might start out today.
You can manually take a look, but I only have two VLANs. Or you can programmatically take a look, but this makes my brain hurt. If an AP is connected to a switch port, but the user can't get an IP address or pass any traffic, then the VLAN probably isn't configured on the port or it's black holed.
The traditional way to detect a missing VLAN is to monitor traffic on the VLAN, and if one VLAN continuously lacks traffic, then there's a high chance that the VLAN is missing on the switch port. The problem with this approach is false positives. Here you can see during a 24-hour window, we detected more than 33,000 APs missing one or more VLANs because they had little or no traffic, but this was not accurate, as we learned that not every VLAN is created equal.
There are at least two types of special purpose VLANs that can cause detection problems. One is the black hole VLAN. Folks can create a black hole VLAN on all unconfigured ports or as a quarantine VLAN for users until they are fully authorized. This VLAN is supposed to be provisioned on the switch in case a quarantined user shows up on the AP. The second example is the over-provisioned VLAN. Larger customers use special VLANs for special sites.
For example, legacy devices might only be present at certain sites, so a special VLAN should only be applicable to those sites, but because people do use automation, they want to keep their configurations consistent, so they provision that VLAN across all the sites. In this case, you would expect low traffic or no traffic. Those VLANs shouldn't be flagged as missing because they were intentionally over-provisioned.
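To illustrate why the traffic-only approach over-reports, here is a minimal sketch. It is not the Mist model: the function name `flag_missing_vlans`, the byte counts, and the hand-written set of special-purpose VLANs are all hypothetical, and the set stands in for the VLAN purposes that the transcript says are learned automatically.

```python
def flag_missing_vlans(traffic_bytes, special_purpose=frozenset(), min_bytes=1):
    """Flag VLANs whose observed traffic is below min_bytes.

    traffic_bytes: dict mapping VLAN ID -> bytes seen on the AP uplink.
    special_purpose: VLAN IDs expected to be quiet (black hole or
        intentionally over-provisioned), which must not be flagged.
    """
    return sorted(
        vlan
        for vlan, seen in traffic_bytes.items()
        if seen < min_bytes and vlan not in special_purpose
    )


# Hypothetical observations for one AP over a monitoring window.
observed = {10: 52_000, 20: 0, 300: 0, 999: 0}

naive = flag_missing_vlans(observed)
refined = flag_missing_vlans(observed, special_purpose={300, 999})
print(naive)    # [20, 300, 999]
print(refined)  # [20]
```

Knowing that VLANs 300 and 999 are quiet by design removes two of the three detections, which is the false-positive reduction described here in miniature.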
So the key to reducing false positives is to really identify the purpose of each VLAN. We could ask the customer for their own internal list, perhaps in the form of a spreadsheet, but that's very error prone. Mist developed an unsupervised machine learning model to automatically discover the purpose of each VLAN by learning from the traffic patterns on the VLANs.
In this graph, each dot represents a VLAN from across the Mist customer base. So for each VLAN, we collect several features. How many APs lack traffic on that VLAN? How many sites lack traffic? How busy is that VLAN minute by minute from all the APs? Then we use another technique called principal component analysis to combine all of these features and map them into this two-dimensional space.
The interesting thing here is the different VLAN types, high traffic, low traffic, black hole, and over-provisioned are separated really well, even across different customers, because it turns out VLAN behavior is very similar across different customers. The beauty of this is instead of developing per customer anomaly detection tools, we actually built one model for everybody. So for any new customers, we don't have to ask them anything.
We can determine the purpose of their VLANs very quickly after they deploy. This is really the power of this multi-tenant infrastructure design. Every customer can benefit from the knowledge learned from our extended customer base.
By precisely identifying each VLAN's purpose, we reduced our initial detection count from 33,000 plus to specifically 607 VLANs, which we believed were actually missing from the AP switch ports. For Mist, this was the moment of truth. When we were confident in the model, we contacted the customers with these 607 detected missing VLANs, and when we finally heard back, we had an astonishing 100% hit rate, no false positives.
For Mist, this was simply awesome, as there are so many mundane problems we can apply this technique to going forward. So right now, this is shown in Marvis Actions, and with a supported Juniper switch, we can provide the user specific CLI commands that we suggest they add to their config to get these missing VLANs going, with a goal of doing this automatically from the cloud as we gain their trust. And for non-Juniper switches, we give detailed info like which switch, which port, and which VLAN ID to guide them in solving a problem that they probably didn't even know they had.
This is all built on open protocols like OpenConfig and NETCONF. And a lesson learned by the Mist data science team: AI solutions should first start by solving real problems, rather than deploying models and hoping for the best. Some AI vendors treat AI as a hammer in search of a nail, and this isn't going to work.
The Marvis AI engine was designed starting with human expertise and then learning over time. At Mist, each support ticket is first run through Marvis to both measure its efficacy and continue to train the model to solve the most important customer issues.
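Negotiation Incomplete
The Negotiation Incomplete action detects instances of autonegotiation failure on switch ports. Marvis raises this action when it detects a duplex mismatch between devices, which can occur when autonegotiation fails to set the correct duplex mode. Marvis provides details about the affected ports. To resolve the issue, check the configuration of the port and of the connected device.
The following example shows the details of the Negotiation Incomplete action. Note that Marvis lists the switches and ports on which autonegotiation failed.
After you fix the issue in your network, the Negotiation Incomplete action resolves automatically within an hour and appears in the Latest Updates section.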
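MTU Mismatch
Marvis detects an MTU mismatch between a port on a switch and the port on the device that is directly connected to that switch port. All devices on the same Layer 2 (L2) network must have the same MTU size. When an MTU mismatch occurs, devices might fragment packets, which results in network overhead. The Details column lists the ports on which the mismatch occurs.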
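As a rough illustration of the check described above, the following sketch compares the MTU configured on the two ends of each directly connected link. It assumes you already have both ends' settings in hand. The `LinkEnd` type, the device names, the port names, and the MTU values are hypothetical and do not come from Marvis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkEnd:
    device: str
    port: str
    mtu: int


def mtu_mismatches(links):
    """Return the links whose two ends are configured with different MTUs."""
    return [(a, b) for a, b in links if a.mtu != b.mtu]


# Hypothetical inventory of directly connected port pairs.
links = [
    (LinkEnd("switch-1", "ge-0/0/1", 1514), LinkEnd("switch-2", "ge-0/0/5", 1514)),
    (LinkEnd("switch-1", "ge-0/0/2", 9216), LinkEnd("switch-3", "ge-0/0/1", 1514)),
]

for a, b in mtu_mismatches(links):
    print(f"{a.device} {a.port} (MTU {a.mtu}) != {b.device} {b.port} (MTU {b.mtu})")
```

Only the second link is reported, because its two ends disagree on the MTU.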
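To resolve the issue, review the port configuration on the switch and on the connected device. The following is an example of an MTU mismatch that Marvis found.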
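Loop Detected
The Loop Detected action indicates that a loop exists in your network, which causes a switch to receive the same packets that it sent out. Loops occur when multiple links exist between devices. Redundant links are a common cause of L2 loops. A redundant link serves as a backup for the primary link. If both links are active at the same time and a protocol such as Spanning Tree Protocol (STP) is not deployed correctly, a switching loop occurs.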
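The relationship between redundant links and loops can be sketched with a union-find pass over the L2 topology: any link whose two endpoints are already connected through earlier links closes a cycle. This is an illustrative sketch only, not how Marvis locates loops, and the function name `find_redundant_links` and the switch names are hypothetical.

```python
def find_redundant_links(links):
    """Return links that close a cycle in an undirected L2 topology.

    A link whose endpoints are already connected through earlier links
    is redundant and would form a loop unless a protocol such as STP
    blocks it.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    redundant = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            redundant.append((a, b))
        else:
            parent[root_a] = root_b
    return redundant


# Hypothetical triangle of switches: the third link closes the loop.
topology = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw1")]
print(find_redundant_links(topology))  # [('sw3', 'sw1')]
```

Two parallel links between the same pair of switches are reported the same way, because the second link connects endpoints that are already joined.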
Marvis identifies the exact location in your site where the traffic loop occurs and shows you the affected switches. Here is an example:
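Network Port Flap
The Network Port Flap action identifies trunk ports that flap persistently for at least one hour, for example, three flaps per minute for an hour. Ports configured as trunk ports connect to other switches, gateways, or APs, either as a single trunk port or as part of a port channel. Port flapping can occur because of one-way traffic or LACPDU exchange caused by a damaged cable or transceiver, or because an end device connected to the port keeps rebooting. The following example shows the details that Marvis Actions provides for the Network Port Flap action: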

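You can disable a persistently flapping port directly from the Marvis Actions page. In the Network Port Flap action section, select the switch on which you want to disable ports, and then click the Disable Port button.
The Disable Port page appears and lists the ports that you can disable. You cannot select a port that is already disabled, whether it was disabled earlier through the Actions page or manually from the Switch Details page.
When you disable ports, the port configuration on the selected ports changes to disabled and the ports go down. After you resolve the issue, you can re-enable the ports by editing the port configuration on the Switch Details page. After you re-enable a port, you can reconnect the device to it.
After you fix the issue in your network, the Network Port Flap action resolves automatically and appears in the Latest Updates section within an hour.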
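The persistence criterion for the Network Port Flap action (for example, three flaps per minute for an hour) can be sketched as a per-minute count over a one-hour window. The thresholds below are taken from that example only. The actual thresholds Marvis applies are not documented here, and the function name `is_persistently_flapping` is hypothetical.

```python
from collections import Counter


def is_persistently_flapping(flap_times, window_start, minutes=60,
                             min_flaps_per_minute=3):
    """True if every minute in the window saw at least min_flaps_per_minute flaps.

    flap_times: timestamps (in seconds) of port flap events.
    window_start: timestamp (in seconds) at which the window begins.
    """
    window_end = window_start + minutes * 60
    per_minute = Counter(
        int((t - window_start) // 60)
        for t in flap_times
        if window_start <= t < window_end
    )
    return all(per_minute[m] >= min_flaps_per_minute for m in range(minutes))


# Hypothetical event logs: three flaps in every minute versus one bounce.
steady = [m * 60 + s for m in range(60) for s in (5, 25, 45)]
single_bounce = [120]

print(is_persistently_flapping(steady, window_start=0))         # True
print(is_persistently_flapping(single_bounce, window_start=0))  # False
```

A single down-and-up event, such as a device connecting, does not meet the criterion.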
Looking at the switch, in this case, specifically the Juniper switch, we've introduced the action of a port flapping continuously. In this case, we do take into account a simple port down and up, which usually happens when a device connects. This action currently reflects a case where the port is continuously flapping, thereby not only causing a poor experience for the device that is connected on the other end, but also causing high resource consumption on the switch, which can be detrimental to other devices connected to the switch. Here too, we show all the required information in terms of the port, the client that is connected, and the VLAN, if the client did communicate and we know the VLAN ID.
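High CPU
Marvis detects switches whose CPU utilization is consistently high. Many factors can cause high CPU utilization: multicast traffic, network loops, hardware issues, device temperature, and so on. The High CPU action lists the switch, the processes running on the switch, the CPU utilization, and the reason for the high utilization. In the following example, the fxpc process has high CPU utilization, and the reason is that the switch uses uncertified optics: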
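Port Stuck
The Port Stuck action detects differences in traffic patterns on a switch port (such as no packets transmitted or received), which indicate that the client connected to the port is not functioning properly. In the following example, Marvis Actions recommends that you bounce the port and verify whether the client starts functioning normally. Note that in addition to the port number, Marvis lists the client connected to the port (a camera in this case) and the associated VLAN.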
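Traffic Anomaly
Marvis detects an unusual drop or increase in broadcast and multicast traffic on a switch. It can also detect unusually high transmit or receive errors. As with the Anomaly Detection view for connectivity failures, the Details view shows a timeline, a description of the anomaly, and details of the affected ports. If the issue affects an entire site, Marvis shows details of the affected switches and port details for each affected switch.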
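To make "unusual drop or increase" concrete, the following sketch flags a traffic sample that deviates strongly from a recent baseline. It uses a simple z-score as a stand-in and is not the model that Marvis uses. The function name `is_traffic_anomaly`, the threshold of three standard deviations, and the packet-rate samples are all hypothetical.

```python
from statistics import mean, stdev


def is_traffic_anomaly(baseline, current, z_threshold=3.0):
    """Flag current if it lies more than z_threshold standard deviations
    from the mean of the baseline samples (in either direction)."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > z_threshold


# Hypothetical broadcast packet rates (packets per second) on one port.
baseline_pps = [100, 110, 95, 105, 98, 102, 107, 99]

print(is_traffic_anomaly(baseline_pps, 104))   # False: within normal range
print(is_traffic_anomaly(baseline_pps, 5000))  # True: abnormal increase
print(is_traffic_anomaly(baseline_pps, 0))     # True: abnormal drop
```

Using the absolute deviation lets the same check catch both a sudden increase and a sudden drop.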
Marvis, our AI-powered virtual network assistant, employs an actions framework to automatically identify network problems and anomalies that are likely impacting user experience. This helps you to significantly reduce mean time to resolution. Marvis can detect switch traffic anomalies, such as traffic storms or abnormally high Tx/Rx counts, with respect to broadcast, unknown unicast, or multicast traffic.
It uses our third generation of algorithms, including long short-term memory, or LSTM for short, to boost efficacy and eliminate false positives. Visit the link below to learn more.
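Misconfigured Port
When one switch connects to another, the ports at both ends need common attributes to communicate. To detect misconfiguration, Marvis compares the following attributes:
- Speed
- Duplex
- Native VLAN
- Allowed VLANs
- MTU
- Port mode (both ports "access" or both ports "trunk")
- STP mode (both ports "forwarding")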
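The attribute comparison listed above can be sketched as a field-by-field diff between the two ends of a switch-to-switch link. This is a minimal sketch, not the Marvis implementation. The `PortConfig` type, its field names, and the sample values are hypothetical, and the sketch checks only that the two ends agree, not that the STP state is specifically "forwarding".

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PortConfig:
    speed: str
    duplex: str
    native_vlan: int
    allowed_vlans: frozenset
    mtu: int
    port_mode: str  # "access" or "trunk"
    stp_state: str  # for example, "forwarding"


def compare_ports(local, remote):
    """Return {attribute: (local_value, remote_value)} for each attribute that differs."""
    return {
        f.name: (getattr(local, f.name), getattr(remote, f.name))
        for f in fields(PortConfig)
        if getattr(local, f.name) != getattr(remote, f.name)
    }


# Hypothetical link ends that disagree on duplex and on allowed VLANs.
a = PortConfig("1G", "full", 1, frozenset({10, 20}), 1514, "trunk", "forwarding")
b = PortConfig("1G", "half", 1, frozenset({10}), 1514, "trunk", "forwarding")

print(sorted(compare_ports(a, b)))  # ['allowed_vlans', 'duplex']
```

An empty dictionary means the two ports agree on every compared attribute.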
On the Actions dashboard, click Switch > Misconfigured Port to view the issue and the recommended actions in the lower corner of the screen.

Click the View More link to see the MAC addresses and ports.
