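Software Upgrade in Multinode High Availability
Overview
In a Multinode High Availability setup, you can upgrade SRX Series Firewalls between two different Junos OS releases with minimal traffic disruption.
We support the software upgrade method using the CLI from Junos OS Release 22.3R1, as the following table shows.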
From Junos OS Release | To Junos OS Release | Software Upgrade Method Supported |
---|---|---|
20.4 | Any release after 20.4 | No |
22.3 | Later Junos OS releases | Yes |
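For information about upgrade and downgrade support for Junos OS releases, see Upgrade and Downgrade Support Policy for Junos OS Releases and Extended End-Of-Life Releases in the Release Notes.
When you upgrade SRX Series Firewalls in Multinode High Availability to Junos OS Release 22.4R1 or later from an earlier Junos OS release, you can use the isolated nodes upgrade procedure. Junos OS Release 22.4R1 and later releases are not compatible with earlier Junos OS releases and cannot synchronize sessions during a regular upgrade.
When you upgrade SRX Series Firewalls from Junos OS Release 22.3 to a later Junos OS release, you might experience some traffic disruption.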
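When you upgrade the Junos OS release on SRX Series Firewalls in a Multinode High Availability setup, the show chassis high-availability information command output displays the following message even though the upgrade completes successfully:
Peer Hardware Incompatible: SPU SLOT MISMATCH
This message appears when you upgrade from Junos OS Release 21.4R1 to any later Junos OS release.
You must install the same Junos OS release on both SRX Series Firewalls in a Multinode High Availability setup. When you upgrade Junos OS on one device, make sure that you upgrade the other device to the same release as well.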
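We support the following upgrade methods in a Multinode High Availability setup:
- For Layer 3 deployments: Configure install-on-failure-route (recommended). In this method, you divert traffic by changing the route. Traffic can still pass through the node, and the interfaces remain up. For details, see Software Upgrade Using Install on Failure Route. You can also use the shutdown-on-failure interface method for Layer 3 deployments.
- For hybrid and default gateway (Layer 2/switching) deployments: Use the shutdown-on-failure interface method. In this method, you divert traffic by shutting down the interfaces on the node. Traffic cannot pass through the node. For details, see Software Upgrade Using Shutdown on Failure Interface.
In the following procedures, we show you how to use the CLI to upgrade two SRX Series Firewalls (SRX-01 and SRX-02) from Junos OS Release 22.3R1.1 to Junos OS Release 22.3R1.3. To avoid downtime while upgrading SRX Series Firewalls in a Multinode High Availability setup, we upgrade one device at a time.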
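Best Practices for Upgrading Junos OS
When you plan a software upgrade, consider the following best practices:
- Ensure that both nodes are online and run the same version of Junos OS.
- Use the checklist in Preparing for Software Installation and Upgrade (Junos OS) to prepare the SRX Series Firewalls for the upgrade.
- Use the show system storage command to check that both nodes have enough storage space in the /var file system.
- Use the show chassis fpc pic-status command to check the status of all cards on both devices.
- Use the show chassis alarms command to verify that there are no major alarms on the devices.
- Ensure that there are no uncommitted changes.
- Back up the active configuration and the license keys.
We recommend that you perform the software upgrade during a maintenance window.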
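Pre-Installation Steps
Before you begin the software upgrade, complete the following tasks.
- Check the current Junos OS software version on the device.
user@host> show version
Hostname: srx-01
Model: vSRX
Junos: 22.3R1.1
- Download the Junos OS image from the Juniper Networks Support page and save it in the /var/tmp location on both SRX Series Firewalls.
- Use the show chassis high-availability information command to verify that your Multinode High Availability setup is healthy and functional and that the interchassis link (ICL) is up.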
On the SRX-01 device:
user@srx-01> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: ONLINE
Local-id: 1
Local-IP: 10.22.0.1
HA Peer Information:
    Peer Id: 2    IP address: 10.22.0.2    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ONLINE
    Peer Information:
      Peer Id: 2
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: ACTIVE
    Activeness Priority: 200
    Preemption: ENABLED
    Process Packet In Backup State: NO
    Control Plane State: READY
    System Integrity Check: N/A
    Failure Events: NONE
    Peer Information:
      Peer Id: 2
      Status : BACKUP
      Health Status: HEALTHY
      Failover Readiness: READY
On the SRX-02 device:
user@srx-02> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: ONLINE
Local-id: 2
Local-IP: 10.22.0.2
HA Peer Information:
    Peer Id: 1    IP address: 10.22.0.1    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ONLINE
    Peer Information:
      Peer Id: 1
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: BACKUP
    Activeness Priority: 1
    Preemption: DISABLED
    Process Packet In Backup State: NO
    Control Plane State: READY
    System Integrity Check: COMPLETE
    Failure Events: NONE
    Peer Information:
      Peer Id: 1
      Status : ACTIVE
      Health Status: HEALTHY
      Failover Readiness: N/A
These sample outputs confirm that both SRX Series Firewalls in the Multinode High Availability setup are healthy and functioning normally.
You can now proceed with the software upgrade.
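If you capture the command output as text (for example, from a saved session log), the health check above can be sketched as a small Python helper. This is only an illustration: the function names are hypothetical, the readiness criteria (node ONLINE, ICL connection UP, cold sync COMPLETE) are an assumption drawn from the sample outputs, and the parser assumes the native multi-line CLI layout rather than any official output schema.

```python
import re

# A value is a run of words separated by single spaces; column padding
# (two or more spaces) or a newline ends the value.
_VALUE = r"(\S+(?: \S+)*)"


def _field(label, text):
    """Return the value that follows 'label:' in the CLI text, or None."""
    match = re.search(re.escape(label) + r":[ \t]*" + _VALUE, text)
    return match.group(1) if match else None


def parse_ha_health(output):
    """Pick out the node status, ICL connection state, and cold sync status."""
    return {
        "node_status": _field("Node Status", output),
        "conn_state": _field("Conn State", output),
        "cold_sync": _field("Cold Sync Status", output),
    }


def ready_for_upgrade(output):
    """Assumed readiness rule: node online, ICL up, cold sync complete."""
    fields = parse_ha_health(output)
    return (
        fields["node_status"] == "ONLINE"
        and fields["conn_state"] == "UP"
        and fields["cold_sync"] == "COMPLETE"
    )
```

Run the check against the output of each node; proceed only when both return True, and still review the full output by eye.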
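Software Upgrade Using Install on Failure Route
Prerequisites for Diverting Transit Traffic
Check that your devices have the configuration needed to divert transit traffic by changing routes, as described in Configuring Multinode High Availability in a Layer 3 Network. If it is not configured, use the following steps:
- Create a dedicated custom virtual router for the routes that are used to divert traffic during the upgrade.
set routing-instances MNHA-signal-routes instance-type virtual-router
- Configure the install-on-failure-route statement for SRG0. Here, the route with IP address 10.39.1.3 is configured as the route to install when the node fails.
set routing-instances MNHA-signal-routes instance-type virtual-router
set chassis high-availability services-redundancy-group 0 install-on-failure-route 10.39.1.3 routing-instance MNHA-signal-routes
set chassis high-availability services-redundancy-group 1 active-signal-route 10.39.1.1 routing-instance MNHA-signal-routes
set chassis high-availability services-redundancy-group 1 backup-signal-route 10.39.1.2 routing-instance MNHA-signal-routes
When the node fails, the route table installs the route specified in the statement.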
- Configure a matching routing policy and define policy conditions based on the existence of the routes. Here, the route 10.39.1.3 is included as the if-route-exists route match condition.
set policy-options condition active_route_exists if-route-exists address-family inet 10.39.1.1/32
set policy-options condition active_route_exists if-route-exists address-family inet table MNHA-signal-routes.inet.0
set policy-options condition backup_route_exists if-route-exists address-family inet 10.39.1.2/32
set policy-options condition backup_route_exists if-route-exists address-family inet table MNHA-signal-routes.inet.0
set policy-options condition failure_route_exists if-route-exists address-family inet 10.39.1.3/32
set policy-options condition failure_route_exists if-route-exists address-family inet table MNHA-signal-routes.inet.0
- Create a policy statement that references the condition as one of its match terms.
set policy-options policy-statement mnha-route-policy term 4 from protocol static
set policy-options policy-statement mnha-route-policy term 4 from protocol direct
set policy-options policy-statement mnha-route-policy term 4 from condition failure_route_exists
set policy-options policy-statement mnha-route-policy term 4 then metric 100
set policy-options policy-statement mnha-route-policy term 4 then accept
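Upgrading Software in Multinode High Availability
Let's upgrade the device that is acting as the backup node (SRX-02).
- Initiate the software upgrade process and commit the configuration.
user@srx-02# set chassis high-availability software-upgrade
This command initiates a local failure for SRG0 and transitions SRG1 (if configured) to the INELIGIBLE state on the local device. For SRG1, the peer device now transitions to, or remains in, the active state. On the local node, the active and backup signal routes for SRG1 are removed. If the install-on-failure-route statement is configured, the signal route associated with the install-on-failure-route configuration is installed. With an appropriate routing policy, the local device can advertise a higher route metric, which diverts traffic away from the local device and steers it to the peer device.
- Verify the status of Multinode High Availability.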
user@srx-02> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: OFFLINE [ SU ]
Local-id: 1
Local-IP: 10.22.0.1
HA Peer Information:
    Peer Id: 2    IP address: 10.22.0.2    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ONLINE
    Peer Information:
      Peer Id: 2
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: INELIGIBLE
    Activeness Priority: 200
    Preemption: ENABLED
    Process Packet In Backup State: NO
    Control Plane State: N/A
    System Integrity Check: IN PROGRESS
    Failure Events: NONE
    Peer Information:
      Peer Id: 2
      Status : ACTIVE
      Health Status: HEALTHY
      Failover Readiness: N/A
The output shows Node Status: OFFLINE [ SU ], which indicates that the node is ready for the software upgrade. You can see that the SRG1 status has changed to INELIGIBLE.
- Confirm that the other device (SRX-01) is in the active role and is functioning normally.
user@srx-01> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: ONLINE
Local-id: 2
Local-IP: 10.22.0.2
HA Peer Information:
    Peer Id: 1    IP address: 10.22.0.1    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ONLINE
    Peer Information:
      Peer Id: 1
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: ACTIVE
    Activeness Priority: 1
    Preemption: DISABLED
    Process Packet In Backup State: NO
    Control Plane State: READY
    System Integrity Check: N/A
    Failure Events: NONE
    Peer Information:
      Peer Id: 1
      Status : INELIGIBLE
      Health Status: UNHEALTHY
      Failover Readiness: NOT READY
The command output shows that the SRG1 status is ACTIVE.
Also note that under the Peer Information section of SRG1, the status is INELIGIBLE, which indicates that the other node is in the ineligible state.
- Install the Junos OS software on the SRX-02 device.
user@srx-02> request system software add /var/tmp/junos-install-vsrx3-x86-64-22.3R1.3.tgz no-copy
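- After a successful installation, reboot the device using the request system reboot command.
- After the reboot, check the Junos OS version using the show version command.
user@srx-02> show version
Hostname: srx-02
Model: vSRX
Junos: 22.3R1.3
The output confirms that the device is upgraded to the correct Junos OS version.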
- Check the status of Multinode High Availability on the device.
user@srx-02> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: OFFLINE [ SU ]
Local-id: 1
Local-IP: 10.22.0.1
HA Peer Information:
    Peer Id: 2    IP address: 10.22.0.2    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ONLINE
    Peer Information:
      Peer Id: 2
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: INELIGIBLE
    Activeness Priority: 200
    Preemption: ENABLED
    Process Packet In Backup State: NO
    Control Plane State: N/A
    System Integrity Check: COMPLETE
    Failure Events: NONE
    Peer Information:
      Peer Id: 2
      Status : ACTIVE
      Health Status: HEALTHY
      Failover Readiness: N/A
The output continues to show the node status as OFFLINE [ SU ] and the SRG1 status as INELIGIBLE.
- Remove the software-upgrade statement and commit the configuration.
user@srx-02# delete chassis high-availability software-upgrade
When you remove the software-upgrade statement, the local failure state and the installed route are removed.
- Check the Multinode High Availability status again to confirm that the device is online and that the overall status is healthy and functional.
user@srx-02> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: ONLINE
Local-id: 1
Local-IP: 10.22.0.1
HA Peer Information:
    Peer Id: 2    IP address: 10.22.0.2    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ONLINE
    Peer Information:
      Peer Id: 2
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: BACKUP
    Activeness Priority: 200
    Preemption: ENABLED
    Process Packet In Backup State: NO
    Control Plane State: READY
    System Integrity Check: IN PROGRESS
    Failure Events: NONE
    Peer Information:
      Peer Id: 2
      Status : ACTIVE
      Health Status: HEALTHY
      Failover Readiness: N/A
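The output shows Node Status: ONLINE and the SRG1 status as BACKUP, which indicates that the node is back online and functioning normally in the backup role.
- Check the interfaces, routing protocols, advertised routes, and so on to confirm that your setup is functioning normally.
You can now use the same procedure to upgrade the other device (SRX-01).
If you encounter any issues and cannot complete the upgrade, you can roll back the software on the device and then reboot the system. Use the request system software rollback command to restore the previously installed software version.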
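The state transitions described in this procedure can be checked from captured command output with a short Python sketch. The function names and the phase labels ("drained", "in-service", "unknown") are hypothetical and exist only for illustration; the sketch assumes the native multi-line CLI layout, in which the local SRG status appears on a line that begins with "Status:" and the peer status is printed as "Status :".

```python
import re


def node_status(output):
    """Return the text after 'Node Status:', for example 'OFFLINE [ SU ]'."""
    match = re.search(r"Node Status:[ \t]*(\S+(?: \S+)*)", output)
    return match.group(1) if match else None


def srg_local_status(output, srg_id):
    """Return the local 'Status:' value of one services redundancy group."""
    # re.split with a capture group yields [preamble, id, body, id, body, ...].
    parts = re.split(r"Services Redundancy Group:[ \t]*(\d+)", output)
    sections = {int(parts[i]): parts[i + 1] for i in range(1, len(parts) - 1, 2)}
    body = sections.get(srg_id, "")
    # The line must start with 'Status:'; this skips the peer's 'Status :'
    # line and the 'Health Status:' line.
    match = re.search(r"^[ \t]*Status:[ \t]*(\S+(?: \S+)*)", body, re.MULTILINE)
    return match.group(1) if match else None


def upgrade_phase(output):
    """Classify the node using the states shown in this procedure."""
    node = node_status(output)
    srg1 = srg_local_status(output, 1)
    if node == "OFFLINE [ SU ]" and srg1 == "INELIGIBLE":
        return "drained"      # software-upgrade statement is committed
    if node == "ONLINE" and srg1 in ("ACTIVE", "BACKUP"):
        return "in-service"   # statement removed, node participates again
    return "unknown"
```

A node should report "drained" before you install the image and "in-service" after you delete the software-upgrade statement; treat "unknown" as a prompt to inspect the output manually.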
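Software Upgrade Using Shutdown on Failure Interface
Prerequisites for Diverting Transit Traffic
Check that your SRX Series Firewalls include the configuration needed to isolate traffic by shutting down interfaces, as described in Configuring Multinode High Availability in a Default Gateway Deployment. If the feature is not configured:
- Configure all traffic interfaces under the shutdown-on-failure option.
set chassis high-availability services-redundancy-group 0 shutdown-on-failure interface-name
[edit]
set chassis high-availability services-redundancy-group 0 shutdown-on-failure ge-0/0/0
set chassis high-availability services-redundancy-group 0 shutdown-on-failure ge-0/0/1
set chassis high-availability services-redundancy-group 0 shutdown-on-failure ge-0/0/3
set chassis high-availability services-redundancy-group 0 shutdown-on-failure ge-0/0/4
Caution: Do not use the interfaces that are assigned to the interchassis link (ICL).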
Upgrading Software in Multinode High Availability
Let's upgrade the device that is acting as the backup node (SRX-02).
- Initiate the software upgrade and commit the configuration.
user@srx-02# set chassis high-availability software-upgrade
This command marks the interfaces as offline and transitions the state to ineligible.
- Check the Multinode High Availability status.
user@srx-02> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: OFFLINE [ SU ]
Local-id: 2
Local-IP: 10.22.0.2
HA Peer Information:
    Peer Id: 1    IP address: 10.22.0.1    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ISOLATED [ Node Failure ]
    Peer Information:
      Peer Id: 1
    Shut-on-failures interfaces:
      ge-0/0/4
      ge-0/0/3
      ge-0/0/1
      ge-0/0/0
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: INELIGIBLE
    Activeness Priority: 1
    Preemption: DISABLED
    Process Packet In Backup State: NO
    Control Plane State: N/A
    System Integrity Check: COMPLETE
    Failure Events: NONE
    Peer Information:
      Peer Id: 1
      Status : ACTIVE
      Health Status: HEALTHY
      Failover Readiness: N/A
The output shows Node Status: OFFLINE [ SU ], which indicates that the node is ready for the software upgrade. You can also see the SRG0 state as ISOLATED [ Node Failure ] and the SRG1 status as INELIGIBLE.
- Check the status of the interfaces.
user@host> show interfaces terse
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                down  down
ge-0/0/1                down  down
ge-0/0/2                up    up
ge-0/0/2.0              up    up   inet     10.22.0.2/24
ge-0/0/3                down  down
ge-0/0/3.0              up    down inet     10.3.0.2/16
ge-0/0/4                down  down
ge-0/0/4.0              up    down inet     10.5.0.1/16
The output shows that the interfaces marked as shutdown-on-failure are down.
- Confirm that the other device (SRX-01) is in the active role and is functioning normally.
user@srx-01> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: ONLINE
Local-id: 1
Local-IP: 10.22.0.1
HA Peer Information:
    Peer Id: 2    IP address: 10.22.0.2    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ONLINE
    Peer Information:
      Peer Id: 2
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: ACTIVE
    Activeness Priority: 200
    Preemption: ENABLED
    Process Packet In Backup State: NO
    Control Plane State: READY
    System Integrity Check: N/A
    Failure Events: NONE
    Peer Information:
      Peer Id: 2
      Status : INELIGIBLE
      Health Status: UNHEALTHY
      Failover Readiness: NOT READY
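The output shows that the SRG1 status is ACTIVE.
Also note that under the Peer Information section of SRG1, the status is INELIGIBLE, which indicates that the other node is in the ineligible state.
- Install the Junos OS image on SRX-02.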
user@srx-02> request system software add /var/tmp/junos-install-vsrx3-x86-64-22.3R1.3.tgz no-copy
- After a successful upgrade, reboot the device using the request system reboot command.
- Check the Junos OS version.
user@srx-02> show version
Hostname: srx-02
Model: vSRX
Junos: 22.3R1.3
The output confirms that the device is upgraded to the correct Junos OS version.
- Check the Multinode High Availability status on the device.
user@srx-02> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: OFFLINE [ SU ]
Local-id: 2
Local-IP: 10.22.0.2
HA Peer Information:
    Peer Id: 1    IP address: 10.22.0.1    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ISOLATED [ Node Failure ]
    Peer Information:
      Peer Id: 1
    Shut-on-failures interfaces:
      ge-0/0/4
      ge-0/0/3
      ge-0/0/1
      ge-0/0/0
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: INELIGIBLE
    Activeness Priority: 1
    Preemption: DISABLED
    Process Packet In Backup State: NO
    Control Plane State: N/A
    System Integrity Check: COMPLETE
    Failure Events: NONE
    Peer Information:
      Peer Id: 1
      Status : ACTIVE
      Health Status: HEALTHY
      Failover Readiness: N/A
The command output continues to show the node status as OFFLINE [ SU ] and the SRG0 state as ISOLATED [ Node Failure ].
- Remove the software-upgrade statement and commit the configuration.
user@srx-02# delete chassis high-availability software-upgrade
- Check the Multinode High Availability status on the device again and confirm that the device is online and that the overall status is healthy.
user@srx-02> show chassis high-availability information
Node failure codes:
    HW  Hardware monitoring    LB  Loopback monitoring
    MB  Mbuf monitoring        SP  SPU monitoring
    CS  Cold Sync monitoring   SU  Software Upgrade
Node Status: ONLINE
Local-id: 2
Local-IP: 10.22.0.2
HA Peer Information:
    Peer Id: 1    IP address: 10.22.0.1    Interface: ge-0/0/2.0
    Routing Instance: default
    Encrypted: YES    Conn State: UP
    Cold Sync Status: COMPLETE
Services Redundancy Group: 0
    Current State: ONLINE
    Peer Information:
      Peer Id: 1
    Shut-on-failures interfaces:
      ge-0/0/4
      ge-0/0/3
      ge-0/0/1
      ge-0/0/0
SRG failure event codes:
    BF  BFD monitoring
    IP  IP monitoring
    IF  Interface monitoring
    CP  Control Plane monitoring
Services Redundancy Group: 1
    Deployment Type: ROUTING
    Status: BACKUP
    Activeness Priority: 1
    Preemption: DISABLED
    Process Packet In Backup State: NO
    Control Plane State: READY
    System Integrity Check: COMPLETE
    Failure Events: NONE
    Peer Information:
      Peer Id: 1
      Status : ACTIVE
      Health Status: HEALTHY
      Failover Readiness: N/A
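The output shows Node Status: ONLINE and the SRG0 state as ONLINE, which indicates that the node is back online and functioning normally.
- Verify the status of the interfaces.
user@srx-02> show interfaces terse
Interface               Admin Link Proto    Local                 Remote
ge-0/0/0                up    up
gr-0/0/0                up    up
ge-0/0/1                up    up
ge-0/0/2                up    up
ge-0/0/2.0              up    up   inet     10.22.0.2/24
ge-0/0/3                up    up
ge-0/0/3.0              up    up   inet     10.3.0.2/16
ge-0/0/4                up    up
ge-0/0/4.0              up    up   inet     10.5.0.1/16
.............................
The output shows that the interfaces that were previously shut down are now up.
- Check the interfaces, routing protocols, advertised routes, and so on to confirm that your setup is functioning normally.
You can now use the same procedure to upgrade the other device (SRX-01).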
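To double-check the interface steps in this procedure, a captured show interfaces terse listing can be compared against the list of shutdown-on-failure interfaces. The following Python sketch is a minimal illustration with hypothetical function names; it assumes the column layout shown in the sample outputs (interface name, admin state, link state) and is not an official parser.

```python
def link_states(terse_output):
    """Map each interface name to its (admin, link) state pair."""
    states = {}
    for line in terse_output.splitlines():
        fields = line.split()
        # Data rows have 'up' or 'down' in the second and third columns;
        # the header row and elision lines are skipped by this test.
        if (
            len(fields) >= 3
            and fields[1] in ("up", "down")
            and fields[2] in ("up", "down")
        ):
            states[fields[0]] = (fields[1], fields[2])
    return states


def interfaces_not_down(terse_output, names):
    """Return the interfaces from 'names' whose link is not reported as down.

    Interfaces that are missing from the listing are also returned, so an
    empty result means every listed interface was found with its link down.
    """
    states = link_states(terse_output)
    return [name for name in names if states.get(name, (None, None))[1] != "down"]
```

While the software-upgrade statement is committed, calling interfaces_not_down with the four configured interfaces (ge-0/0/0, ge-0/0/1, ge-0/0/3, ge-0/0/4) should return an empty list; after the statement is deleted, the same call should list all four again.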