High Availability for Subscriber Access Networks
This topic is a high-level overview of high availability for DHCP, L2TP, and PPP access networks.
Unified ISSU for High Availability in Subscriber Access Networks
A unified in-service software upgrade (unified ISSU) enables you to upgrade between two different Junos OS releases with no disruption on the control plane and with minimal disruption of traffic. The router preserves the active subscriber sessions and session services across the upgrade, so that they continue after the upgrade has completed.
The unified ISSU feature supports the PPPoE, DHCP, and L2TP access models for subscriber management. Unified ISSU support for the DHCP and L2TP access models was added in Junos OS Release 14.1.
For static and dynamic PPPoE access, unified ISSU supports the following:
- Terminated, non-tunneled PPPoE connections configured with static or dynamic PPP logical interfaces and static or dynamic underlying interfaces
- Subscriber services on single-link PPP interfaces
- Preservation of statistics for accounting, filter, and CoS on MPC/MIC interfaces
Unified ISSU for the subscriber management PPPoE access model does not support Multilink Point-to-Point Protocol (MLPPP) bundle interfaces. MLPPP bundle interfaces require the use of an Adaptive Services PIC or Multiservices PIC to provide PPP subscriber services. These PICs do not support unified ISSU.
For DHCP access, unified ISSU supports the following:
- DHCPv4 local server, DHCPv4 relay, DHCPv6 local server, DHCPv6 relay, and DHCP relay proxy
- Preservation of accounting, filter, and class-of-service (CoS) statistics for DHCP subscribers on MPC/MIC interfaces on MX Series routers
For L2TP access, unified ISSU supports both the LAC and the LNS. When an upgrade is initiated, the LAC completes any L2TP negotiations that are in progress but rejects any new negotiations until the upgrade has completed. No new tunnels or sessions are established during the upgrade. Subscriber logouts are recorded during the upgrade and processed after the upgrade has completed.
See Getting Started with Unified In-Service Software Upgrade for a description of the supported platforms and modules, CLI statements, and procedures you use to configure and initiate unified ISSU. You can use the issu flag with the traceoptions statement to trace subscriber management unified ISSU events. You can also use the show system subscriber-management summary command to display information about the unified ISSU state.
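As a minimal sketch of tracing unified ISSU events, the following assumes that the traceoptions statement for subscriber management is configured at the [edit system services subscriber-management] hierarchy level. The hierarchy level, the file name sm-issu, and the size and files values are assumptions for illustration and are not stated in this topic; confirm them against the CLI reference for your release.

```
[edit system services subscriber-management]
user@host# set traceoptions file sm-issu size 10m files 5
user@host# set traceoptions flag issu
```

Only the issu flag is taken from this topic. Other flags can add a large volume of trace output and are intentionally left out of the sketch.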
Verifying and Monitoring Subscriber Management Unified ISSU State
Display the state of unified ISSU for subscriber management features.
The first example indicates that control plane quiescing as part of unified ISSU is not in progress (for example, unified ISSU has not been started, has already completed, or control plane quiescing has not started). The second example shows that unified ISSU is in progress and that a participating subscriber management daemon requires 198 seconds to quiesce the control plane.
user@host> show system subscriber-management summary
General:
    Graceful Restart                Enabled
    Mastership                      Master
    Database                        Available
    Chassisd ISSU State             IDLE
    ISSU State                      IDLE
    ISSU Wait                       0
user@host> show system subscriber-management summary
General:
    Graceful Restart                Enabled
    Mastership                      Master
    Database                        Available
    Chassisd ISSU State             DAEMON_ISSU_PREPARE
    ISSU State                      PREPARE
    ISSU Wait                       198
Graceful Routing Engine Switchover for Subscriber Access Networks
The graceful Routing Engine switchover (GRES) feature in Junos OS enables a router with redundant Routing Engines to continue forwarding packets, even if one Routing Engine fails. GRES preserves interface and kernel information. Traffic is not interrupted. However, GRES does not preserve the control plane.
To enable GRES support on MX Series routers, include the graceful-switchover statement at the [edit chassis redundancy] hierarchy level.
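The statement above can be entered from the top of the configuration hierarchy as a single set command. This is a minimal sketch; the use of commit synchronize is an assumption that the router has redundant Routing Engines and that you want the configuration committed on both.

```
[edit]
user@host# set chassis redundancy graceful-switchover
user@host# commit synchronize
```

The resulting configuration stanza would then read:

```
chassis {
    redundancy {
        graceful-switchover;
    }
}
```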
For MX Series routers, the extended DHCP local server and the DHCP relay agent applications both maintain the state of active DHCP client leases in the session database. The extended DHCP application can recover this state if the DHCP process fails or is manually restarted, thus preventing the loss of active DHCP clients in either of these circumstances. However, the state of active DHCP client leases is lost if a power failure occurs or if the kernel stops operating (for example, when the router is reloaded) on a single Routing Engine.
You cannot disable graceful Routing Engine switchover support for the extended DHCP application when the router is configured to support graceful Routing Engine switchover.
For more information about using graceful Routing Engine switchover, see Understanding Graceful Routing Engine Switchover.
GRES is supported on MX Series routers acting as either the L2TP LAC or LNS. If the L2TP universal edge process (jl2tpd) restarts or the router fails over from the active Routing Engine to the standby Routing Engine, L2TP GRES ensures that the following occurs:
- The LAC and the LNS recover destinations, tunnels, and sessions that were already established at the time of the failure or restart.
- The LAC and the LNS respond to tunnel keepalive requests received during the switchover for established tunnels, but do not generate any keepalives until the switchover is complete.
- The LAC and the LNS delete all the tunnels and sessions that are not in the Established state.
- The LAC and the LNS reject requests to create new tunnels and sessions.
- The LAC and the LNS send another disconnect notification to the peer for sessions and tunnels that are already in the Disconnecting state at the time of the failure or restart. For sessions and tunnels that were coming up at that time, the LAC and LNS send a disconnect notification to the peer.
- The LAC and the LNS restart timers for the full timeout period for recovered L2TP destinations, tunnels, and sessions.
If a graceful Routing Engine switchover (GRES) is triggered by an operational mode command, the state of aggregated services interfaces (ASIs) is not preserved. For example:
However, if GRES is triggered by a CLI commit or by an FPC restart or crash, the backup Routing Engine updates the ASI state. For example:
Minimize Traffic Loss Due to Stale Route Removal After a Graceful Routing Engine Switchover
During a graceful Routing Engine switchover (GRES), access routes and access-internal routes for DHCP and PPP subscriber management can become stale. After the GRES, the router removes any such stale routes from the forwarding table. Some traffic is lost if the stale routes are removed before the routes are reinstalled.
In subscriber networks with graceful restart and routing protocols such as BGP and OSPF configured, the router purges any remaining stale access routes and access-internal routes as soon as the graceful restart operation completes, which can occur very soon after completion of the graceful Routing Engine switchover.
In subscriber networks with nonstop active routing (NSR) and routing protocols such as BGP and OSPF configured, the routing protocol process (rpd) immediately purges the stale access routes and access-internal routes that correspond to subscriber routes.
You can reduce the risk of this traffic loss by configuring the router to delay the removal of stale routes after a GRES. The delay period is a nonconfigurable 180 seconds (3 minutes). The router retains the stale routes for the duration of the period, which is long enough for the DHCP client process (jdhcpd), PPP client process (jpppd), or routing protocol process (rpd) to reinstall the access routes and access-internal routes before the router removes the stale routes from the forwarding table. The risk of traffic loss is minimized because the router always has available subscriber routes for DHCP subscribers and PPP subscribers.
To configure the router to delay removal (flushing) of access routes and access-internal routes after a graceful Routing Engine switchover:
- Specify that you want to configure subscriber management.
[edit system services]
user@host# edit subscriber-management
- Configure the router to wait 180 seconds before removing access routes and access-internal routes after a graceful Routing Engine switchover.
[edit system services subscriber-management]
user@host# set gres-route-flush-delay
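The two steps above can also be collapsed into a single command issued from the top of the configuration hierarchy. This is a sketch of the equivalent form rather than a separate procedure. The statement takes no value because the 180-second delay is not configurable, and the change takes effect only after you commit it.

```
[edit]
user@host# set system services subscriber-management gres-route-flush-delay
user@host# commit
```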