Understanding How the Standby Site Becomes Operational When the Active Site Goes Down
When a disaster brings down the active site, the standby site becomes operational automatically if automatic failover is enabled and the standby site can exceed the failure threshold. Otherwise, you may need to run the jmp-dr manualFailover or jmp-dr manualFailover -a command at the standby site to resume network management services.
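The condition above can be sketched as a small decision function. This is an illustration only: the function name and both parameters are hypothetical and are not read from jmp-dr or any Junos Space API.

```python
def failover_mode(auto_failover_enabled: bool, threshold_exceeded: bool) -> str:
    """Return which path brings the standby site into service.

    Both arguments are hypothetical inputs used for illustration:
    - auto_failover_enabled: automatic failover is turned on
    - threshold_exceeded: the standby site can exceed the failure threshold
    """
    if auto_failover_enabled and threshold_exceeded:
        # Both conditions hold, so the standby site takes over on its own.
        return "automatic"
    # Otherwise an operator may need to run `jmp-dr manualFailover`
    # (or `jmp-dr manualFailover -a`) at the standby site.
    return "manual"


if __name__ == "__main__":
    for enabled in (True, False):
        for exceeded in (True, False):
            print(enabled, exceeded, failover_mode(enabled, exceeded))
```

Only the case where both conditions are true yields an automatic failover; every other combination falls through to the manual path.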
The disaster recovery watchdog at the standby site performs the following failover operations to become an active site:
Verify that the VIP address at the active site is not reachable.
Stop database replication and SCP file transfer between the two sites.
Remove the cron job from the standby site for fetching backup files from the active site.
Add a cron job at the standby site to back up configuration and RRD files.
Modify the role of the standby site to active.
Open port 7804 on all nodes at the standby site.
Start all services at the standby site.
Copy system configuration files contained in the backup to appropriate locations.
Configure all devices to send SNMP traps to the VIP address of the standby site. If eth3 is used for device management at the standby site, the eth3 IP address of the active-VIP node at the standby site is configured as the trap destination, instead of the VIP address.
If you are monitoring devices through a dedicated FMPM node, the VIP address of the dedicated node is configured as the trap destination.
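The trap destination rules in the last step can be sketched as a selection function. The function and parameter names are hypothetical, and the addresses in the usage lines are documentation-range placeholders. The text does not state which rule wins when both eth3 and a dedicated FMPM node are in use; checking the FMPM node first is an assumption made here for illustration.

```python
from typing import Optional


def trap_destination(site_vip: str,
                     eth3_ip: Optional[str] = None,
                     fmpm_vip: Optional[str] = None) -> str:
    """Pick the SNMP trap destination for devices after failover.

    site_vip: VIP address of the standby site (the default destination).
    eth3_ip:  eth3 IP address of the active-VIP node, if eth3 is used
              for device management; otherwise None.
    fmpm_vip: VIP address of a dedicated FMPM node, if one monitors
              the devices; otherwise None.
    """
    # Assumption: a dedicated FMPM node takes precedence over eth3.
    if fmpm_vip is not None:
        return fmpm_vip
    if eth3_ip is not None:
        return eth3_ip
    return site_vip


if __name__ == "__main__":
    print(trap_destination("192.0.2.10"))
    print(trap_destination("192.0.2.10", eth3_ip="198.51.100.5"))
    print(trap_destination("192.0.2.10", fmpm_vip="203.0.113.7"))
```

With no optional arguments, the site VIP address is returned, which matches the default case in the step above.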
After the failover is complete, the disaster recovery role of the site is set to Active and the state of the cluster is set to active (1). You can access the GUI and API of the standby site through its VIP address to perform all network management tasks. In most cases, the failover completes within 20 to 30 minutes. When the original active site becomes operational again, it becomes the standby site. You can either retain the failed-over state or revert to the original state.