Determining Why Mastership Switched
Mastership can switch between the master Routing Engine and the backup Routing Engine for the following reasons:
- Hardware problems.
- The master Routing Engine is pulled.
- Software issues, such as a Routing Engine kernel crash.
Action
View the log file /var/log/mastership for redundancy logging. This file contains hardware and software transitions to help debug auto-redundancy issues.
user@host> show log mastership
Table 107 lists the event codes that can be displayed in the mastership log.
Table 107: Logging Events
Sample Output
user@host> show log mastership
Jan 12 21:50:05 clear-log[865]: logfile cleared
Jan 12 21:50:18 failed to receive keepalives from other RE for the last 60 sec
Jan 12 21:50:23 failed to send RE info/keepalive: errno=22, total=6 in the last 20 sec
Jan 12 21:50:23 failed to send RE info/keepalive: errno=22, total=6 in the last 20 sec
Jan 12 21:50:34 event = E_CMD_R, state = master, param = 0x0
Jan 12 21:50:34 send "you are the master" request
Jan 12 21:50:34 Failed to send RE mastership cmd. err = 65
Jan 12 21:50:34 Currentstate: master NextState: giveup reason_code: 1
Jan 12 21:50:34 timestamp: Wed Jan 12 21:50:34 2000
Jan 12 21:50:34 new state = giveup
Jan 12 21:50:36 event = E_TMOUT, state = giveup, param = 0x0
Jan 12 21:50:36 send "you are the master" request
Jan 12 21:50:36 Failed to send RE mastership cmd. err = 65
Jan 12 21:50:36 Currentstate: giveup NextState: giveup reason_code: 1
Jan 12 21:50:36 new state = giveup
Jan 12 21:50:38 event = E_TMOUT, state = giveup, param = 0x0
Jan 12 21:50:38 send "you are the master" request
Jan 12 21:50:38 Failed to send RE mastership cmd. err = 65
Jan 12 21:50:38 Currentstate: giveup NextState: giveup reason_code: 1
Jan 12 21:50:38 new state = giveup
Jan 12 21:50:40 failed to receive keepalives from other RE for the last 80 sec
Jan 12 21:50:41 event = E_TMOUT, state = giveup, param = 0x0
Jan 12 21:50:41 send "you are the master" request
Jan 12 21:50:41 Failed to send RE mastership cmd. err = 65
Jan 12 21:50:41 Currentstate: giveup NextState: giveup reason_code: 1
Jan 12 21:50:41 new state = giveup
Jan 12 21:50:43 event = E_TMOUT, state = giveup, param = 0x0
Jan 12 21:50:43 send "you are the master" request
Jan 12 21:50:43 Failed to send RE mastership cmd. err = 65
Jan 12 21:50:43 Currentstate: giveup NextState: giveup reason_code: 1
Jan 12 21:50:43 new state = giveup
Jan 12 21:50:46 failed to send RE info/keepalive: errno=35, total=7 in the last 20 sec
Jan 12 21:50:46 failed to send RE info/keepalive: errno=35, total=7 in the last 20 sec
Jan 12 21:50:46 event = E_TMOUT, state = giveup, param = 0x0
Jan 12 21:50:46 send "you are the master" request
Jan 12 21:50:46 Failed to send RE mastership cmd. err = 65
Jan 12 21:50:46 Currentstate: giveup NextState: giveup reason_code: 1
Jan 12 21:50:46 new state = giveup
Jan 12 21:50:48 event = E_TMOUT, state = giveup, param = 0x0
Jan 12 21:50:48 send "you are the master" request
Jan 12 21:50:48 Failed to send RE mastership cmd. err = 65
Jan 12 21:50:48 Currentstate: giveup NextState: giveup reason_code: 1
Jan 12 21:50:48 new state = giveup
Jan 12 21:50:50 event = E_TMOUT, state = giveup, param = 0x0
Jan 12 21:50:50 send "you are the master" request
Jan 12 21:50:50 Failed to send RE mastership cmd. err = 65
Jan 12 21:50:50 Currentstate: giveup NextState: giveup reason_code: 1
Jan 12 21:50:50 new state = giveup
Jan 12 21:50:53 event = E_MAXTRY, state = giveup, param = 0x0
Jan 12 21:50:53 Currentstate: giveup NextState: master reason_code: 1
Jan 12 21:50:53 timestamp: Wed Jan 12 21:50:53 2000
Jan 12 21:50:53 new state = master
Jan 12 21:51:01 failed to receive keepalives from other RE for the last 100 sec
Jan 12 21:51:06 failed to send RE info/keepalive: errno=65, total=7 in the last 20 sec
Jan 12 21:51:06 failed to send RE info/keepalive: errno=65, total=7 in the last 20 sec
Jan 12 21:51:21 failed to receive keepalives from other RE for the last 120 sec
Jan 12 21:51:26 failed to send RE info/keepalive: errno=22, total=6 in the last 20 sec
Jan 12 21:51:26 failed to send RE info/keepalive: errno=22, total=6 in the last 20 sec
What It Means
The beginning of the log shows that keepalives are not being answered and that the state of the Routing Engine changed from master to giveup after the request chassis routing-engine master release command was issued. However, the other Routing Engine does not take over mastership because it is unreachable. Timeouts (E_TMOUT) then recur until the Routing Engine reaches the maximum number of attempts permitted (E_MAXTRY). The output then shows the Routing Engine state changing from giveup back to master.
The output does not indicate why the mastership switchover failed. However, it is clear that the backup Routing Engine is unreachable.
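When the log is long, it can help to reduce it to the events that actually changed the Routing Engine state. The following is a minimal Python sketch, not a Juniper tool. It assumes the output of show log mastership has been saved off the device as plain text with one entry per line, and that entries follow the "event = ..., state = ..." and "new state = ..." patterns seen in the sample output. The function names summarize_transitions and state_changes are hypothetical.

```python
import re

# Patterns inferred from the sample output above (an assumption, not a
# documented log format).
EVENT_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"event = (?P<event>\w+), state = (?P<state>\w+)"
)
NEW_STATE_RE = re.compile(
    r"^\w{3}\s+\d+ \d{2}:\d{2}:\d{2} new state = (?P<new>\w+)"
)


def summarize_transitions(log_text):
    """Pair each 'event' entry with the 'new state' entry that follows it.

    Returns a list of (timestamp, event, old_state, new_state) tuples.
    """
    transitions = []
    pending = None  # last event entry still waiting for its 'new state'
    for line in log_text.splitlines():
        line = line.strip()
        match = EVENT_RE.match(line)
        if match:
            pending = (match["ts"], match["event"], match["state"])
            continue
        match = NEW_STATE_RE.match(line)
        if match and pending:
            ts, event, old_state = pending
            transitions.append((ts, event, old_state, match["new"]))
            pending = None
    return transitions


def state_changes(transitions):
    """Keep only the transitions where the state actually changed."""
    return [t for t in transitions if t[2] != t[3]]


# A shortened excerpt of the sample output, used only for illustration.
SAMPLE = """\
Jan 12 21:50:34 event = E_CMD_R, state = master, param = 0x0
Jan 12 21:50:34 send "you are the master" request
Jan 12 21:50:34 Failed to send RE mastership cmd. err = 65
Jan 12 21:50:34 new state = giveup
Jan 12 21:50:36 event = E_TMOUT, state = giveup, param = 0x0
Jan 12 21:50:36 new state = giveup
Jan 12 21:50:53 event = E_MAXTRY, state = giveup, param = 0x0
Jan 12 21:50:53 new state = master
"""

if __name__ == "__main__":
    for ts, event, old, new in state_changes(summarize_transitions(SAMPLE)):
        print(f"{ts}: {event} moved {old} -> {new}")
```

Run against the excerpt, the sketch reports two state changes: E_CMD_R moving master to giveup at 21:50:34, and E_MAXTRY moving giveup back to master at 21:50:53. The repeated E_TMOUT retries are filtered out because the state stays at giveup.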