This document describes the process for troubleshooting the duplicate asset ID error with Automated Provisioning (AP).
Here are a few situations that may cause a duplicate asset ID:
- A 128T SSR has been cloned from staging and has not completed the initialization procedure that creates a unique ID
- The same quickstart file has been applied to multiple 128T SSRs
- A user has manually entered the same asset ID into the salt-minion ID file on multiple 128T SSRs
The terms "salt-master" and "conductor" are used interchangeably throughout this document. "Salt-master" refers to the salt-master process running on the conductor, which orchestrates tasks for AP. The terms "minion", "salt-minion" and "asset" are also used interchangeably. A "minion" runs on an "asset", the system hosting a 128T router. Minions are responsible for carrying out the tasks on the host that the salt-master gives them.
This document will reference salt keys and salt grains. Some good resources for these topics are:
The salt-key command referenced above can be accessed from the Linux shell with t128-salt-key. All arguments to the command and its behavior are identical.
An asset that has a duplicate asset ID error can be seen from the conductor CLI using the show assets command with a provided asset-id. For example:
A duplicate asset ID error can also be seen as an alarm event from the conductor CLI using the show events alarm command. For example:
Lastly, a denied key can be seen by querying the salt-master directly from the Linux command line:
t128-salt-key commands can only be executed by the root user at the Linux shell.
There are multiple situations that may cause a duplicate asset ID error.
- A single salt-minion disconnected, had its key regenerated, and reconnected
  - In this case, the asset will show as disconnected and have a duplicate asset ID error
- Two salt-minions with the same ID attempted to connect
  - In this case, either the connected asset or the denied minion may be the correct one
  - The correct asset may also be disconnected when another minion with the same ID connects, making it difficult to distinguish which situation caused the asset ID error
A denied key is a public key that is rejected automatically by the salt-master when the salt-minion tries to authenticate with its public key. A rejected key is a different state that indicates a user manually rejected a salt key. There are two different situations that can result in the salt-master denying a key:
- A salt-minion was rebuilt or reinstalled, or its public/private key pair was wiped, and new keys were automatically regenerated when the salt-minion service started. The minion's new keys no longer match the salt keys previously associated with the minion ID on the conductor.
- There are two salt-minions trying to authenticate with the same salt-minion ID.
In the denied key state the salt-minion does not receive any communication from the salt-master. The conductor automatically creates a duplicate asset ID error when it detects that the salt-master has moved a salt key into the denied state.
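The accepted-plus-denied pattern described above can be checked mechanically. The following is a minimal Python sketch, not part of the 128T tooling. It assumes the key listing has been captured as JSON (for example with `t128-salt-key -L --out=json`, relying on the statement that `t128-salt-key` takes the same arguments as `salt-key`) and that the listing uses Salt's `minions` and `minions_denied` group names. The function name `find_duplicate_candidates` and the sample asset IDs are hypothetical.

```python
import json


def find_duplicate_candidates(key_listing_json):
    """Return minion IDs that have both an accepted and a denied key.

    Assumes the JSON layout printed by `salt-key -L --out=json`, where
    "minions" holds accepted keys and "minions_denied" holds denied keys.
    """
    listing = json.loads(key_listing_json)
    accepted = set(listing.get("minions", []))
    denied = set(listing.get("minions_denied", []))
    # An ID present in both groups is a candidate for the duplicate
    # asset ID error.
    return sorted(accepted & denied)


# Illustrative listing; the asset IDs are made up for this example.
sample = json.dumps({
    "minions": ["asset10", "asset11"],
    "minions_pre": [],
    "minions_rejected": [],
    "minions_denied": ["asset10"],
})
print(find_duplicate_candidates(sample))
```

An ID reported by this check only says that a denied key exists alongside an accepted one. It does not say which of the two situations caused the denial.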
Steps to Rectify
There is a separate procedure for rectifying each of the two situations. If the user is unsure which situation they are in, they should first try the procedure for rectifying a salt-minion with new keys, before moving on to locating multiple minions with the same ID.
Rectifying a Salt-Minion with New Keys
The user needs to drop to the Linux shell and delete the accepted and denied keys with the same asset ID using the t128-salt-key -d <asset-id> command. For example:
If the conductor is in HA configuration then the salt keys need to be deleted from both conductors or the asset will be stuck in the
The salt-minion may reach out again automatically and get accepted by the conductor if the asset ID matches an asset ID in the 128T configuration. If the salt-minion does not reach out, then it needs to be restarted manually from the Linux command line with systemctl restart salt-minion.
If the root cause was indeed a salt-minion that regenerated new keys (situation #1 above) and not two different salt-minions with different keys and the same ID (situation #2 above), then this procedure will resolve the problem. The asset will reconnect properly and no duplicate ID error will appear. If the duplicate asset ID error and the denied and accepted keys reappear, then the user is dealing with multiple salt-minions with the same minion ID.
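When the key deletion is scripted, it helps to guard the asset ID before it reaches the command line, because `salt-key -d` accepts glob patterns and a stray `*` could remove keys for unrelated assets. The sketch below is a hedged illustration only. The helper name `delete_key_command` is hypothetical, the allowed character set is an assumption about how asset IDs are named in a given deployment, and `-y` is the standard `salt-key` flag for skipping the confirmation prompt, assumed to behave the same under `t128-salt-key`.

```python
import re

# Assumption: asset IDs contain only letters, digits, dots, dashes and
# underscores. Anything else (notably glob characters) is refused.
_SAFE_ID = re.compile(r"[A-Za-z0-9._-]+")


def delete_key_command(asset_id):
    """Build the argv that deletes the keys stored under one asset ID."""
    if not _SAFE_ID.fullmatch(asset_id):
        raise ValueError(f"refusing unsafe asset ID: {asset_id!r}")
    # -d deletes the keys for the ID; -y answers the confirmation prompt.
    return ["t128-salt-key", "-d", asset_id, "-y"]


print(" ".join(delete_key_command("asset10")))
```

The resulting command would be run as root on the conductor, and on both conductors in an HA deployment.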
Rectifying Multiple Salt-Minions with the Same Minion ID
If the user tried the procedure from the previous section and the duplicate ID error and denied keys have reappeared, then the user is dealing with multiple salt-minions with the same ID. Unfortunately, the only course of action in this case is to track down the system with the improper ID. After tracking down the system, the user needs to change the ID located at /etc/salt/minion_id and restart the salt-minion with systemctl restart salt-minion. The denied key associated with the old salt-minion ID will be deleted by the conductor automatically after one minute when the old ID stops trying to authenticate.
Tracking down the asset with the duplicate ID is not always easy. First, since the salt-master automatically denies the authentication attempt from this bad actor, the conductor has no insight into the IP address or any other information about this asset. Second, the user does not know whether the key that is currently denied belongs to the asset with the incorrect ID or to the correct one. Whichever system tries to authenticate first will get accepted and the other system will get denied.
The user should start by validating that the currently connected router is the correct one. The user can connect directly to the router's PCLI via the conductor's PCLI by using the connect command. This command allows the user to log in to the router's PCLI directly. The user can then run any PCLI command that helps validate the currently connected router. For example, connecting to the remote router and checking the network interface IP addresses:
The user can also use the salt command line from the conductor's Linux shell to try to retrieve information from the asset that has been accepted, to determine whether it is the bad actor or the correct asset. For example, running the salt command grains.items will return a large amount of information about the system, while the command grains.get can return specific pieces of information about the system. This might give the user a clue as to whether the currently connected system has the correct minion ID. In the example below, the hostname grain retrieved was asset10, and if the customer is matching asset IDs to hostnames, then this may be a clue that this system is causing the duplicate ID error because the minion ID should be
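Where asset IDs are deliberately matched to hostnames, the comparison can be automated. This is a rough Python sketch under several assumptions: the hostname is read from Salt's `host` grain, the result was captured with something like `t128-salt '<asset-id>' grains.get host --out=json`, and that output is a JSON object mapping the minion ID to the grain value. The function name `hostname_mismatches` and the ID `asset07` are hypothetical.

```python
import json


def hostname_mismatches(grain_json):
    """Return {minion_id: hostname} where the hostname differs from the ID.

    Only meaningful when the deployment names hosts after their asset IDs.
    """
    result = json.loads(grain_json)
    return {mid: host for mid, host in result.items() if host != mid}


# Hypothetical reply: the minion answering as "asset07" reports the
# hostname "asset10", which would make it a suspect.
print(hostname_mismatches('{"asset07": "asset10"}'))
```

A mismatch is only a hint. A system whose hostname was never aligned with its asset ID would be flagged as well.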
Unfortunately, the user can only query information about the system whose key is accepted. If the user has validated that the currently connected system is the correct system, then they could try the following procedure to locate the bad actor:
- Log in to the correct system's Linux shell directly and stop the salt-minion with systemctl stop salt-minion
  - Only do this if the user can maintain connectivity to this system so they can start the salt-minion again after this situation is resolved
- Delete both the accepted and denied salt keys on the conductor(s) with t128-salt-key, as stated above
- Wait for the bad actor to connect successfully via salt
- Run t128-salt '<asset-id>' file.write /etc/salt/minion_id "<new-asset-id>" to remotely update the salt-minion ID file, then restart the salt-minion remotely with t128-salt '<asset-id>' service.restart salt-minion
- Delete the accepted salt key from the conductor(s)
- Start the salt-minion on the correct system
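The remote rename step in the procedure above involves two commands that must run in order against the bad actor while it is the accepted minion. As a small sketch, assuming the commands are assembled in Python before being run as root on the conductor, the pair can be built like this. The helper name `remote_rename_commands` and the sample IDs are hypothetical; the `t128-salt` invocations themselves are the ones given in the procedure.

```python
import shlex


def remote_rename_commands(asset_id, new_asset_id):
    """Return the two argv lists for the remote rename, in execution order."""
    if asset_id == new_asset_id:
        raise ValueError("the new asset ID must differ from the duplicate ID")
    return [
        # 1. Overwrite the minion ID file on the accepted (bad actor) system.
        ["t128-salt", asset_id, "file.write", "/etc/salt/minion_id", new_asset_id],
        # 2. Restart the minion so it comes back under the new ID.
        ["t128-salt", asset_id, "service.restart", "salt-minion"],
    ]


for argv in remote_rename_commands("asset10", "asset10-renamed"):
    print(shlex.join(argv))
```

Keeping the commands as argument lists avoids shell quoting mistakes around the asset IDs. `shlex.join` is used here only to print them in a copy-pasteable form.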
If this procedure does not work then the correct course of action is to deny both duplicate salt keys until the systems have been properly authenticated. Once the correct system is found, regenerate a new minion ID and keys for the authentic minion and keep the bad actor system's key denied until it can be claimed.