Basic Approaches to Troubleshooting
This section discusses the following aspects of troubleshooting:
- Troubleshooting Process
- Identify the Symptoms
- Isolate the Cause
- Take Corrective Action
- Evaluate the Solution
For a troubleshooting example, see Example: Strategy for Isolating a Broken Network Connection.
Troubleshooting Process
Troubleshooting can be simplified by following standard procedures, as illustrated by Figure 2. Standard troubleshooting procedures include the following steps:
1. Identify symptoms. A symptom is any unwanted result or behavior. A problem or failure might exhibit one or more symptoms.
2. Isolate the cause of the symptom.
3. Take action to correct the problem.
4. Evaluate the system to see whether the original problem is solved and to verify that the changes you made have not introduced new problems.
5. At this point, either you have solved the problem, or you must return to Step 1, 2, or 3 to identify the symptoms more clearly, isolate additional possible causes, or take additional action to correct the problem.
During the troubleshooting process, you might isolate causes that require additional troubleshooting before you can continue with the standard process.
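The cycle described above can be sketched as a small loop. This is only an illustration, not a tool shipped with the software: the function name `troubleshoot` and the three callbacks (`identify`, `isolate`, `correct`) are hypothetical placeholders for whatever observation, diagnosis, and repair procedures you actually use.

```python
from typing import Callable, Iterable, List


def troubleshoot(
    identify: Callable[[], List[str]],
    isolate: Callable[[List[str]], Iterable[str]],
    correct: Callable[[str], None],
    max_rounds: int = 5,
) -> bool:
    """Run the identify / isolate / correct / evaluate cycle.

    Returns True once no symptoms remain, or False if max_rounds
    passes through the cycle did not clear them.
    All three callbacks are hypothetical and supplied by the caller.
    """
    for _ in range(max_rounds):
        symptoms = identify()            # identify the symptoms
        if not symptoms:                 # nothing wrong (or nothing new): done
            return True
        for cause in isolate(symptoms):  # isolate candidate causes
            correct(cause)               # take corrective action
            if not identify():           # evaluate after every single change
                return True
    return False
```

Re-evaluating after each individual action, rather than after a batch of them, makes it easier to tell which change had which effect.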
![]()
Identify the Symptoms
Identifying symptoms requires careful observation. The best preparation for troubleshooting is knowing your network thoroughly before a problem occurs so that you have a baseline state from which to work. If you understand how the network functions under normal conditions, it is easier to distinguish between normal and abnormal activity.
Sometimes a problem is related to another condition that must be solved first. When identifying symptoms, record as many parameters as you can regarding the offending state. The more information you have, the easier it is to isolate the cause. If you find a set of symptoms, try to decide what they have in common. It is likely that they are related, and noticing as many symptoms as you can provides you with more information as you proceed.
It is also useful to record what changes have taken place since the system was last functioning correctly. Changes in activity are likely to be related to changes in configuration.
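One way to make "what has changed since the system last worked" concrete is to keep a text snapshot of the known-good state and compare it with the current one. The sketch below uses Python's standard `difflib` module. The function name `config_changes` is an assumption, and the snapshots are assumed to be plain text that you have saved yourself by whatever means is available.

```python
import difflib
from typing import List


def config_changes(baseline: str, current: str) -> List[str]:
    """Return the lines removed ("- ") or added ("+ ") since the baseline.

    Both arguments are plain-text snapshots; how they were captured is
    outside the scope of this sketch.
    """
    diff = difflib.ndiff(baseline.splitlines(), current.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? ");
    # keep only real removals and additions.
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

An empty result means the two snapshots are identical, which suggests looking for a cause outside the recorded configuration.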
Isolate the Cause
A particular symptom can be the result of one or more causes. Successful troubleshooting requires narrowing the focus to find each individual cause for unwanted behavior. While you might find a solution by just trying a variety of actions, you reach the intended solution more quickly if you systematically approach the problem.
There are several useful methods for isolating a problem:
- Retrace your steps—Try to return to a state that existed before the problem appeared. When the network is in a known state, take small steps forward, watching carefully for the recurrence of symptoms.
- Divide the problem into its smallest unit—Cut the problem in half and test each half. If only one half continues to have the problem, cut it in half again or compare it to the valid half to see how it is different. You might find the solution in the difference.
- Identify which functions are working correctly—Do not waste time investigating functions that are not broken.
- Keep careful records of changes and effects—Ask questions and document changes as you work on a system.
- Notice how various symptoms might be related—If you are finding unexpected or unwanted results in more than one area, try to discover what those areas have in common and what variables would affect them. You probably will find the source for the problem in the common areas.
- Imagine what type of errors or failures could lead to the particular symptom—Test for the errors or failures to see if they are actually occurring.
- Do not try to solve multiple unrelated problems simultaneously—If multiple symptoms occur that do not appear to be related, select one symptom or set of symptoms and focus on it. However, do not completely ignore the other symptoms, because you might discover that they are related after all.
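The "cut the problem in half" method can be illustrated with a bisection over an ordered list of changes made since the last known-good state. This is a minimal sketch under stated assumptions: the system was healthy before any of the changes, is unhealthy with all of them applied, and stays unhealthy once the offending change is in place. The names `first_bad_change` and `is_healthy` are hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad_change(
    changes: Sequence[T],
    is_healthy: Callable[[Sequence[T]], bool],
) -> T:
    """Return the first change whose application makes the system unhealthy.

    is_healthy(applied) is a caller-supplied (hypothetical) test that reports
    whether the system works with only the changes in `applied` in place.
    """
    if not changes:
        raise ValueError("no changes to search")
    # Invariant: healthy with changes[:lo], unhealthy with changes[:hi].
    lo, hi = 0, len(changes)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_healthy(changes[:mid]):
            lo = mid   # the problem was introduced later
        else:
            hi = mid   # the problem is already present
    return changes[hi - 1]
```

Each test halves the remaining range, so the number of tests grows with the logarithm of the number of changes rather than with the number itself.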
Several useful tools exist for isolating the cause of a problem, including network analyzer traces, core dumps, serial line traces, stack dumps, and the output from various show commands in the CLI. For information about the show commands, see the chapters in this manual that describe the JUNOS monitoring commands.
Take Corrective Action
The action required depends on the type of problem you have isolated. As you troubleshoot, keep in mind the following principles:
- Document each step you take.
- Use the various CLI show commands to verify which behaviors change with each action you take.
- When you are considering several possible actions, you can choose to test the easiest first, thereby eliminating possibilities quickly, or you can choose the action that appears most likely to solve the problem, even if it is more time-consuming or difficult to perform.
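The two ways of ordering candidate actions (easiest first or most likely first) can be expressed as two sort orders over the same list. The sketch below is purely illustrative: the `Action` class, its `effort` and `likelihood` fields, and their scales are assumptions, and the estimates themselves would come from your own judgment.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Action:
    name: str
    effort: int        # relative cost; lower is easier (hypothetical scale)
    likelihood: float  # estimated chance of fixing the problem, 0.0 to 1.0


def easiest_first(actions: Iterable[Action]) -> List[Action]:
    """Order actions so that cheap tests eliminate possibilities quickly."""
    return sorted(actions, key=lambda a: a.effort)


def most_likely_first(actions: Iterable[Action]) -> List[Action]:
    """Order actions so that the most probable fix is tried first."""
    return sorted(actions, key=lambda a: a.likelihood, reverse=True)
```

Whichever order you choose, record each action and its observed effect before moving on to the next one.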
Evaluate the Solution
Carefully test the solution to ensure that it does not introduce new symptoms. If new symptoms occur, start the troubleshooting process again, carefully documenting the changes you make in the process.
Example: Strategy for Isolating a Broken Network Connection
To illustrate the troubleshooting process, we examine a problem that appears to include a broken network connection. By applying the strategy listed below and shown in Figure 3, you can usually isolate the failed node:
- Identify symptom: Failure to reach remote host.
- Isolate causes: Several possible causes are identified, including:
  - Local router is misconfigured.
  - Intermediate router is misconfigured.
  - Remote router is misconfigured.
  - No path to the remote router in the local routing table.
- Take corrective action for each possible cause:
  - Check the local router's configuration and edit if appropriate.
  - Troubleshoot intermediate router.
  - Check the remote host configuration and edit if appropriate.
  - Troubleshoot routing protocols.
  - Identify additional possible causes.
- Evaluate solution: If the problem is solved, you are done. If the problem remains or a new problem is identified, start the process over again.
You can address possible causes in any order. In Figure 3, we chose to work from the local router toward the remote router, but you might start at a different point, particularly if you had reason to believe the problem was related to a known issue, such as a recent change in configuration.
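Working from the local router toward the remote end amounts to probing each node on the path in order and stopping at the first one that does not respond. The following sketch assumes you can supply a `reachable` function, for example a wrapper around a ping test. That function, the name `first_unreachable`, and the node names in the usage example are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def first_unreachable(
    path: Sequence[str],
    reachable: Callable[[str], bool],
) -> Optional[str]:
    """Walk the path from the local end toward the remote end.

    Returns the first node that fails the caller-supplied probe,
    or None if every node on the path responds.
    """
    for node in path:
        if not reachable(node):
            return node
    return None


# Usage with a stand-in probe: only the first two nodes respond.
path = ["local-router", "intermediate-router", "remote-router", "remote-host"]
responding = {"local-router", "intermediate-router"}
failed = first_unreachable(path, lambda node: node in responding)
```

To start from a different point, as the text suggests when a recent change makes one node suspect, pass the path in a different order or begin with the suspect node.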
Often, troubleshooting one symptom will uncover other symptoms. Figure 3 shows two possible causes that might involve additional troubleshooting.
![]()