My previous post discussed generic topics for troubleshooting. This post builds on that idea, focusing on the act of problem isolation and simplification.
Occasionally at work we’ll get reports of call failures, and when we look at them we’ll see (relatively) complicated call flows, such as a call that goes to an auto-attendant, then to one extension, and is then transferred to another, where it encounters a failure like “cannot retrieve a call from hold”. Unfortunately, a report like this has probably not been reduced to its simplest form.
Now, if this is the ONLY way that the failure can be reproduced, then that call flow is perfectly acceptable. If, on the other hand, the same failure can be reproduced with a plain call from one extension to another that cannot be retrieved from hold, then that is the example to look at. There are simply fewer variables to contend with. If someone has to analyze the trace logs, there is less data to dig through. Reducing the problem to the simplest scenario in which it occurs will reduce the work necessary to reach resolution, and less work usually means less time.
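For readers who think in code, the reduction described above can be sketched as a small loop: drop one step of the call flow at a time, and keep the shorter flow whenever the failure still shows up. This is only an illustration under assumptions. The names `simplify_scenario`, `still_fails`, and `fake_still_fails` are hypothetical, and in practice `still_fails` would be a person (or a test rig) actually placing the call and reporting whether the failure occurred.

```python
from typing import Callable, List


def simplify_scenario(
    steps: List[str],
    still_fails: Callable[[List[str]], bool],
) -> List[str]:
    """Greedily remove steps while the failure still reproduces.

    `still_fails` is a hypothetical check: it returns True when the
    failure is observed for the given list of steps.
    """
    if not still_fails(steps):
        raise ValueError("the full scenario does not reproduce the failure")

    current = list(steps)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            # Try the scenario with step i left out.
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                # The step was not needed; keep the shorter scenario.
                current = candidate
                changed = True
                break
    return current


# Stand-in for a real test: here the failure is assumed to need only
# a hold followed by a retrieve, regardless of how the call got there.
def fake_still_fails(steps: List[str]) -> bool:
    return "hold" in steps and "retrieve" in steps


full_flow = [
    "auto-attendant",
    "ring extension A",
    "transfer to extension B",
    "hold",
    "retrieve",
]
minimal = simplify_scenario(full_flow, fake_still_fails)
print(minimal)
```

With the stand-in check, the five-step flow shrinks to just the hold and the retrieve. A real failure may depend on step order or timing, in which case the loop simply stops removing steps earlier and the longer flow stays as the example to look at.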
The reverse of this would be knowing that there is an intermittent problem with retrieving calls from hold, testing a call from extension to extension without the problem occurring, and then never looking into the actual scenario where it does happen. Observation and trial and error should be applied to reach a more detailed description of the problem. Sometimes the “critical step” needed to reproduce the problem is elusive and may not be found this way. That is understandable. It doesn’t mean that an attempt to isolate the problem should not be made.