THE SIX RULES OF AUDIO NETWORK TROUBLESHOOTING
Manufacturers of audio networking equipment often promise a ‘plug and play’, hassle-free user experience, with a ‘sky’s the limit’ channel capacity. If you keep things simple, in most cases it’s true. Of course, once you make a system more complex - for example by combining multiple brands and product types in a large distributed system - problems become more likely. Still, when systems are designed with care and enough forethought, it’s entirely possible to make things work - but the paradigm changes from ‘plug and play’ to ‘think, plug and play’.
Sometimes, the ‘think’ part of the paradigm suffers for some reason - for example because two companies designed their own subsystems and then connected them on the job, without having discussed or tested what could happen. Or because last-minute changes have introduced new components on the network. Or because firmware management for some products has lagged… and so forth.
If a problem arises then, of course, it needs to be solved. In many cases this has to happen on the spot - for example at a concert with thousands of concertgoers impatiently waiting for the band to kick off. As a major manufacturer of networked audio equipment, we have considerable experience in troubleshooting. Over the years we have developed a guideline of six troubleshooting rules, which we’d like to share with you.
Rule 1: Don’t panic.
Almost all systems that have a problem can be fixed. Some systems have servicing and monitoring functions built in to make troubleshooting easy; others don’t, and may have to be scaled down to less functionality to get them working. Nevertheless, in our experience, we can always make it work. The important thing is not to panic. Experience in troubleshooting helps you stay calm, so it may be worthwhile to practise: set up a system in the office, ask someone to sabotage it and see if you can fix it.
Rule 2: Make a plan.
‘Let’s try anything’ will cost you a lot of time and probably not get you anywhere. Instead, successful troubleshooting sessions start with analysis. First listen to the person who reported the problem, then investigate by yourself to define the problem clearly and formally. After the problem has been defined, you can focus on thinking about possible causes, testing them, finding the actual cause and then working out a solution to fix it. It’s a simple workflow, but easily forgotten in the heat of the moment.
Rule 3: Don’t trust anyone.
Assume that, at some point in time, the system was delivered to the user in working order. Over time something inside the system - either a hardware malfunction or a design flaw - might cause a problem. But, in our experience, in the vast majority of cases the problem is caused by a change made to the system by a human after delivery… a setting changed, a cable plugged into a wrong socket, a piece of hardware exchanged, added or removed… things like that. In most cases the person who made the change is not aware that he or she caused a problem, so don’t rely too much on what users tell you. Trust no one and check everything yourself.
Rule 4: One by one.
Testing tends to work best if you test only one hypothesis at a time, because each test then gives reliable information about one possible cause. When rushed, it’s tempting to test multiple things at the same time but, because causes interact, this mostly works against you. Also, if multiple people are troubleshooting the same system, their actions may interact as well. So, the rule is to test one possible cause at a time. If multiple engineers or teams are working on the system, make a schedule to access the system one by one, to avoid chaos.
Rule 5: Documentation.
When troubleshooting, it is a huge help to have any form of system documentation: what is connected to what, and what the system is supposed to do. Very often, especially in live touring, documentation is not available, requiring the troubleshooting engineer to assume, guess, investigate and hypothesise. Some engineers enjoy that, but it takes time. It’s always better to have some form of documentation, so ask the user or owner for it.
Rule 6: Start with the obvious.
The majority of the problems in networked audio systems are caused by just a few standard issues. So, based on our experience, it makes sense to start any troubleshooting session with a standard series of six system checks:
- Check all cable connections.
- Check all audio word clock settings.
- Check and list all physical devices in the system and compare the list with the system documentation; any missing or added devices are candidates for problems.
- Check the IP addressing: look for devices with the wrong network number (outside the intended subnet), duplicate fixed addresses and more than one DHCP server on the network.
- Confirm that Energy Efficient Ethernet is disabled on all switches.
- Confirm that all devices on the network have matching firmware.
These checks will probably solve 90% of potential problems - no kidding.
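Some of these checks lend themselves to a small script once you have a device list in hand. The following is a minimal sketch in Python, not a tool we ship: the Device record, the function names and the sample data are all hypothetical, and it assumes you have already collected each device’s name, IP address, model and firmware version by whatever means your system offers. It also assumes that ‘matching firmware’ is judged per product model, which may not fit every system. Cable connections, word clock settings, Energy Efficient Ethernet and a second DHCP server still have to be checked on the equipment itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Device:
    """Hypothetical record for one device found on the network."""
    name: str      # label used in the system documentation
    ip: str        # address currently configured on the device
    firmware: str  # firmware version reported by the device
    model: str     # product type; firmware is compared per model (assumption)


def compare_inventory(documented_names, found_names):
    """Return (missing, added) device names relative to the documentation."""
    documented, found = set(documented_names), set(found_names)
    return sorted(documented - found), sorted(found - documented)


def check_addresses(devices, subnet):
    """Return (names outside the subnet, {ip: names} for duplicated addresses)."""
    net = ip_network(subnet)
    outside = sorted(d.name for d in devices if ip_address(d.ip) not in net)
    by_ip = defaultdict(list)
    for d in devices:
        by_ip[d.ip].append(d.name)
    duplicates = {ip: sorted(n) for ip, n in by_ip.items() if len(n) > 1}
    return outside, duplicates


def firmware_mismatches(devices):
    """Return {model: versions} for models running more than one version."""
    versions = defaultdict(set)
    for d in devices:
        versions[d.model].add(d.firmware)
    return {m: sorted(v) for m, v in versions.items() if len(v) > 1}


if __name__ == "__main__":
    # Sample data, invented for illustration only.
    documented = ["console", "stagebox-1", "stagebox-2", "amp-1"]
    found = [
        Device("console", "192.168.1.10", "4.2", "mixer"),
        Device("stagebox-1", "192.168.1.21", "3.1", "io-rack"),
        Device("stagebox-2", "192.168.1.21", "3.0", "io-rack"),
        Device("laptop", "10.0.0.5", "n/a", "computer"),
    ]
    missing, added = compare_inventory(documented, [d.name for d in found])
    outside, duplicates = check_addresses(found, "192.168.1.0/24")
    print("Missing devices:", missing)
    print("Undocumented devices:", added)
    print("Outside subnet:", outside)
    print("Duplicate addresses:", duplicates)
    print("Firmware mismatches:", firmware_mismatches(found))
```

Each function reports one kind of finding, in keeping with Rule 4: every item it lists is a candidate cause to investigate one at a time, not a verdict.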