A recent story in the NY Times caught my attention. It is about a block in the Cleveland area where some of their wireless car key fobs and garage door openers stopped working. The block is near a NASA research facility, so that was an obvious first suspect. But it wasn’t. The actual source of the problem turned out to be an inventor that was flooding the radio spectrum at the same frequency as the fobs use: 315 Mhz. Once the radio emitter was turned off, the fobs and garage doors started working again. The issue was the inventor’s radio signal was so strong it was preventing anything else from transmitting on that frequency. He had no idea that he was the source of the radio interference.
This story reminds me of an experience that I had back in 1991 or so. At the time, I was the editor-in-chief of Network Computing magazine for what was then called CMP. It was a fun and challenging job, and one day I got a call from one of my readers who was the IT manager for the American Red Cross headquarters in DC. This was Jon Arnold, who spent a long career in IT, sadly dying of a heart attack several years ago. Turns out they had a chapter in Norfolk Va. that was having networking issues. Their office was a small one, of about 25 or so staffers as I recall. Every day their network would start slowing down and then eventually go kaput for several hours. It was at a different time during the day, so it wasn’t the Cleaning Person Problem (I will get to that in a moment). It would come back online sometimes by itself, sometimes with a server reboot. The IT manager asked if I would be willing to lend and hand, and the first person that I thought of to help me was Bill Alderson.
I first met Bill when he was a young engineer for Network General, which made the fabulous Sniffer protocol analyzer. Many of you who are not from that generation may not realize what this tool was or how much of a big deal it was to have a device that could record packet traffic and examine it bit by bit. Today we have open source tools that do the same thing for free, but back then the Sniffer cost four or five figures and came with a great deal of training. Bill cut his teeth on this product and now has his own company, HopZero that has an interesting way to protect your servers by restricting their hop count.
Bill and I first met back in 1989 when I worked at PC Week and we wanted to test the first local area network topologies. We set up three networks, running Ethernet, Token Ring and Arcnet in a networked classroom at UCLA during spring break. All were connected to the same Novell Netware server. Ethernet won the day (as you can see in this copy of the story), and the other topologies died of natural causes. But I digress.
Jon, Bill and I flew to Norfolk and spent a day with the Red Cross staff to try to figure out what was happening with their Novell network. We did all sorts of packet captures that weren’t conclusive. Our first thought was that it had to be something wrong with the server, but we didn’t see anything wrong. Our second thought was more insidious. Being in Norfolk, we were directly down the road from the naval base (you could say that about much of the town, it is a big base). We actually managed to get through to the base commander to find out if their radar was active when they were coming into port. Imagine making that phone call these days in our post-9/11 world? Anyway, the answer we got was negative. Eventually, after hours of shooting down various theories, we figured out the cause of the problem was a wonky network adapter card in someone’s PC. It usually operated just well enough that it didn’t interfere with the network most of the time. Once we replaced the card, everything went swimmingly, and we could put away our tin-foil hats.
Okay, so what is the Cleaning Person Problem? This sounds like folklore, but another reader told me about a problem they had on their network years ago. The reader was periodically disconnected from his network at the same time every night. He was one of the few people online at the time in their office, so it wasn’t like there was high traffic across the network. Eventually, after several evenings he figured out the problem: The cleaning crew was vacuuming the rug in the server room, and the network cable to the server was being run over by the vacuum. Because the cable wasn’t properly crimped and because it was run under the carpet (who knows why this was done), it was shaken just loose enough to disconnect it from the server. When the crew was finished, the cable would operate just fine. Thankfully, they made a better cable and ran it elsewhere where no one could step on it.
The Cleveland folks that had their car fobs disabled actually had it easy: the fault was in a very deliberate emitter that – while initially difficult to trace – was a very binary cause. Their challenge was that not every car fob and garage door was affected. The two scenarios that I mention here were not so cut-and-dried, which made troubleshooting them more difficult. So keep these stories in mind when you are troubleshooting your next computer or networking problem, and don’t be so quick to blame user error. It could be something not as obvious as the odd radio transmission.