I apologize in advance. This will be long, but it has me completely stumped and I want to include as much information as possible in hopes of getting some good suggestions.
The power went out for a short time at one of my client's offices. When the power came back on, nobody was able to access any network resources at all. I determined that the NIC in their domain controller (also acting as DNS and file server) had failed. I replaced that NIC and most of the stations started working again.
This is where it gets odd...
Of the roughly 20 stations that started working after I replaced the server NIC, about half seem to be perfect: they have no issues, and a continuous PING run of 2,000+ iterations didn't lose a single packet. I know that clean PING results don't necessarily mean a truly perfect connection, but for my purposes it was a step in the process of elimination, since the other half are losing between 1 and 3% of their packets. That's not a huge problem, but it is still a concern since those stations used to be perfect, and even a single lost packet can make their office management database program choke. I'll come back to them later since they are working, even if they aren't at 100%.
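For anyone who wants to reproduce the kind of loss figures I'm quoting, a loop like the following Python sketch (assuming Python 3 and the system ping command on the PATH; the server address and iteration count are placeholders, not the client's real values) gives the same percentage-lost number:

```python
import platform
import subprocess

SERVER_IP = "192.168.1.10"   # placeholder; substitute the real server address
COUNT = 2000                 # roughly the 2,000+ iteration runs mentioned above

def ping_once(host: str) -> bool:
    """Send one echo request and report whether a reply came back."""
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", "1000", host]   # 1 echo, ~1 s timeout
    else:
        cmd = ["ping", "-c", "1", "-W", "1", host]      # 1 echo, ~1 s timeout
    return subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL).returncode == 0

lost = sum(1 for _ in range(COUNT) if not ping_once(SERVER_IP))
print(f"{lost}/{COUNT} lost ({100.0 * lost / COUNT:.2f}% loss)")
```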
The big problem is that there are 6 workstations that range anywhere from not working at all (they can't even get an IP address from the DHCP server, and have nearly 100% packet loss with a static IP) to working only very poorly (20%+ PING packet loss). I tried a known-good laptop (perfect connections at other locations) at each of the bad drops and it failed to connect at every one of them, and one of the "bad" workstations worked normally at one of the "good" locations, so it doesn't look like an issue with the workstations themselves.
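As a side note, the "can't even get an IP address from the DHCP server" symptom is easy to confirm from a script, because a Windows machine that never hears back from DHCP usually falls back to a 169.254.x.x self-assigned (APIPA) address. A minimal sketch using only the Python standard library (the function name is mine, and relying on the hostname lookup to enumerate local addresses is an assumption):

```python
import socket

def dhcp_probably_failed() -> bool:
    """Rough check: a 169.254.x.x (APIPA) address usually means DHCP never answered."""
    try:
        # gethostbyname_ex returns (hostname, aliases, list_of_ip_addresses)
        addrs = socket.gethostbyname_ex(socket.gethostname())[2]
    except socket.gaierror:
        return False
    return any(addr.startswith("169.254.") for addr in addrs)

if __name__ == "__main__":
    print("DHCP likely failed (APIPA address found)" if dhcp_probably_failed()
          else "No APIPA address found")
```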
To eliminate the possibility of a bad switch, I took all of the "bad" stations, a few of the "perfect" stations, and the server, and wired them directly into a new switch isolated from the rest of the network. The bad computers stayed bad and the good ones stayed good.
I thought that somehow one of the modules on the AMP NetConnect 48-port patch panel had gone bad, since all six of these stations are wired into the same 6-port module on the panel. However, I rewired two of the bad stations to a different module where some of the "perfect" stations are connected, and they still had the same connectivity problems. I also wired two of the perfect stations into the presumed bad patch module, and those stations continued to work perfectly.
I also replaced one of the wall ports at one of the bad stations just to make sure the port wasn't bad. It didn't have any effect on the connection.
To me, the only thing left is the cable drops inside the walls. Normally I wouldn't consider that a possibility with six different connections failing at the same time, but I can't think of anything else it could be, since it's not the server, not the workstations, not the patch panel, not the switches, not the wall ports, and not the patch cables between the workstations and the wall ports. Sadly, I don't have access to a cable tester/verifier, so the only way I can eliminate the wall cabling is to replace it, and I don't have any spare solid-core cable at the moment. I have already ordered a spool of new cable from Monoprice, but it won't be delivered until Monday, so I wanted to get some ideas from greater minds over the weekend in case the new cable doesn't fix the problem.
I wasn't involved when the office was built, so I don't know how things are arranged inside the walls, but I do know that each cable drop is enclosed in its own 1-inch conduit with nothing else in the conduit except the standard twine for additional/new pulls. Since each workstation is wired separately to the server room in its own conduit, I can't imagine how six different drops could fail at the same time unless the contractors who wired the place made a MAJOR mistake, ran the network conduit directly in contact with exposed or shorted electrical wiring, and that wiring overheated and damaged the LAN cables during the power surge/outage. The main flaw in this theory is that even though the bad stations are wired to the same module on the patch panel, they aren't particularly close together in the building (they are separate offices along one long wall), and the stations that are getting small amounts of packet loss are scattered all over the office.
I haven't pulled any of the cables out of the conduit yet to check for physical damage since I'm waiting on the delivery of new cable from Monoprice. I'm honestly hoping there isn't any obvious damage, since that would mean having a contractor come in and tear the walls apart to get to and correct the problem (the walls are solid wood and plaster and not easily accessible), but I can't think of anything else that could have caused a failure in so many drops at the same time. Also, the fact that many of the still-working stations are now seeing small amounts of PING packet loss where they were all perfect before the power outage makes me wonder if something even more significant is wrong with the wiring. Either that, or I missed something in my troubleshooting, and I'm hoping you can point that out for me.
If you have any ideas, no matter how unlikely, as to what might have caused all of the drops to fail at the same time, or if you think it might be something else causing the problem, I'd appreciate the input since this one has me baffled.
