Troubleshooting a T1 line

Boscoh

Senior member
Jan 23, 2002
501
0
0
We've been having a problem for a while now with one of our T1 circuits. It will drop repeatedly through the day for anywhere from 30 seconds to 15 minutes. The location is way out in BF Egypt and we don't have anyone at the site capable of doing physical troubleshooting. So I've been limited to what I can do remotely, and to calling the telco out to test up to the smartjack.

I have verified the configuration on the 1720 is correct, that the site is not having a power outage during the time the circuit goes down, and have not been able to correlate the bouncing to any type of weather pattern (rain, high wind, etc). When the circuit bounces, I will always get at least a Loss of Frame and Loss of Signal on the router. Every now and then I will get an AIS and/or Remote Alarm. I have never gotten a module access error. I do get lots of carrier and interface resets in the interface stats as well.

Believing that all was well on the router, at least as far as I could tell from a few hundred miles away, I have called the telco out repeatedly. Each time they do an extended test, they find a new "problem", "fix it", and close the case. The first time, they had to replace the T1 pair at their CO. The second time, it was a misconfiguration on their frame switch. The third time, the problem was with their main span, and they moved us to a backup only to move us back to the main when the problem was "fixed". The fourth time (last week), there was a grounding problem with the cable. After this last fix, the circuit is still bouncing. I am not sure what battery of tests they have run. SBC is having to act as the proxy for information from the local telco, and they never have details other than what the tech sent in, which is usually not much.

Tomorrow I plan to go out to the location and do a software and hardware loopback test on the router, and test/replace the cable from the router to their box. The person at the location has said he does not see any evidence of chewing on our cabling that might come from a rat or something inside the location. As far as I know, that is about as much as I can do on my end. If all of that checks out, then the problem lies with the telco, correct?

Am I missing any important steps here? Anything else I should be demanding of the telco? They have already violated their five-nines SLA. They're at about 92% uptime so far this year.
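To put the SLA numbers in perspective, here is a minimal Python sketch of the arithmetic. It assumes a full non-leap year as the measurement period, so the 92% figure (which is year-to-date in the post) is shown annualized. The function name `downtime_hours` is just an illustrative helper.

```python
# Downtime implied by an availability percentage over one non-leap year.
HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed (or incurred) at the given availability."""
    return period_hours * (1 - availability_pct / 100)

five_nines = downtime_hours(99.999)
observed = downtime_hours(92.0)

print(f"99.999% allows about {five_nines * 60:.1f} minutes of downtime per year")
print(f"92% corresponds to about {observed:.0f} hours ({observed / 24:.0f} days) per year")
```

Five nines leaves only a few minutes of outage per year, while 92% works out to weeks, which is a useful comparison to have on hand when escalating with the telco.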

Any words of experience you guys could provide would be appreciated.
 

ScottMac

Moderator, Networking, Elite member
Mar 19, 2001
5,471
2
0
I have a few possibilities:

The cable from the SmartJack to the router / CSU (the "Extended Demarc"). In addition to checking it for condition and quality (no knots, kinks, crushes, stretches, or cable jacket pulled from the connector), verify the length if it is a long run. While the spec for T1 is ~650 ft (extended demarc), if it's being run over Category-rated UTP, the usable length is significantly reduced. The standard for T1 is "premises cable": individually shielded pairs with an overall shielded jacket.

Verify that the router is taking timing from the network (the LINE, the T1), not internal (unless the Telco SPECIFICALLY says you should be generating your own clock). The bouncing may be "slippage" caused by multiple / conflicting clock sources. Clock slips generally take the form of a more regular bounce, ranging from minutes to days or weeks ... depending on the disparity of the clocks. IF you have been directed to gen your own clock, it should only be set on one device of the pair (a single clock source per line).
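As a rough illustration of why the bounce interval depends on the clock disparity, here is a minimal Python sketch. It assumes a simplified model in which a slip occurs each time the two clocks drift apart by one full T1 frame (125 microseconds); real equipment buffers differ, and the offsets in the loop are illustrative values, not measurements from this circuit.

```python
# Simplified model: a controlled slip occurs each time the two clocks
# drift apart by one full T1 frame (125 microseconds).
FRAME_SECONDS = 125e-6  # one T1 frame: 193 bits at 1.544 Mbit/s

def seconds_between_slips(offset_ppm):
    """Approximate time between frame slips for a clock offset in parts per million."""
    if offset_ppm <= 0:
        raise ValueError("offset_ppm must be positive")
    return FRAME_SECONDS / (offset_ppm * 1e-6)

# Illustrative offsets, from a free-running oscillator down to near-reference clocks.
for ppm in (32, 1, 0.001, 0.00002):
    secs = seconds_between_slips(ppm)
    print(f"{ppm:>8} ppm offset -> one slip roughly every {secs:,.0f} s ({secs / 86400:.3f} days)")
```

Under this model a large offset slips every few seconds, while very closely matched clocks slip only every few weeks or months, which matches the "minutes to days or weeks" range described above.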

Also check that the router and SmartJack are well ventilated and clear of obstruction. If either is overheating, that can cause it to shut down and / or reboot (router reboots, not the smartjack).

You may be able to arrange a dispatch (tech onsite) while you are there. Open a ticket, explain that it's a chronic problem; they should be willing to work with you to have the resources in-place to fully test and fix the issue.

If you push the issue as chronic, you may be able to get them to "Class A" test the circuit (essentially replace and verify all of the equipment from end-to-end).

It sounds like it's probably a network issue, but there are frequently problems like this caused by the extended demarc and timing.

Good Luck

Scott
 

Boscoh

Senior member
Jan 23, 2002
501
0
0
The run from the smartjack to the CSU is only 20 ft max. The cable is telco-grade premises cable. I don't have any of that, so if I end up replacing the cable it will be UTP, but I wouldn't expect a problem with a 20 ft run.

We are not generating our own clocking.

The router might be getting hot inside the cabinet that it's in. I'll have to check that out. I know it's not rebooting, because the uptime last week (before I manually did a restart) was a few weeks. I've never had an issue with that router rebooting or shutting down.

I'm on my way out there right now, I'll let you guys know what I find out.

Thanks.
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
sounds like a flaky T1.

problems like these can linger for months until the telco finally rebuilds the circuit on new equipment, new pairs.

It frequently happens in BFE type locations (bum fcked egypt) over old cabling.

If the telco is actually taking errors when testing to the smart jack then it's their problem. Tell them to loop your CSU and run the tests again.

Verify provisioning (linecode, clock, framing) and verify again with the telco. Sometimes they'll tell you it's one thing and the circuit is provisioned as another.
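One way to make that check systematic is to write down both sides and diff them. This is only a sketch in Python: the setting names and values below are hypothetical placeholders, to be filled in from what the telco states and what the router actually shows.

```python
# Hypothetical values: replace with what the telco states and what the router shows.
telco_provisioned = {"linecode": "b8zs", "framing": "esf", "clock_source": "line"}
router_configured = {"linecode": "ami", "framing": "esf", "clock_source": "line"}

def provisioning_mismatches(expected, actual):
    """Return {setting: (expected, actual)} for every setting that differs."""
    keys = sorted(set(expected) | set(actual))
    return {
        k: (expected.get(k), actual.get(k))
        for k in keys
        if str(expected.get(k)).lower() != str(actual.get(k)).lower()
    }

mismatches = provisioning_mismatches(telco_provisioned, router_configured)
for setting, (want, have) in mismatches.items():
    print(f"MISMATCH {setting}: telco says {want}, router has {have}")
```

A setting missing from either side shows up as `None`, so an incomplete answer from the telco is flagged as well.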
 

Boscoh

Senior member
Jan 23, 2002
501
0
0
Alright, well I think I've ruled out our router or cabling as being the cause. Things I did:

1) Checked the cabling from the router to the smartjack - free of defects and pinouts were correct.
2) Hardware loopback test (about 30min worth using extended pings of about 5,000 packets each test with data 0x0000, 0x0001, 0x0101, 0x5555, and 0xffff - all the tests came back clean with no dropped packets and no interface errors)
3) Extended ping while the interface was up instead of looped - clean
4) Verified the router config settings against what the telco provided me
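For anyone wondering why those particular data patterns are used in step 2, here is a small Python sketch that prints the bit layout of each one. It looks only at the repeated 16-bit payload pattern and ignores framing and packet headers, so it is an approximation of what actually goes on the wire. Long runs of zeros are the case that AMI line coding handles poorly and that B8ZS is meant to cover, which is why the mostly-zero patterns are useful stress tests.

```python
# The 16-bit data patterns from the loopback tests above.
PATTERNS = [0x0000, 0x0001, 0x0101, 0x5555, 0xFFFF]

def describe(pattern, repeats=4):
    """Return (16-bit string, ones density, longest zero run) for a repeated pattern."""
    bits = format(pattern, "016b") * repeats
    ones_density = bits.count("1") / len(bits)
    # Splitting on "1" leaves only the runs of zeros between the ones.
    longest_zero_run = max(len(run) for run in bits.split("1"))
    return bits[:16], ones_density, longest_zero_run

for p in PATTERNS:
    bits, density, zero_run = describe(p)
    print(f"0x{p:04x}  {bits}  ones density {density:.3f}  longest zero run {zero_run}")
```

The all-zeros pattern has no ones at all, so its zero run is limited only by how many times the pattern repeats (the `repeats` argument here is an arbitrary choice).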

So I guess I am going to just have to demand that the telco go down the line and start replacing equipment/cabling until they fix this, or we will just take our business elsewhere. IPSec over satellite anyone?
 
Jul 14, 2004
109
0
0
You said "pair", as in singular. Is this "T1" being carried on only one pair? How far are you from the serving central office? If the distance is far and you are on HDSL2, they may consider a redesign to HDSL4.
 

Bob151

Senior member
Apr 13, 2000
857
0
0
Yeah, if they keep finding problems between the SJ, why bother going to the site? Some circuits are just chronic for years.

Squeaky wheel gets the grease, keep calling the telco.