We've been having a problem for a while now with one of our T1 circuits. It drops repeatedly through the day for anywhere from 30 seconds to 15 minutes. The location is way out in BF Egypt and we don't have anyone at the site capable of doing physical troubleshooting. So I've been limited to what I can do remotely, and to calling the telco out to test up to the smartjack.
I have verified the configuration on the 1720 is correct, that the site is not having a power outage during the time the circuit goes down, and have not been able to correlate the bouncing to any type of weather pattern (rain, high wind, etc). When the circuit bounces, I will always get at least a Loss of Frame and Loss of Signal on the router. Every now and then I will get an AIS and/or Remote Alarm. I have never gotten a module access error. I do get lots of carrier and interface resets in the interface stats as well.
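One way to look for a pattern in the bounces is to tally them by hour of day and by duration. The following is only a minimal sketch: the `summarize_outages` helper and the sample timestamps are hypothetical placeholders, not real data from this circuit, and it assumes the down/up times have already been pulled out of the router log by hand.

```python
from collections import Counter
from datetime import datetime


def summarize_outages(events):
    """Tally outages by the hour they started and list their durations.

    events: list of (down_time, up_time) pairs as ISO-8601 strings.
    Returns (Counter of start hour -> count, list of durations in seconds).
    """
    by_hour = Counter()
    durations = []
    for down, up in events:
        start = datetime.fromisoformat(down)
        end = datetime.fromisoformat(up)
        durations.append((end - start).total_seconds())
        by_hour[start.hour] += 1
    return by_hour, durations


# Placeholder timestamps for illustration only (not from the real circuit).
sample = [
    ("2024-01-01T09:15:00", "2024-01-01T09:15:30"),
    ("2024-01-01T09:40:00", "2024-01-01T09:55:00"),
    ("2024-01-01T14:05:00", "2024-01-01T14:07:00"),
]

by_hour, durations = summarize_outages(sample)
for hour, count in sorted(by_hour.items()):
    print(f"{hour:02d}:00  {count} bounce(s)")
print(f"longest outage: {max(durations):.0f} s")
```

If the bounces cluster around particular hours, that is something concrete to hand to the telco along with the alarm types seen on the router.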
Believing that all was well on the router from what I was able to do from a few hundred miles away, I have called the telco out repeatedly. Each time they do an extended test, they find a new "problem", "fix it", and close the case. The first time, they had to replace the T1 pair at their CO; the second time, it was a misconfiguration on their frame switch; the third time, the problem was with their main span, and they moved us to a backup only to move us back to the main once the problem was "fixed"; the fourth time (last week), there was a grounding problem with the cable. After this last fix, the circuit is still bouncing. I am not sure what battery of tests they have run. SBC is having to act as the proxy for information from the local telco, and they never have details other than what the tech sent in, which is usually not much.
Tomorrow I plan to go out to the location and do a software and hardware loopback test on the router, and test/replace the cable from the router to their box. The person at the location has said he does not see any evidence of chewing on our cabling that might come from a rat or something inside the location. As far as I know, that is about as much as I can do on my end. If all that checks out, then the problem lies with the telco, correct?
Am I missing any important steps here? Anything else I should be demanding of the telco? They have already violated their five-nines SLA. They're at about 92% uptime so far this year.
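To put the SLA gap in numbers, here is a rough sketch of the arithmetic. It assumes a 365-day year and, for the 92% figure, that the same rate held over a full year; the function names are just illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def allowed_downtime_minutes(availability, period_hours=HOURS_PER_YEAR):
    """Downtime budget in minutes for a given availability fraction."""
    return (1.0 - availability) * period_hours * 60


def downtime_hours(availability, period_hours=HOURS_PER_YEAR):
    """Downtime in hours implied by a given availability fraction."""
    return (1.0 - availability) * period_hours


five_nines = allowed_downtime_minutes(0.99999)  # about 5.26 minutes per year
observed = downtime_hours(0.92)                 # about 700.8 hours per year

print(f"five-nines budget: {five_nines:.2f} minutes/year")
print(f"92% uptime:        {observed:.1f} hours/year of downtime")
```

In other words, five nines allows a little over five minutes of downtime a year, while 92% works out to roughly a month of downtime over a full year.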
Any words of experience you guys could provide would be appreciated.