Hey guys,
I've recently discovered a strange problem on Comcast's network which I thought you guys might like to chew on. What I'm looking for is some good theories as to what the cause might be. This is an ongoing problem, so if anyone can think of some good tests to run, I can run them.
Summary of the problem:
I first noticed the problem a couple of weeks ago while trying to play a game via a direct IP connection with a friend who is also on the Comcast network. The problem also coincides with an IP change on my friend's node. The new IP he received begins with a different octet than what he used to get, which leads me to believe Comcast has been doing some reallocation of IP space. Running some ping tests shows that I'm getting massive packet loss to my friend, anywhere from 25% to 75% (keep reading, I promise this will get more interesting). We're located about 5 miles apart physically and our IPs are similar. Mine is currently (until the DHCP lease expires) xx.xx.222.138 and his is xx.xx.3.29 (the first 2 octets are the same), both with a /21 subnet mask (255.255.248.0).
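For anyone who wants to check the subnet arithmetic, here is a minimal sketch using Python's standard ipaddress module. The first two octets are elided above, so 10.0 is used purely as a stand-in, and network_of is just a helper name made up for this example.

```python
import ipaddress

def network_of(address, prefix=21):
    """Return the network that `address` falls in for the given prefix length."""
    return ipaddress.ip_interface(f"{address}/{prefix}").network

# 10.0 is a placeholder for the two elided octets, not the real prefix.
mine = network_of("10.0.222.138")
his = network_of("10.0.3.29")

print(mine)         # 10.0.216.0/21
print(his)          # 10.0.0.0/21
print(mine == his)  # False
```

So with a /21 mask the two addresses sit in different subnets, and traffic between them has to cross at least one router.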
Okay, at this point some of you are probably thinking:
1) Connection problem. Either your friend's or your connection is dropping packets.
Nope. Both our connections are fine to other places on the internet (mostly port traffic). I can ping from outside of Comcast's network to our connections and receive 99%-100% response.
2) Must be some overloaded circuit between us
Doubtful. The problem has been there for 2 weeks and it's been there whenever I've had the chance to test it. Highly doubtful that a circuit is overloaded for that period of time between 2 nodes that close together.
I've also done some ping tests to neighboring subnets and found that I can ping anyone in my subnet (xx.xx.216.0/21) without problems. I can also ping a lot of other nodes on different subnets (xx.xx.144.0/21 for example). I'll have to do some more testing to be sure, but it seems that when the third octet gets down into the 50s (maybe 60s) the problem starts to appear. I haven't done a complete mapping of the entire class B, but the fact that the problem is only with certain subnets is probably significant. Could it point to a subnet mask problem?
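To do that mapping more systematically, something like the rough sketch below could work. It assumes a Unix-style ping that accepts -c and -q and prints a "% packet loss" summary line (Windows ping uses different flags and wording). The function names are made up for this example, and the host list has to be filled in with addresses that are known to answer pings.

```python
import ipaddress
import re
import subprocess

# Matches the summary line printed by Linux and BSD/macOS ping.
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")

def parse_loss(ping_output):
    """Pull the loss percentage out of ping's summary, or None if absent."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None

def ping_loss(host, count=20):
    """Run the system ping (Unix-style flags assumed) and return percent loss."""
    proc = subprocess.run(
        ["ping", "-c", str(count), "-q", str(host)],
        capture_output=True,
        text=True,
    )
    return parse_loss(proc.stdout)

def subnet_of(host, prefix=21):
    """Return the /21 (by default) network that a host address belongs to."""
    return ipaddress.ip_interface(f"{host}/{prefix}").network

def sweep(hosts, count=20):
    """Ping each host and group the loss figures by subnet."""
    report = {}
    for host in hosts:
        report.setdefault(subnet_of(host), []).append(ping_loss(host, count))
    return report
```

Calling sweep() with one or two responsive hosts per /21 would give a table of loss by subnet, which should show whether the bad range has a clean boundary.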
I guess the easiest explanation is that there's some misconfiguration in one of the routers between our connections. That's obvious, but what kind of problem could cause these symptoms? If it's a subnet mask or ACL problem, shouldn't all traffic be getting dropped? What's the explanation for the partial drops?
Unfortunately I can't trace the route between our connections as Comcast has blocked it. However, one would assume there can't be more than 1 or 2 hops between us.
Does anyone have any insight into how cable operators set up their networks? Do they use ATM to backhaul the traffic before it is handed off to their IP network? I know Covad does this with their DSL service (or at least they did 2 years ago).
And I guess one thing I should confirm is whether this loss happens with all traffic types. I'm currently assuming it does... I should probably verify it.
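One way to check non-ICMP traffic would be a small UDP echo test: my friend runs the echo side, I run the probe side, and we compare the loss figure against what ping reports. This is only a sketch; the function names, the port, and the probe count are arbitrary choices for the example, and it assumes the chosen UDP port isn't filtered anywhere along the path.

```python
import socket
import threading

def udp_echo_server(sock):
    """Echo every datagram back to its sender until the socket is closed."""
    while True:
        try:
            data, addr = sock.recvfrom(2048)
        except OSError:
            return  # socket was closed
        sock.sendto(data, addr)

def udp_loss(host, port, count=20, timeout=1.0):
    """Send `count` numbered datagrams; return the percent with no echo."""
    received = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for seq in range(count):
            payload = str(seq).encode()
            s.sendto(payload, (host, port))
            try:
                while True:
                    data, _ = s.recvfrom(2048)
                    if data == payload:
                        received += 1
                        break
                    # Anything else is a late echo of an earlier probe; skip it.
            except OSError:
                continue  # timed out (or ICMP error): count this probe as lost
    return 100.0 * (count - received) / count
```

On the far end, bind a UDP socket to an agreed port and pass it to udp_echo_server(); on the near end, call udp_loss() with the friend's address and that port. If UDP loss tracks the ping loss, the drops probably aren't specific to ICMP.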