Gigabit Network

MikeDub83

Member
Apr 6, 2003
96
0
0
I run a Counter-Strike server at my school in Boston (Wentworth). The problem is that the pings are starting to rise and I'm trying my best to deal with them. The average ping (latency in Counter-Strike) is 25 - 46ms. My main suspicion is that it's our network administrators who have recently installed new switches that filter outgoing packets. I want to make sure I'm as fast as possible on my end and was thinking about solving it the Gigabit way. First a little background on the network...

All the dorm buildings are connected via fiber. Five 100Mb switches serve each building, so there are approximately 100 students per 100Mb switch. Each student has one or two 100Mb LAN drops in their room.

In an effort to increase my bandwidth I installed several NICs in my server and wrote a program to load balance across them (for my website). Each NIC was only capable of pushing 7 or 8 megabytes per second apiece. So I set up an experiment from a computer lab and had several computers downloading files. The server was able to push the files at approximately 13.5MB/s!

It seems to me that my bottleneck is the network cards themselves. Therefore, I would like to install a small Gigabit LAN in my dorm room. The theory behind this is that it will be able to saturate the upstream switch with the maximum bandwidth possible. For those concerned about the server specs, check the bottom of the post.

I don't want to spend a ton on this so a managed switch is out of the question. I think I want to go with the Netgear GS104 or GS108. Since the server board can support a 64bit card I was thinking the Netgear GA622T could fit the bill nicely.

Does anyone have any ideas about what I should do differently? Or any performance-related changes I could make to Linux or the NIC drivers? Thanks for the help.

- Mike


-= Server -=
Dual Athlon MPs @ 1.4GHz
Tyan Tiger MP Mobo
256MB RAM, (Soon to be 512MB)
IBM Deskstar 40GB HDD
Red Hat Linux 8
 

Oaf357

Senior member
Sep 2, 2001
956
0
0
The NICs aren't your problem.

Other network traffic on the 100 Mbps feed is probably a big part of it, along with your hardware itself (PCI bus, HDDs, etc.). A gigabit network in your room won't do you any good because once traffic leaves your network it's 100 Mbps.

But if you wanted to decrease your latency to your server (assuming your CS client is on a small LAN in your room) gigabit might help but seems like a hell of a waste for a few milliseconds.

Maybe a SCSI RAID array would help. A new mobo with 66 MHz PCI bus capabilities (and 66MHz NICs) might help. These solutions aren't cheap.

It seems like you're taxing the network as much as possible with your current hardware. But a GigE LAN won't help much when you weigh dollars against throughput.
 

alrox

Member
Nov 17, 2002
175
0
0
High pings are usually caused by WAN links and the highly variable loads placed on them, not by your local switched network, which is probably not stressed to begin with. Each Half-Life player will use about 56 kbit/s, which won't strain a 100 Mbit uplink at all, even if your server is full. Plus, the switch on the campus backbone that you are plugging into is 100 Mbit as well, so your GigE NIC would negotiate down to 100/full like your current one.

Also, your little 'load balancing' experiment was showing off how windows will allow smb shares to be accessed across different nics, not for all IP activity. Your computer will only use ONE nic to pass packets to the default gateway.

Traceroute will be helpful in identifying where exactly on the internet things are slowing down.

*edit*
Reread the post: the 'load balancing' was done via HTTP with a program you wrote. You might want to expand on this a little more.


 

MikeDub83

Member
Apr 6, 2003
96
0
0
Also, your little 'load balancing' experiment was showing off how windows will allow smb shares to be accessed across different nics, not for all IP activity. Your computer will only use ONE nic to pass packets to the default gateway.

The load balancing was a perl script that rotated IPs and forwarded the user to a different IP. The files were uploaded via Apache. I would call that 'load balancing'.

I considered upgrading the hard disk system, but I don't think that would matter. The 'load balancing' experiment showed the server could pull large files from disk and push it pretty quickly. The half life engine should be in memory, so hard drive hits would only happen during map changes.

How much would going to gigabit drop latency? You said a couple milliseconds. Is that true?
 

alrox

Member
Nov 17, 2002
175
0
0
What does your CPU utilization look like when the server is full and ping times start to rise? Gigabit would not drop latency for HL at all, even if your backbone network was all GigE (which it isn't).

Are just people on the university network playing, or is it internet people as well? If you have CPU usage to spare, then I'm not sure what's causing the server to bog down.

*edit*

Online games are designed to use very little bandwidth in the first place. 1 player=56kilobits/second maximum, in both directions.
For a full 32-player server, that works out to 1792 kbit/s up and 1792 kbit/s down. Barely anything for a full duplex 100 Mbit connection.
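
The arithmetic above can be checked with a short sketch. The 56 kbit/s per-player figure, the 32-player server, and the 100 Mbit link are taken from this thread; the variable names are just illustrative.

```python
# Rough per-direction bandwidth estimate for a full game server.
# Figures come from the post above; the names are illustrative.
PER_PLAYER_KBPS = 56        # kilobits/second per player (stated maximum)
PLAYERS = 32                # full server
LINK_KBPS = 100 * 1000      # 100 Mbit/s full duplex link, per direction

total_kbps = PER_PLAYER_KBPS * PLAYERS
share = total_kbps / LINK_KBPS

print(f"{total_kbps} kbit/s per direction, {share:.1%} of the link")
# prints "1792 kbit/s per direction, 1.8% of the link"
```

So even at the stated maximum, game traffic would use under 2% of the link in each direction.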
 

Oaf357

Senior member
Sep 2, 2001
956
0
0
Originally posted by: MikeDub83
Also, your little 'load balancing' experiment was showing off how windows will allow smb shares to be accessed across different nics, not for all IP activity. Your computer will only use ONE nic to pass packets to the default gateway.

The load balancing was a perl script that rotated IPs and forwarded the user to a different IP. The files were uploaded via Apache. I would call that 'load balancing'.

I considered upgrading the hard disk system, but I don't think that would matter. The 'load balancing' experiment showed the server could pull large files from disk and push it pretty quickly. The half life engine should be in memory, so hard drive hits would only happen during map changes.

How much would going to gigabit drop latency? You said a couple milliseconds. Is that true?

A couple milliseconds, locally. Your locally originating and locally terminating (meaning in your gigabit LAN) traffic would move significantly faster. If you have a lot of traffic moving about your "room LAN" then it might help a tad with ping times but like the earlier poster mentioned ping times are highly dependent upon upstream Internet connectivity.

I am not recommending you move to gigabit. But the incredibly minor improvements (not worth the expense) are somewhat possible.
 

Oaf357

Senior member
Sep 2, 2001
956
0
0
By the way. I'd be interested in seeing this script in action. PM me with your web address, please.
 

MikeDub83

Member
Apr 6, 2003
96
0
0
I think the major problem might be our network admins...
Last Fall we had a 20 player game going playing Bloodstrike (one of the most CPU taxing maps because of bullet calculation), everyone had a ping of under 10. Over winter break the admins installed new switches that would allow them to block ports and in some cases even bandwidth shape your connection. They claim they are not shaping connections but I have my doubts.

One of the biggest problems is that approximately every 2 seconds there is a massive spike in latency. This makes players' movement look choppy. Any ideas on what the admins could have done?
 

alrox

Member
Nov 17, 2002
175
0
0
If those new routers are configured correctly with nothing stupid like a duplex mismatch, they might be doing some qos.

It's possible they're giving a lower priority to UDP traffic(hl, plus 99% of other online games). For instance, if someone made a TCP connection(web site hit, whatever), your UDP packet takes a back seat.
 

Oaf357

Senior member
Sep 2, 2001
956
0
0
If shaping is possible then chances are they're doing it. That is probably your problem.
 

MikeDub83

Member
Apr 6, 2003
96
0
0
By the way. I'd be interested in seeing this script in action. PM me with your web address, please.

Sorry, I don't have it anymore. It was real simple though. The three IPs were in an array. When someone hit the program it would read a small file that stored the last used IP. Then the script would use the next IP in the array and do a HTTP forward with the new IP in the address. Then the script would store the last used IP address to the file. Apache in its default configuration responds to every IP address. Pretty simple but worked like a charm ;o)
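
For anyone who wants to try the same idea, here is a minimal sketch of that rotation. It is in Python rather than the original Perl, and it is a reconstruction from the description above, not the original script. The IP addresses, the state-file name, and the function names are all hypothetical.

```python
# Sketch of the rotation described above: remember the last IP handed out,
# pick the next one, and redirect the client to it.
# The IPs, state-file name, and function names are hypothetical.
from pathlib import Path

SERVER_IPS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]  # placeholder addresses
STATE_FILE = Path("last_ip.txt")

def next_ip(ips=SERVER_IPS, state_file=STATE_FILE):
    """Return the IP after the last one used and record it in the state file."""
    last = state_file.read_text().strip() if state_file.exists() else None
    # Start from the first IP if there is no usable state yet.
    index = (ips.index(last) + 1) % len(ips) if last in ips else 0
    state_file.write_text(ips[index])
    return ips[index]

def redirect_response(path, ip):
    """Build a minimal CGI-style HTTP redirect to the chosen IP."""
    return f"Status: 302 Found\r\nLocation: http://{ip}{path}\r\n\r\n"
```

Note that this sketch does no file locking, so two requests arriving at the same moment could be handed the same IP. For a rough spread of downloads across NICs that probably doesn't matter much.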
 

Oaf357

Senior member
Sep 2, 2001
956
0
0
Originally posted by: alrox
If those new routers are configured correctly with nothing stupid like a duplex mismatch, they might be doing some qos.

It's possible they're giving a lower priority to UDP traffic(hl, plus 99% of other online games). For instance, if someone made a TCP connection(web site hit, whatever), your UDP packet takes a back seat.

This is also highly likely. But UDP traffic isn't usually put at the bottom of the stack (thanks to DNS).
 

Oaf357

Senior member
Sep 2, 2001
956
0
0
Originally posted by: MikeDub83
By the way. I'd be interested in seeing this script in action. PM me with your web address, please.

Sorry, I don't have it anymore. It was real simple though. The three IPs were in an array. When someone hit the program it would read a small file that stored the last used IP. Then the script would use the next IP in the array and do a HTTP forward with the new IP in the address. Then the script would store the last used IP address to the file. Apache in its default configuration responds to every IP address. Pretty simple but worked like a charm ;o)

Nice.
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
One of the biggest problems is that approximately every 2 seconds there is a massive spike in latency

Is it measured every two seconds? If so then that sounds like spanning-tree. It could just be a poorly designed network and you've got a ton of broadcasts to deal with.
 

MikeDub83

Member
Apr 6, 2003
96
0
0
What exactly is spanning-tree?

Broadcasts used to be a problem, particularly with Windows file services. Therefore, most broadcasts have been stopped at the source by the new switches.
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
Actually, broadcasts are your biggest concern in switched networks. With hundreds of nodes on the same network and Windows machines, you can easily bog a network down.

Spanning-tree is a loop detection algorithm used by almost all switches/bridges. Every port on a bridge/switch sends out BPDUs (bridge protocol data units) every two seconds by default. Another possibility is that they have an unstable bridge that is sending out topology change notifications and causing the switch to recalculate its paths.

Try taking the game out of it: ping your DNS server or default gateway a few thousand times and see what the responses are like. I do know that the latency shown in Counter-Strike has a lot more to do with server-side processing than actual network latency.
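
As a rough sketch of how to boil a few thousand ping results down to something comparable, the function below reports loss percentage and min/avg/max round-trip time. It assumes you have already collected the samples as a list with one entry per probe and `None` for probes that got no reply; the function name and that input format are illustrative assumptions, not any tool's output format.

```python
# Summarise a run of ping samples: loss percentage plus min/avg/max RTT.
# rtts_ms holds one entry per probe; None marks a probe with no reply.
# The function name and input format are illustrative assumptions.
def summarize_pings(rtts_ms):
    replies = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(replies)
    summary = {
        "sent": len(rtts_ms),
        "loss_pct": 100.0 * lost / len(rtts_ms) if rtts_ms else 0.0,
    }
    if replies:
        summary.update(
            min_ms=min(replies),
            avg_ms=sum(replies) / len(replies),
            max_ms=max(replies),
        )
    return summary

# Made-up samples, not measurements from this thread.
print(summarize_pings([0.4, 0.5, None, 0.4, 850.0, 0.6]))
```

Running the same summary against the gateway and against a machine on the same switch makes it easier to see where the loss and the spikes start.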

Also if there is a router between you and the server there could be a small amount of latency involved.

But really all I'm saying is let's get the symptoms nailed down first.
 

Garion

Platinum Member
Apr 23, 2001
2,331
7
81
OK, I'll throw my two cents in here. Several comments.

Going gigabit isn't going to help you much. Think of it this way: your 100BaseT connection to the outside world is a garden hose. Connecting a fire hose to it from your room isn't going to make it any fatter. If your local PC can't push the 100BaseT hard enough, it won't push a gig link, either. In any case, I strongly doubt that your problem lies in your connection to the network. Counter-Strike isn't about raw bandwidth, it's more about latency. Gigabit just gets you more bandwidth and would have almost zero effect on latency.

Go out and grab Ping Plotter. It's a great utility for troubleshooting this kind of thing. It does a traceroute to the destination, then pings each of the hops between you and the destination every few seconds so you can look at latency across the various links.

FYI, I agree with Spidey - Good money says you've got a problem on the LAN. Some rocket scientist probably plugged in both 100BaseT ports in their room to the same switch, thinking it would double their uplink. Most switches can be configured to guard against this, but who knows what your sysadmins did to them.

- G
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
FYI Garion,

Yeah, I've seen some of this. The little SOHO switches do NOT run spanning-tree, so if you plug two ports of a switch into one of the Linksys switches then you have a port-to-port loop. It can cause some problems, but if the switch is configured to guard against this then no biggie. But in my experience most people don't know how to set up a switched network.

-edit- see signature. :)
 

Garion

Platinum Member
Apr 23, 2001
2,331
7
81
Yeah, it's a bad situation. Too many people enable the "host port" settings (no spanning tree/FEC/trunk detection) without turning on the BPDU guard features. Been there, done that, learned that lesson the hard way. :eek:

- G
 

MikeDub83

Member
Apr 6, 2003
96
0
0
Thanks for all the great information.

I tried Ping Plotter and came up with some very interesting results. I traced an IP that is located in a different dorm. The gateway has an average response of 15ms with a minimum of 0 and a max of 848ms. It also has a packet loss percentage of 7%. I tested every 2.5s for 1 hour. Is this bad?
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
That's bad. Sounds like some serious LAN issues.

Next question - were you pinging an IP address that was in the same IP subnet as you?
Obvious question/suggestion - change the network cable to a known working good one and try to see if it is your port or your computer or your cable.

On a LAN you should have zero packet loss and latency under 10 ms (normally 1 ms)

-edit- Reread your post. Sounds like you were pinging another machine on a different subnet. Try the same test with a machine on your floor. If it happens on your floor on the same IP subnet then there is some kind of layer2/switch problem or layer1/physical cabling problem. If you get flawless results on your same LAN then your university has a router/trunk problem.
 

MikeDub83

Member
Apr 6, 2003
96
0
0
Just got an email, the network is going down tomorrow morning for "routine maintenance."

Yes, the machine was in a different subnet (172.21.8.251). My IP address is 172.24.5.73, the gateway's address is 172.24.0.1.

The numbers I supplied were for the gateway's response. The percentage loss from 172.21.8.251 was 14%.

It looks like the other subnet is suffering from about the same amount of packet loss to its gateway as well.

When I ping machines that are in my subnet their response time is always 0ms.
 

spidey07

No Lifer
Aug 4, 2000
65,469
5
76
My IP address is 172.24.5.73, the gateway's address is 172.24.0.1.

Dear lord!!!!!!!!!!!!!!!!!!! Please tell me your subnet mask is not 255.255.0.0 or 255.255.248.0.

If it is, and even if it isn't, that network has serious problems with addressing like that.

I'm sure the network guys there are wondering "why is everything so slow, why is half of our traffic broadcast traffic, and why are our router processors so pegged?"

:)
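
To put numbers on that worry, here is a small sketch using Python's standard `ipaddress` module. The host and gateway addresses are the ones quoted above; the actual subnet mask was never given in this thread, so both masks are only the guesses being tested.

```python
# Check which of the two feared masks would put the host and gateway
# quoted above in one subnet, and how large that broadcast domain would be.
# The real mask is not given in the thread; both masks here are guesses.
import ipaddress

HOST = ipaddress.ip_address("172.24.5.73")
GATEWAY = ipaddress.ip_address("172.24.0.1")

for mask in ("255.255.0.0", "255.255.248.0"):
    net = ipaddress.ip_network(f"{HOST}/{mask}", strict=False)
    usable = net.num_addresses - 2  # minus network and broadcast addresses
    print(f"{net}: gateway in subnet={GATEWAY in net}, usable hosts={usable}")
# prints:
# 172.24.0.0/16: gateway in subnet=True, usable hosts=65534
# 172.24.0.0/21: gateway in subnet=True, usable hosts=2046
```

A 255.255.255.0 mask would put the host in 172.24.5.0/24, which does not contain 172.24.0.1, so if that really is the host's gateway the mask has to be /21 or shorter. That means up to a couple of thousand machines (or far more) could be sharing one broadcast domain.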