Network performance and Windows File Sharing Question

Mucman

Diamond Member
Oct 10, 1999
7,246
1
0
Hey folks, I have a question that can't really be solved by the "try it and see how it works" method.

At work we have 9 web servers, all running IIS, with 128-400 IPs allocated to each machine. Websites run as independent
services in IIS. Each IP dumps its weblogs into a folder on the server it's running on (i.e. E:\logs\www\ip_addr\w3svc\*.log ). The E:\logs\www\
folder is shared on the private network so our statistics server can read them. We recently migrated to Urchin (a fantastic product IMHO, and
one that actually works, unlike Deepmetrix's offerings). We currently have the stats server set to process the logs daily. This process takes around
2-3 hours to complete (logs for ~1400 websites). I am confident that part of the slowness is due to accessing the logs over Windows File Sharing (WFS).

Urchin can read logs over FTP, but unfortunately it doesn't accept wildcard characters, and all of our logs are timestamped, so the log files
have a different name each day.

At the moment, I am considering the following solutions:

A - Leave it how it is... who cares if it takes 3 hours to process the logs? It starts at 2am and will be done before everyone has
had their first cup of coffee.
PROS - no work for me
CONS - What if we have 3000 or 6000 websites? The solution might work now, but it may not scale in the future.

B - Create a share on the stats server, and configure the web servers to dump their log files there over WFS.
PROS - Urchin should be able to import the logs much faster (no WFS overhead at import time). A centralized location for log files can simplify other things, e.g. log file
backups.
CONS - Have to write a script to change the log location for every website. Will generate a lot of traffic over the private network.

C - Write a script that copies all the log files from the web servers to the stats server once a day.
PROS - Simple; can be done with a basic batch script. It only happens once a day, so it should not affect private-network performance very much.
CONS - Have to coordinate the log rotators and the copy script to minimize potential data (log file) loss.
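To make option C concrete, here's a rough sketch of the copy logic. The server names, paths, and simulated layout below are all placeholders, and the real job on Windows would be a batch file (using xcopy or robocopy against the shares); this POSIX shell version just illustrates the nightly pull:

```shell
#!/bin/sh
# Sketch of the option C nightly copy job. Everything here is a stand-in:
# "web1"/"web2" are hypothetical server names, and a temp dir simulates
# the real E:\logs\www shares.
set -e

BASE="$(mktemp -d)"          # demo area standing in for the real shares
STATS_DIR="$BASE/stats"

# Simulate two web servers, each exposing per-IP log folders like
# E:\logs\www\ip_addr\w3svc\*.log.
for server in web1 web2; do
    mkdir -p "$BASE/$server/logs/www/10.0.0.1/w3svc"
    printf 'GET / 200\n' > "$BASE/$server/logs/www/10.0.0.1/w3svc/ex$(date +%y%m%d).log"
done

# The copy job itself: pull each server's whole log tree into a
# per-server folder on the stats box, preserving the per-IP layout.
for server in web1 web2; do
    mkdir -p "$STATS_DIR/$server"
    cp -R "$BASE/$server/logs/www/." "$STATS_DIR/$server/"
done

echo "copied into $STATS_DIR"
```

Copying the whole tree rather than just "today's" files is deliberate: if one night's run fails, the next run still picks everything up.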

I am still pondering this, but I would love to hear some feedback on this... if you don't mind, share your statistics software setups...

thanks
 

Mucman

Diamond Member
Oct 10, 1999
7,246
1
0
Well, B is definitely out... MS doesn't recommend putting the log files on the network. I also found out why the imports were so slow (two reasons):

1 - I had dnscache misconfigured (I had not added the local IP to the /etc/dnscachex/root/IP directory), so every lookup took 5 seconds to resolve (the query got passed to the next nameserver).
2 - We had Urchin set to retry 8 times for every unresolvable query :eek:. On Wednesday I will find out how much these fixes improved the import time.
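For anyone hitting the same thing: djbdns's dnscache only answers clients whose IP prefix appears as a file under its root/ip directory, so the fix was roughly along these lines (the subnet is a placeholder for our private network; the directory name matches our /etc/dnscachex install):

```shell
# Allow clients in 10.0.0.* to query dnscache by creating an empty file
# named after the address prefix (10.0.0 is a placeholder subnet).
touch /etc/dnscachex/root/ip/10.0.0
# Restart dnscache under daemontools so it picks up the new client list.
svc -t /etc/dnscachex
```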

btw, we went with option C for now.
 

ScottMac

Moderator
Networking
Elite member
Mar 19, 2001
5,471
2
0
Not really my end of the stick, but Option C seems to make the most sense to me overall.

FWIW

Scott
 

Woodie

Platinum Member
Mar 27, 2001
2,747
0
0
Option C in use here (~60 IIS servers, some QA, some production).
It took some time to get the architecture set, but it also allowed us to copy application and server log files back, not just the IIS logs.
It's a nightly job; it does take a significant amount of time to process, and the stats are usually 2 days old because of the timing. We wrote the scripts with failure in mind, so they copy (and purge) all log files, not just the current day's.

If network performance is an issue, have the batch file zip (compress) the log files before transmitting them. Since these are text files, you should get >90% compression. Then you can use a matching batch file on the other end to decompress them into the appropriate directories and kick off the "get stats" job.
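A rough sketch of that compress-then-ship idea (the log contents and file name here are made up for the demo; on Windows the same steps would live in the batch files, using a command-line zip tool):

```shell
#!/bin/sh
# Demo of compressing a text log before transfer, then restoring it on
# the receiving side. The log name and contents are placeholders.
set -e

WORK="$(mktemp -d)"
LOG="$WORK/ex030101.log"     # placeholder timestamped log name

# Fake a repetitive IIS-style text log; real web logs compress
# similarly well because the fields repeat heavily.
i=0
while [ $i -lt 2000 ]; do
    echo "2003-01-01 02:00:00 10.0.0.1 GET /index.html 200" >> "$LOG"
    i=$((i + 1))
done

orig=$(wc -c < "$LOG")
gzip -9 "$LOG"               # sending side: compress before the copy
comp=$(wc -c < "$LOG.gz")
echo "original $orig bytes, compressed $comp bytes"

# Receiving side: decompress into place, then kick off the import.
gunzip "$LOG.gz"
```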
 

Mucman

Diamond Member
Oct 10, 1999
7,246
1
0
Thanks Woodie!

It's a nightly job for us too. It takes about 30-40 minutes for all the log files to be copied over, and about 2 hours for the stats server to parse all the logs (it would probably be a heck of a lot
faster without the reverse DNS lookups, though). I might try using compression to see how much faster it will go...