Geoff's Stats - DONE -= GOING WITH SCSI =-

GeoffS

Lifer
Oct 10, 1999
11,583
0
71
Thanks everyone for your opinions! :)



Here's my thinking... I don't need tons of disk space. I'm thinking of an older SCSI RAID card that does RAID 0, 1, 0+1 and maybe 5... haven't decided yet what level of security I need. Now... since I don't need lots of HD space, it seems to me that 2 or more 30g drives are really overkill space-wise and will be empty for the most part... I was thinking something like 4x 9g SCSI drives because they are cheap, easy to find, and cheap to replace in the case of a failure.

Does this make sense? Is this the best approach?
 

Freewolf

Diamond Member
Feb 15, 2001
9,673
1
81
Will the 4x 9 gig SCSI drives give you room for growth? I ask because I have a feeling you will be doing more than you already are.
 

BlackMountainCow

Diamond Member
May 28, 2003
5,759
0
0
What Freewolf said. I guess more and more DC ppl from other teams and also freelancers will use your stats Geoff. For example, in the latest DPAD stats crash, Stephen used YOUR stats as a reference to reestablish the right scores for screwed users! So plan ahead! :D
 

GeoffS

Originally posted by: BlackMountainCow
What Freewolf said. I guess more and more DC ppl from other teams and also freelancers will use your stats Geoff. For example, in the latest DPAD stats crash, Stephen used YOUR stats as a reference to reestablish the right scores for screwed users! So plan ahead! :D

Wow... nice of him to ask! ;)
 

compudog

Diamond Member
Apr 25, 2001
5,782
0
71
Geoff, I have an Adaptec AHA-2940UW and two WD WDE9150 9 gig drives you can have if you like. They are running a dedicated DPAD machine right now, but I have a spare IDE I can put in the box. I don't have the drives in RAID, but I believe that card can do RAID.

Anyone correct me if I am wrong. Maybe only software RAID. Not sure.
 

Stormgiant

Senior member
Oct 25, 1999
829
0
0
An ATA RAID with say 2x40GB ATA133 7200 will be more than enough for the task.

Later on, if needed, you can always swap those with 2x80GB or whatever...

ATA RAIDs are very good and fast... for simple tasks like these :)
 

Freewolf

Originally posted by: compudog
Geoff, I have an Adaptec AHA-2940UW and two WD WDE9150 9 gig drives you can have if you like. They are running a dedicated DPAD machine right now, but I have a spare IDE I can put in the box. I don't have the drives in RAID, but I believe that card can do RAID.

Anyone correct me if I am wrong. Maybe only software RAID. Not sure.

SCSI sounds good to me, and it should be easy to find more of those drives :)
 

ProviaFan

Lifer
Mar 17, 2001
14,993
1
0
If you're going for speed, then the old 9GB SCSI drives may not be able to keep pace with modern 7200 or 10000RPM drives in performance. But if you can indeed get a 4x 9GB SCSI RAID array up for less than a 2x 80GB ATA array, then by all means, do what suits you best. :)

BTW, I'm sure you know this already, but for the uninitiated: RAID 0 is what we want least; redundancy is not a nice extra, it is a requirement. Ergo, we can defenestrate the RAID 0 right from the start.
 

kloostec

Senior member
Sep 19, 2003
272
0
76
defenestrate.. heh ;)

You'll get the best performance out of 10k or 15k RPM 9GB SCSI drives, but they're expensive. SATA's got the second-best performance, followed by IDE.

The nicest thing about SCSI is that it's hot pluggable if you have a good controller. If a drive fails, you just unplug it, and put in another one... the thing starts rebuilding the array without any downtime... the OS doesn't even get involved. I don't think this is the case with SATA, and I know it's not with IDE.

IDE and SATA drives tend to be of lower quality than SCSI drives, as IDE and SATA are meant for consumer devices, not servers (with the exception of the WD Raptor, which I'd term a workstation drive). You're more likely to have a failure with an IDE or SATA drive, but they're also cheaper to replace.

If you're going to have immediate access to the server, you probably won't need a hot spare drive (a drive which has the sole purpose of taking over when another drive fails, so that you get a redundant array back up without having to intervene). If you're any distance away from the server, or it'll take you a couple of weeks to get there, or whatever, then you may want to consider this... If two drives die, then you lose all of your data unless that hot spare has already taken over for the first one.

If you need loads of space, the only way to go is IDE. If you want the best reliability, and the best speed, but at a higher cost, go with SCSI. The middle ground is SATA.

If you managed to get your hands on 4x 9 gig SCSI drives, I'd probably set it up as a three drive RAID 5 with hot spare. You'll get 18 gigs of space, which should be enough for your database, plus you'll be able to have two drives die (as long as the rebuild onto the hot spare finishes before the second one goes) and your server will still keep ticking. If you need more space, and have immediate access to the server to replace a failed drive, go with a four drive RAID 5 to get 27 gigs of space.
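For anyone who wants to check the 18 gig and 27 gig figures, here is a minimal Python sketch of the usual capacity rules for the RAID levels mentioned in this thread. The function name `usable_capacity_gb` and its parameters are made up for illustration, and it assumes identical drives and ignores formatting and controller overhead.

```python
def usable_capacity_gb(level: str, drives: int, drive_gb: float,
                       hot_spares: int = 0) -> float:
    """Usable space of an array of identical drives (hypothetical helper)."""
    active = drives - hot_spares  # hot spares sit idle until a drive fails
    if level == "0":        # striping only, no redundancy
        minimum, usable = 2, active * drive_gb
    elif level == "1":      # every active drive holds the same data
        minimum, usable = 2, drive_gb
    elif level == "0+1":    # two striped sets that mirror each other
        minimum, usable = 4, (active // 2) * drive_gb
    elif level == "5":      # one drive's worth of space goes to parity
        minimum, usable = 3, (active - 1) * drive_gb
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if active < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} active drives")
    if level == "0+1" and active % 2:
        raise ValueError("RAID 0+1 needs an even number of active drives")
    return float(usable)


# Three-drive RAID 5 plus one hot spare, built from 4x 9 gig drives
print(usable_capacity_gb("5", 4, 9, hot_spares=1))  # 18.0
# All four drives in the RAID 5, no hot spare
print(usable_capacity_gb("5", 4, 9))                # 27.0
```

The hot spare costs a whole drive of capacity, which is the trade-off between the 18 gig and 27 gig layouts.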

Edit:

In case anyone's wondering, my development server at home has 4 drives... two RAID 1s - 2x120 and 2x80. Every night, the important data from the larger RAID 1 synchronizes with the smaller RAID 1 (counting? that's 4 drives that the important data's on so far...). I then use rsync to synchronize my offsite copy (also on RAID 1 2x80 ... 6 drives). The CVS data also gets checked out to my laptop nightly, so my most important data is copied to 7 drives at any one time, two offsite and one on a laptop that I carry with me. And yes, out of this 7 drive setup, I have had up to 3 drives dead at once before I could replace any.
 

Zbox

Senior member
Aug 29, 2003
881
0
76
Originally posted by: n0cmonkey
Originally posted by: Zbox
Two SATA Raptors in a RAID 0 config ;)

For something like this, I'm guessing redundancy is a much better idea.

I don't really think the data is that crucial. A daily export of the stats database to another computer would be more than sufficient for all backup needs... The benefits of higher r/w speeds are better than redundancy in this situation, at least in my opinion... :D
 

GeoffS

I'm really leaning towards SCSI for a couple of reasons... reliability & durability, price, availability, and size. I really think that a few 9g drives in RAID will offer me more than enough space (the whole MySQL data folder is just over 1g) so there's plenty of room to grow, and if I need more, I just add another drive. I can currently pick up 2x 10k rpm WD U160 drives for $50 on eBay.


Originally posted by: compudog
Geoff, I have an Adaptec AHA-2940UW and two WD WDE9150 9 gig drives you can have if you like. They are running a dedicated DPAD machine right now, but I have a spare IDE I can put in the box. I don't have the drives in RAID, but I believe that card can do RAID.

Anyone correct me if I am wrong. Maybe only software RAID. Not sure.


This sounds like an excellent start! I'm certain the 2940 doesn't have hardware RAID, but these *are* 10K U160 drives... that should be sufficient speed, shouldn't it? :)

Geoff
 

Freewolf

Originally posted by: GeoffS
I'm really leaning towards SCSI for a couple of reasons... reliability & durability, price, availability, and size. I really think that a few 9g drives in RAID will offer me more than enough space (the whole MySQL data folder is just over 1g) so there's plenty of room to grow, and if I need more, I just add another drive. I can currently pick up 2x 10k rpm WD U160 drives for $50 on eBay.


Originally posted by: compudog
Geoff, I have an Adaptec AHA-2940UW and two WD WDE9150 9 gig drives you can have if you like. They are running a dedicated DPAD machine right now, but I have a spare IDE I can put in the box. I don't have the drives in RAID, but I believe that card can do RAID.

Anyone correct me if I am wrong. Maybe only software RAID. Not sure.


This sounds like an excellent start! I'm certain the 2940 doesn't have hardware RAID, but these *are* 10K U160 drives... that should be sufficient speed, shouldn't it? :)

Geoff

I would think so but then again I don't drive a sports car either.
 

ProviaFan

Originally posted by: GeoffS
lol.... my minivan can almost exceed the speed limit! ;)

So can the Ford Tempo that my dad drives to work (and that I get stuck driving if I want to go somewhere). Hey, at least it keeps me legal. :D
 

n0cmonkey

Elite Member
Jun 10, 2001
42,936
1
0
Originally posted by: Zbox
Originally posted by: n0cmonkey
Originally posted by: Zbox
Two SATA Raptors in a RAID 0 config ;)

For something like this, I'm guessing redundancy is a much better idea.

I don't really think the data is that crucial. A daily export of the stats database to another computer would be more than sufficient for all backup needs... The benefits of higher r/w speeds are better than redundancy in this situation, at least in my opinion... :D

How do you measure the performance of GeoffS getting the call in the middle of the night that one of his drives failed? ;)

Redundancy is important.

Plus, if he does find out a drive failed, the minivan won't be pushed to its limits trying to bring the system back up if he goes with a real raid. He can let it chug along at its own pace while the system runs off the other disks. :p
 

ProviaFan

Originally posted by: n0cmonkey
How do you measure the performance of GeoffS getting the call in the middle of the night that one of his drives failed? ;)

Redundancy is important.

Plus, if he does find out a drive failed, the minivan won't be pushed to its limits trying to bring the system back up if he goes with a real raid. He can let it chug along at its own pace while the system runs off the other disks. :p

LOL. :)

I think the issue here is that if you didn't care about reliability, you could just use one disk and not bother with RAID, but for something like this where (I presume) it is desired for the server to run with least hassle, then the implementation of a nonRedundant Array of Independent Disks level 0 that would decrease reliability to an even lower level than when running off a single drive, is probably not smart. ;)
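To put rough numbers on the "less reliable than a single drive" point, here is a small Python sketch. It assumes drive failures are independent and that every drive has the same chance of failing over the period you care about; the 5% figure is made up purely for illustration, and the function names are hypothetical.

```python
def raid0_loss_probability(p_drive: float, drives: int) -> float:
    """RAID 0 loses everything if ANY one drive fails."""
    return 1 - (1 - p_drive) ** drives


def mirror_loss_probability(p_drive: float, drives: int = 2) -> float:
    """A mirror only loses data if ALL drives fail (ignores rebuilds)."""
    return p_drive ** drives


p = 0.05  # assumed chance that one drive fails during the period
print(f"single drive : {p:.4f}")
print(f"2-drive RAID0: {raid0_loss_probability(p, 2):.4f}")
print(f"2-drive RAID1: {mirror_loss_probability(p, 2):.4f}")
```

Under those assumptions the two-drive stripe is almost twice as likely to lose data as a single disk, while the mirror is far less likely to. Replacing a failed mirror drive promptly only improves the mirror's odds further, which this sketch does not model.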
 

n0cmonkey

Originally posted by: jliechty
Originally posted by: n0cmonkey
How do you measure the performance of GeoffS getting the call in the middle of the night that one of his drives failed? ;)

Redundancy is important.

Plus, if he does find out a drive failed, the minivan won't be pushed to its limits trying to bring the system back up if he goes with a real raid. He can let it chug along at its own pace while the system runs off the other disks. :p
LOL. :)

I think the issue here is that if you didn't care about reliability, you could just use one disk and not bother with RAID, but for something like this where (I presume) it is desired for the server to run with least hassle, then the implementation of a nonRedundant Array of Independent Disks level 0 that would decrease reliability to an even lower level than when running off a single drive, is probably not smart. ;)

I guess you could say it like that. ;)
 

RaySun2Be

Lifer
Oct 10, 1999
16,565
6
71
Haven't kept up with it, but aren't the throughput capabilities as important as the speed of the drives?

At one time, IIRC, not only was SCSI faster, but it could handle more throughput/bandwidth than IDE. I know that the IDE/SATA hard drives are fast now, but is SATA comparable to SCSI in throughput/bandwidth, or am I totally off in left field here?
 

Confused

Elite Member
Nov 13, 2000
14,166
0
0
Storage space WILL be an issue, Geoff. You have used 1gb of space since essentially March. That's 1gb in 3 months, and this is going to grow exponentially as you collect stats more often, and for more projects. 3x 9GB SCSI drives in RAID5 will give you ~15gb to play with after Windows etc. (swap/page files, installation), which is probably going to last you 18 months at the most (I reckon that could even be less, depending on what other projects you decide to track).

Something like this along with 3 of these in RAID5 will give you ~70gb of space, as well as speed and redundancy. Hell, even that controller board and 2 of the drives in a RAID1 will give you probably better speed than older 9gb SCSI drives in a RAID5.

If you do want to go the SCSI route, I must stress that newer drives are going to be better than older drives; even older 10k drives could well be slower than modern 7200rpm (S)ATA drives due to the enhancements in technology.

I think you need to price up each option you are looking at (SCSI RAID5 with 9gb, ATA with Raptors in RAID1 and 5 etc), and factor $/gb into the equation, and post back once you've figured out how much each could cost. I think you'll be surprised :)


Garry
 

GeoffS

I'll take a look at that this weekend if I can find some time.

In looking at the database files, the largest (by far) are the files that won't grow much more than they already are... the nodes hourly file for D2OL is 1.4g (yeah... looks like my math was way off), but it will always be only 1.4g... for the hourly projects, the largest files are the ones that keep the rolling 48 hours of data... those too will generally not grow much larger than they already are. The entire mysql/data directory is only 2g.

btw... I was planning on RAIDing just the data and doing a periodic Ghost of the primary partition. I don't see that the primary partition will change all that often, as it will contain the OS and required software (php, Apache)... not sure if I can have the mysql executables on one drive and the data on another since the data is a subdir of the mysql dir... I suspect people will have opinions about this! ;)

Geoff
 

Confused

OK, I don't really know the setup. If you say that the database is set up so that it won't grow as the days go on (I'm guessing in this case you don't save the raw data from each update?) then I'll have to trust you that you won't need much extra space. In which case, 10/15k 9GB drives may well be fine for you :)

And as for the data directory, yes, I'm pretty certain you can change the data directory; it will be in one of the config files for MySQL (the datadir setting in my.cnf or my.ini) :)


Garry
 

GeoffS

The db will definitely grow... I was surprised at the current size, and then realized that a large part of the current size was made up of tables that will not grow significantly in size (like the nodes hourly file... it is a list of all the current nodes... it shouldn't grow noticeably unless there is suddenly a huge influx of users), and when I optimized the table just now, it went down from 1.4g to 48MB (1 million records)... I've changed the hourly process to optimize that file after each run! :roll: Naturally, the user_daily, user_total, team_daily, and team_total files will grow every day since they contain a record for each user or team for each day. In D2OL, the project that's been running the longest, the team_daily and team_total files each contain around 102,000 records and take up a combined 14MB of disk space.

I guess what I was trying to say about the space requirements is that 3x 36g drives in RAID 5 will probably only fill up some time after the current RC5 has finished! ;)
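A rough back-of-the-envelope check of that, as a Python sketch. The record counts and sizes are the D2OL figures above, the 72g capacity is what 3x 36g in RAID 5 would give, and the 2g starting point is the current mysql/data size. The `records_per_day` value is a pure guess for illustration, and the helper names are made up.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def bytes_per_record(total_bytes: float, records: int) -> float:
    """Average on-disk size of one record."""
    return total_bytes / records


def days_until_full(capacity_bytes: float, used_bytes: float,
                    records_per_day: float, record_bytes: float) -> float:
    """Days until the array fills, assuming steady linear growth."""
    free = capacity_bytes - used_bytes
    if free <= 0:
        return 0.0
    return free / (records_per_day * record_bytes)


# team_daily + team_total: about 102,000 records each, 14MB combined
per_record = bytes_per_record(14 * MB, 2 * 102_000)

# ASSUMPTION: 10,000 new records per day across all tables (made-up figure)
days = days_until_full(72 * GB, 2 * GB, 10_000, per_record)

print(f"about {per_record:.0f} bytes per record")
print(f"about {days / 365:.0f} years until full at that rate")
```

The estimate is linear, so it does not capture new projects being added or stats being collected more often; those would shorten the time considerably.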

Thanks for all the great input guys... I will look into the relative costs and benefits! :)

Geoff