Linux Server config?

crisscross

Golden Member
Apr 29, 2001
1,598
0
71
I need a rig to crawl websites and pull data continuously. It will be running Linux and will probably have MySQL, Java, and Perl scripts. I'm not sure what to get. Can you guys help me out?

I need a reliable system, one that will run 24x7 and store huge amounts of data. Please let me know what CPU/mobo combination I should get, along with RAM and hard drives.

Thanks!
 

jvarszegi

Senior member
Aug 9, 2004
721
0
0
How can anyone size this thing without specifics? I mean, an 80 GB disk is huge to some people. Also, you don't state your budget.
 

crisscross

Golden Member
Apr 29, 2001
1,598
0
71
Hi, thanks for the reply. Here are the rough specs:
1. Processor - No clue which processor I should get for my application.
2. Storage - I need at least 400GB. Do I get multiple 160GB drives and hook them up in RAID, or get 2 large 400GB drives?
3. RAM - Again, dependent on the above.
4. Video/graphics - I don't care about either one, as this will only be a server. Same goes for the display.

Budget... I would like to keep it around $1000-$1500.

Thanks!
 

Rhin0

Senior member
Nov 15, 2004
967
0
0
Just out of curiosity... What kind of data would you "continuously be pulling" from websites?
 

crisscross

Golden Member
Apr 29, 2001
1,598
0
71
Well it will be a mini search engine of sorts, building a database of websites by crawling search engines for various keywords and maybe even downloading the entire site.
 

jvarszegi

Senior member
Aug 9, 2004
721
0
0
Originally posted by: crisscross
Hi, thanks for the reply. Here are the rough specs:
1. Processor - No clue which processor I should get for my application.
2. Storage - I need at least 400GB. Do I get multiple 160GB drives and hook them up in RAID, or get 2 large 400GB drives?
3. RAM - Again, dependent on the above.
4. Video/graphics - I don't care about either one, as this will only be a server. Same goes for the display.

Budget... I would like to keep it around $1000-$1500.

Thanks!

1. How processor-intensive are your applications? If they'll be doing a lot of data munging, go for a 2-way hyperthreaded Xeon system for best performance (crazy stuff like an eight-way Opteron box is out of your budget). A max of $1500 means you'd have to build it yourself, and even then a decent system will run a tad more.

If your app(s) will mostly capture data, i.e. will be mostly I/O bound, almost any ol' processor will do to run dozens of threads. If that's the case, I'd go for a P4 or A64, at the highest rating you can afford. You could probably pack the whole thing into an SFF case, since your storage requirements are pretty modest.
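
To illustrate the I/O-bound case, here is a minimal sketch of a crawler that spends nearly all its time waiting on the network, so a few dozen threads fit comfortably on a modest CPU. Python is used only for brevity (the real thing would presumably be Java or Perl), and the names `fetch`, `crawl`, and the default of 32 workers are assumptions for illustration, not anything from the original setup.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its raw bytes (the thread mostly waits on the network)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def crawl(urls, fetcher=fetch, workers=32):
    """Fetch every URL on a pool of worker threads.

    Returns two dicts keyed by URL: downloaded pages and the errors that occurred.
    """
    pages, errors = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetcher, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                pages[url] = future.result()
            except Exception as exc:  # one dead site should not stop the whole crawl
                errors[url] = exc
    return pages, errors
```

The `fetcher` argument is there so the download step can be swapped out (for a stub in testing, or for something that writes straight to the database).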

2. It sounds to me like for best performance you need a system that will support multiple concurrent reads and writes well. Look into partitioning your application, and if that's an option, dedicate one disk subsystem to each partitioned application for best performance. If that's NOT an option, I'd usually go for multiple small fast disks in RAID 5, which also gives you redundancy in case a disk fails (don't know if you need that or not). Get a decent controller for your disks, and stay away from software RAID if possible. Go for SCSI drives if you have enough dough and your application uses the disk heavily.

For RAID 5, the minimum number of 160GB disks you'll need is 4, in order to get 400GB of usable space (you'll actually get around 480GB). If you don't care about data redundancy, you'd likely get pretty good performance from three SATA 160 GB disks in RAID 0.

3. Speed of RAM won't be important for your application. If your application makes extensive use of caching, just get the maximum amount of RAM you can afford that doesn't over-satisfy the requirements of your app by a ridiculous amount. In other words, if your app will use a known 500MB peak, you'd waste money by getting 4GB of RAM. Get value RAM of the type suitable for your hardware.
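
To double-check the disk math from point 2, here is a small sketch of the capacity arithmetic for RAID 0 and RAID 5 with identical disks. The function names `usable_gb` and `min_disks` are made up for illustration, and the numbers ignore filesystem overhead and the difference between marketing GB and what the OS reports.

```python
import math


def usable_gb(level, disks, disk_gb):
    """Usable capacity in GB for an array of identical disks."""
    if level == 0:
        if disks < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return disks * disk_gb  # striping only, no redundancy
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb  # one disk's worth of space goes to parity
    raise ValueError("only RAID 0 and RAID 5 are handled in this sketch")


def min_disks(level, target_gb, disk_gb):
    """Smallest number of disks whose usable capacity reaches target_gb."""
    data_disks = math.ceil(target_gb / disk_gb)
    if level == 0:
        return max(2, data_disks)
    if level == 5:
        return max(3, data_disks + 1)  # data disks plus one for parity
    raise ValueError("only RAID 0 and RAID 5 are handled in this sketch")


print(min_disks(5, 400, 160), usable_gb(5, 4, 160))  # 4 480
print(min_disks(0, 400, 160), usable_gb(0, 3, 160))  # 3 480
```

So four 160GB disks in RAID 5 and three in RAID 0 both come out to about 480GB; the fourth disk is the price of surviving a single drive failure.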

 

crisscross

Golden Member
Apr 29, 2001
1,598
0
71
Thanks for this. I'm inclined to go for an A64, probably with 1GB of RAM. I will require some data redundancy. Which motherboard would you recommend? Does RAID come with the motherboard, or does one have to buy that separately?
 

jvarszegi

Senior member
Aug 9, 2004
721
0
0
Well, the nForce4 chipset comes with dual gigabit LAN, so that sounds like it might be great for your application. I chose the A8N-SLI Deluxe for my new motherboard, but you probably don't need one that's so expensive; I mean, we definitely know that you don't need SLI! :) nForce4 comes with built-in RAID abilities, and my motherboard also has an extra on-board RAID controller to give even more options. Absolute best performance would come from a dedicated SCSI RAID controller, but you'll pay extra for storage; I'll bet that with a little shopping you could make that fit into a $1500 budget. I'd probably go with the built-in RAID support of nForce4 if I were you.

Introduction to nForce4
Review of an nForce4 motherboard from Gigabyte