
Linux Server config?

crisscross

Golden Member
I need a rig to crawl websites and pull data continuously. It will be running Linux and will probably have MySQL, Java, and Perl scripts. I'm not sure what to get. Can you guys help me out?

I need a reliable system, one that will run 24x7 and store huge amounts of data. Please let me know what CPU/mobo combination I should get, along with RAM and hard drives.

Thanks!
 
How can anyone size this thing without specifics? I mean, an 80 GB disk is huge to some people. Also, you don't state your budget.
 
Hi, thanks for the reply. Here are the rough specs:
1. Processor - No clue which processor I should get for my application.
2. Storage - I need at least 400GB. Do I get multiple 160GB drives and hook them up in RAID, or get two large 400GB drives?
3. RAM - Again, dependent on the above.
4. Video/graphics - I don't care about either one, as this will only be a server. Same goes for the display.

Budget: I would like to keep it around $1000-$1500.

Thanks!
 
Well, it will be a mini search engine of sorts, building a database of websites by crawling search engines for various keywords and maybe even downloading entire sites.
 
Originally posted by: crisscross
Hi, thanks for the reply. Here are the rough specs:
1. Processor - No clue which processor I should get for my application.
2. Storage - I need at least 400GB. Do I get multiple 160GB drives and hook them up in RAID, or get two large 400GB drives?
3. RAM - Again, dependent on the above.
4. Video/graphics - I don't care about either one, as this will only be a server. Same goes for the display.

Budget: I would like to keep it around $1000-$1500.

Thanks!

1. How processor-intensive are your applications? If they'll be doing a lot of data munging, go for a 2-way hyperthreaded Xeon system for best performance (crazy stuff like an eight-way Opteron box is out of your budget). A max of $1500 means you'd have to build it yourself, and even then a decent system will run a tad more.

If your app(s) will mostly capture data, i.e. will be mostly I/O bound, almost any ol' processor will do to run dozens of threads. If that's the case, I'd go for a P4 or A64, at the highest rating you can afford. You could probably pack the whole thing into a SFF case, since your storage requirements are pretty modest.

2. It sounds to me like you need a system that handles multiple concurrent reads and writes well. Look into partitioning your application, and if that's an option, dedicate one disk subsystem to each partitioned application for best performance. If that's NOT an option, I'd usually go for multiple small, fast disks in RAID 5, which also gives you redundancy in case a disk fails (don't know if you need that or not). Get a decent controller for your disks, and stay away from software RAID if possible. Go for SCSI drives if you have enough dough and your application uses the disk heavily.

For RAID 5, the minimum number of 160GB disks you'll need is 4 in order to get 400GB of usable space. RAID 5 gives up one disk's worth of space to parity, so three disks yield only 320GB while four yield 480GB. If you don't care about data redundancy, you'd likely get pretty good performance from three SATA 160GB disks in RAID 0, which also comes to 480GB.

3. Speed of RAM won't be important for your application. If your application makes extensive use of caching, just get the maximum amount of RAM you can afford that doesn't over-satisfy the requirements of your app by a ridiculous amount. In other words, if your app will use a known 500MB peak, you'd waste money by getting 4GB of RAM. Get value RAM of the type suitable for your hardware.
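
For anyone who wants to check the RAID arithmetic in point 2 against other disk sizes, here is a rough Python sketch. The function names are made up for illustration, the sizes are nominal GB, and it assumes identical disks while ignoring filesystem and controller overhead, so treat the numbers as estimates rather than what a real array will report.

```python
# Rough usable-capacity calculator for arrays of identical disks.
# Assumptions (not from the thread): helper names are hypothetical,
# sizes are nominal GB, and formatting/controller overhead is ignored.

MIN_DISKS = {0: 2, 1: 2, 5: 3}  # smallest sensible array per RAID level


def usable_gb(level: int, disks: int, disk_gb: int) -> int:
    """Return the usable space of a RAID 0, 1, or 5 array in GB."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 0:
        return disks * disk_gb        # striping only, no redundancy
    if level == 1:
        return disk_gb                # every disk holds a full copy
    return (disks - 1) * disk_gb      # RAID 5: one disk's worth goes to parity


def min_disks(level: int, disk_gb: int, target_gb: int) -> int:
    """Return the fewest disks that reach target_gb of usable space."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if level == 1 and disk_gb < target_gb:
        raise ValueError("a mirror never exceeds the size of one disk")
    disks = MIN_DISKS[level]
    while usable_gb(level, disks, disk_gb) < target_gb:
        disks += 1
    return disks


print(min_disks(5, 160, 400), usable_gb(5, 4, 160))  # 4 480
print(min_disks(0, 160, 400), usable_gb(0, 3, 160))  # 3 480
print(usable_gb(1, 2, 400))                          # 400
```

The last line covers the other option from the original question: two 400GB drives in a mirror give 400GB usable, with redundancy but no headroom beyond the stated minimum.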

 
Thanks for this. I'm inclined to go for an A64, probably with 1GB of RAM. I will require some data redundancy. Which motherboard would you recommend? Does RAID come with the motherboard, or does one have to buy that separately?
 
Well, many nForce4 boards come with dual gigabit LAN (the chipset itself provides one gigabit port, and boards like mine add a second controller), so that sounds like it might be great for your application. I chose the A8N-SLI Deluxe for my new motherboard, but you probably don't need one that's so expensive; I mean, we definitely know that you don't need SLI! 🙂 nForce4 comes with built-in RAID abilities, and my motherboard also has an extra on-board RAID controller to give even more options. Absolute best performance would come from a dedicated SCSI RAID controller, but you'll pay extra for storage; I'll bet that with a little shopping you could make that fit into a $1500 budget. I'd probably go with the built-in RAID support of nForce4 if I were you.

Introduction to nForce4
Review of an nForce4 motherboard from Gigabyte
 