Need Help! Price estimate on server build


xSauronx

Lifer
Jul 14, 2000
19,582
4
81
Alrighty, this is totally last second dumped on me, as she's been away for more than a month and suddenly needs to finish this grant. Nothing unusual here, actually....

I'll have to look over some stuff at home. le sigh... :(

so you guys are saying $20-40k won't be able to run MySQL, for example, and handle some 20-40 TB of data?

:hmm:


for an idea...the community college i just graduated from took almost 20k to buy a non-major brand 12TB SAN (mirrored, 6TB usable), plus another 7k to buy a used 3750-24 10/100/1000 with 2 GBICs

we looked at something from hp and dell i think, a similar setup would have cost more like 60 or 70k (i dont remember too well, it was expensive) for a similar amount of storage.
 

zinfamous

No Lifer
Jul 12, 2006
111,864
31,359
146
It can be done on that price point but it's not going to be that great, probably awful... we need a more realistic sense of how fast it needs to be, how people are connecting to it, what your network is like, how much the data is going to grow over the next 3 years, large files/small files?, etc

well, we run 10gb switches on campus--I could be wrong? fuck if I know, lol. a single sequencing run produces about 15-20gb data. We sequence a lot of genomes, and assembly means taking 1-3 of these 20gb runs and analyzing them together.

Hell, I'll give you a call later. I'm gonna try and pry out some more info.
 
Feb 24, 2001
14,513
4
81
500GB or 500gb? Big difference.

As noted, even if only 500gb a week, you're well outside of a standard server for hosting, you'll need a SAN.
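To put rough numbers on the GB-vs-Gb question and on growth: a minimal sketch, assuming decimal units, a constant weekly rate, and that nothing is ever deleted. All three are assumptions, since the thread never pins down the real figure. The function name is made up for illustration, and the 3-year horizon comes from the question asked earlier in the thread.

```python
WEEKS_PER_YEAR = 52

def projected_tb(weekly_amount, years, unit="GB"):
    """Return total data accumulated, in decimal terabytes.

    unit="GB" means gigabytes, unit="Gb" means gigabits (8 bits per byte).
    Assumes a constant weekly rate and that nothing is ever deleted.
    """
    if unit == "GB":
        weekly_gb = weekly_amount
    elif unit == "Gb":
        weekly_gb = weekly_amount / 8
    else:
        raise ValueError("unit must be 'GB' or 'Gb'")
    return weekly_gb * WEEKS_PER_YEAR * years / 1000

for unit in ("GB", "Gb"):
    print(f"500 {unit}/week over 3 years: {projected_tb(500, 3, unit):.2f} TB")
```

Read as gigabytes, 500 a week lands around 78 TB after three years; read as gigabits it is under 10 TB, which is why the capitalization matters for sizing.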

How much retention/redundancy? What about uptime? If you lose the equipment, how long can you be down before going out of business?

As noted, you really need to talk with enterprise or corp sales at HP, Dell, etc.
 

Platypus

Lifer
Apr 26, 2001
31,046
321
136
well, we run 10gb switches on campus--I could be wrong? fuck if I know, lol. a single sequencing run produces about 15-20gb data. We sequence a lot of genomes, and assembly means taking 1-3 of these 20gb runs and analyzing them together.

Hell, I'll give you a call later. I'm gonna try and pry out some more info.

You'll want a 10G NIC then, which is pricey in its own right. I can make recommendations for you on stuff I've worked with personally when you call later if you want. You should also think about a backup solution for this.. how are you going to recover 20TB quickly, what medium are you saving it on? How much of a retention do you need? etc
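For the "how are you going to recover 20TB quickly" question, a back-of-the-envelope sketch. It assumes the network link is the only limit and that it runs at line rate, which is optimistic. The `efficiency` parameter and the function name are illustrative assumptions, not anything measured.

```python
def restore_hours(data_tb, link_gbps, efficiency=1.0):
    """Hours to move data_tb decimal terabytes over a link_gbps link.

    efficiency is the fraction of line rate actually achieved (assumed;
    real restores are usually slower than line rate).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600

for gbps in (1, 10):
    print(f"20 TB over {gbps} Gb/s: {restore_hours(20, gbps):.1f} h at line rate")
```

Even in the best case that is close to two days on gigabit versus a few hours on 10G, before the backup medium itself becomes the limit.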
 

olds

Elite Member
Mar 3, 2000
50,124
779
126
Say what? That's like asking how high is up? What's the best car or, how many people does it take to screw in a corporate light bulb?...
I used to love it when I was out doing chain control or plowing. People would ask me: "how far down is it snowing?"
My answer: "all the way to the ground."

We now resume your regularly scheduled off topic programming.
 

Lifted

Diamond Member
Nov 30, 2004
5,748
2
0
How in the hell do you plan on building/administering this server/NAS/SAN if you don't know the first thing about the technology?

If you're in a uni, contact somebody in the IT department as they will know how to design a solution for you and know who offers the best bang for the buck based on their purchasing agreements with the major vendors. On items this large you can expect 20 - 40% off list for an edu contract.
 

zinfamous

No Lifer
Jul 12, 2006
111,864
31,359
146
Why are you storing full genome sequences instead of variants from the reference sequences?

because they are, well..."our" genomes? We're mostly working with previously un-sequenced critters, so everything we work with, we have to sequence. That's our data.

We have to assemble each genome, of course; mostly using TopHat, Soap...Bowtie, a few other programs developed locally. If any of this is familiar to you, perhaps you could give some tips on what would be needed in a server for multiple people to concurrently run our aligning software over our data? Anyway, I don't work with the analysis; I mostly prepare the libraries for sequencing.

Hehe, currently, our lab took up more Illumina time than any other on campus--we have our own Next Gen sequencing core facility here. Hell, Our closest collaborator has his own Solexa machine in his lab. :eek:
yeah, he's loaded.....
 

zinfamous

No Lifer
Jul 12, 2006
111,864
31,359
146
How in the hell do you plan on building/administering this server/NAS/SAN if you don't know the first thing about the technology?

If you're in a uni, contact somebody in the IT department as they will know how to design a solution for you and know who offers the best bang for the buck based on their purchasing agreements with the major vendors. On items this large you can expect 20 - 40% off list for an edu contract.

yeah, I guess I'm still not that clear. I'm looking for a general price estimate in order to write a grant. We aren't buying anything at the moment. It takes months for grants to be approved.

Again, this task was handed to me about 2 hours ago, out of nowhere. Our previous guy that would jump on this stuff is now somewhere else. :(

I'm just looking for a baseline price, so a few details on what I should be looking for...
 

Dark4ng3l

Diamond Member
Sep 17, 2000
5,061
1
0
If these things take months to approve then you should be fine taking a couple of days to look at options and consult the right people. As an accountant I can tell you that there is nothing worse than making random decisions or just throwing a random number at a project without some kind of real idea what the cost will be both short term and long term.

Tell your boss that you are not sure and can't tell her right now unless you guess and that you can give her a better answer Friday or something. If you don't know then just say you don't know but that you are going to find the solution and then do it.
 

mugs

Lifer
Apr 29, 2003
48,920
46
91
If you're getting by with a desktop PC as your server right now, I think it's safe to say that most of what has been mentioned in this thread is overkill.
 

ultimatebob

Lifer
Jul 1, 2001
25,134
2,450
126
If you're getting by with a desktop PC as your server right now, I think it's safe to say that most of what has been mentioned in this thread is overkill.

Yeah... but I'm sure that they're not trying to store 40 TB of data on that old desktop, either.

For that kind of storage, you really need both a server and a SAN. Sure, he could probably daisy chain a few Drobos to that desktop to get that kind of storage, but the performance would suck.
 

Gigantopithecus

Diamond Member
Dec 14, 2004
7,664
0
71
because they are, well..."our" genomes? We're mostly working with previously un-sequenced critters, so everything we work with, we have to sequence. That's our data.

We have to assemble each genome, of course; mostly using TopHat, Soap...Bowtie, a few other programs developed locally. If any of this is familiar to you, perhaps you could give some tips on what would be needed in a server for multiple people to concurrently run our aligning software over our data? Anyway, I don't work with the analysis; I mostly prepare the libraries for sequencing.

Hehe, currently, our lab took up more Illumina time than any other on campus--we have our own Next Gen sequencing core facility here. Hell, Our closest collaborator has his own Solexa machine in his lab. :eek:
yeah, he's loaded.....

Ahh, even if every single organism you're sequencing has never been sequenced, it still has relatives, and you can still use indices. And I doubt your PI works on entirely disparate branches from the tree of life, probably on a bunch of relatively related organisms - IOW your own lab should be building its own indices.

You're also not generating 500gb of data per week that actually needs long term storage unless your lab is sequencing the equivalent of 100 human genomes and annotating them...every week. Once those scaffolds are assembled, they're trashed.
 

MarkXIX

Platinum Member
Jan 3, 2010
2,642
1
71
Yep, you need a SAN.

I would find a solution that allows you to start small and build out though. There are a lot of smaller storage vendors with solutions out there. They usually advertise in IT related magazines and as utilitarian as storage has become, the barrier to entry is getting pretty low.
 

zinfamous

No Lifer
Jul 12, 2006
111,864
31,359
146
well, sorry for all the formatting, but this is what I put together through our campus IT account service, University pricing and such. (Thanks Plat, for reminding me :p)
Yeah, I had to cut and paste, cause I have no idea what's going on here...but it sounds like it's 90% close to what we need. Consulting and ironing out the details can come later, being that it will be several months before we hear about the grant, and know what there is to spend.

Thanks AT, I actually know a little bit more about this stuff now than I did 6 hours ago (though I'm sure it doesn't show :hmm:)

MD3000 disk storage array

Qty 1 Configured with two single-port controllers
PowerVault MD3000
--Primary Hard Drive Ten 1TB 7.2K RPM Universal SATA 3Gbps
--Server connectivity SAS 5/E HBA, PCI-Express, 2x4 connectors
--5x 500GB 7.2K RPM Universal SATA 3Gbps 3.5-in HotPlug Hard Drive
(12.5 TB is actually 3x our current read usage)
--300GB 15K RPM Serial-Attach SCSI 3Gbps 3.5-in HotPlug Hard Drive, Cust. Kit
TOTAL: $13,139.39

R610 1U 2-socket standard server
Qty 1 Chassis for Up to Six 2.5-Inch Hard Drives and Intel® 56XX Processors, Windows Server 2008 R2, Enterprise Academic Edition,x64, Includes 25 CALs

Unit Price $19,807.76



--Operating System Windows Server 2008 R2, Enterprise Academic Edition, x64, Includes 25 CALs
--96GB Memory (12x8GB), 1333MHz Dual Ranked RDIMMs for 2 Processors, Optimized
--Dual Two-Port Embedded Broadcom® NetXtreme II 5709 Gigabit Ethernet NIC
--2x Intel® Xeon® X5677, 3.46GHz, 12M Cache, Turbo, HT, 1333MHz Max Mem
--1st Hard Drive 600GB 10K RPM Serial-Attach SCSI 6Gbps 2.5in Hotplug Hard Drive
--Primary Controller PERC 6/i SAS RAID Controller, 2x4 Connectors, Internal, PCIe, 256MB Cache
--Network Adapter Broadcom 57710 10GbE Single Port 10GbE NIC, Copper, PCIe-8
--5x 600GB 10K RPM Serial-Attach SCSI 6Gbps 2.5in Hotplug Hard Drive
--Hard Drive Configuration RAID 10 for H700 or PERC 6/i Controllers
--Power Supply High Output Power Supply, Non-Redundant, 717W
--Host Bus Adapter/Converged Network Adapter Qlogic QLE8152 10Gb CNA/Fibre Channel over Ethernet Adapter
TOTAL: $19,807.76
Total Price $32,947.15
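
A quick sanity check of the quote arithmetic against the $20-40k range floated earlier in the thread. This is only a sketch: the two subtotals are copied from the quote, the dictionary keys and variable names are made up for illustration, and nothing here accounts for tax, support contracts, or backup hardware.

```python
from decimal import Decimal

# Subtotals copied from the quote above.
quote = {
    "MD3000 array": Decimal("13139.39"),
    "R610 server": Decimal("19807.76"),
}
# Budget range mentioned earlier in the thread ($20-40k).
budget_low, budget_high = Decimal("20000"), Decimal("40000")

total = sum(quote.values(), Decimal("0"))
print(f"Total: ${total:,}")
print("Within budget range:", budget_low <= total <= budget_high)
print(f"Headroom under the top of the range: ${budget_high - total:,}")
```

Using `Decimal` rather than floats keeps the cents exact, which matters if the number is going straight into a grant budget line.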
 

zinfamous

No Lifer
Jul 12, 2006
111,864
31,359
146
Ahh, even if every single organism you're sequencing has never been sequenced, it still has relatives, and you can still use indices. And I doubt your PI works on entirely disparate branches from the tree of life, probably on a bunch of relatively related organisms - IOW your own lab should be building its own indices.

You're also not generating 500gb of data per week that actually needs long term storage unless your lab is sequencing the equivalent of 100 human genomes and annotating them...every week. Once those scaffolds are assembled, they're trashed.

yes, we have our common outgroups. we're interested in critters with funky sex chromosomes, so not only do we need the various configurations of sex chromosomes out there, but a decent portion of the autosomes so that we can track selection. --many of these have newly-evolving x chromosomes (or z, or whatever), so we can track selection from autosomes to the sex chromosomes (well, that's the hope).

I honestly don't know how much is tossed after it's processed and assembled, sure--there's plenty of junk that gets tossed once the reads come off the machine before you even start assembly. But after that, depending on who is doing what, I really couldn't say what we need to keep temporarily or long-term as projects tend to take strange directions throughout their lifespan.

But again, I'm a molecular kind of guy. This stuff is really not my bag. :\
 

ultimatebob

Lifer
Jul 1, 2001
25,134
2,450
126
96 GB of memory?!? Holy hell. Why do you need that much RAM for a single database server? I've built VMware host systems running 8 servers each that have less total memory than that.
 

Elbryn

Golden Member
Sep 30, 2000
1,213
0
0
random thoughts while reading the thread. what's your i/o profile going to look like? that'll drive the decision on what kind of disks you need.

why 5 500gb disks and 10 1tb disks? cost consideration? you wont be able to create a single raid group out of the set as raid will use the smallest disk in the array to calculate size.

is 10gb nic really necessary? do you have enough machines that will be sending data concurrently to demand that sort of pipe at the same time? if you do, then you may really want to go back to the first question and quantify your i/o needs because i dont think your backend sata disk is going to keep up with a 10gb pipe filling it. decide what you need performance or space.
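
A rough way to frame the pipe-vs-disk question: compare the link's line rate with the combined streaming rate of the disks behind it. Loud assumption: the 100 MB/s per SATA drive figure below is a placeholder, not a benchmark of any drive in the quote, and the function names are made up for illustration. It also ignores RAID overhead, controller limits, and random I/O.

```python
def link_mb_per_s(gbps):
    """Line rate of a network link in decimal megabytes per second."""
    return gbps * 1000 / 8

def bottleneck(link_gbps, n_disks, mb_per_s_per_disk):
    """Return (limiting_component, MB/s) for a simple streaming write.

    Ignores RAID overhead, controller limits, and random I/O, all of
    which would lower the disk side further.
    """
    net = link_mb_per_s(link_gbps)
    disks = n_disks * mb_per_s_per_disk
    return ("network", net) if net <= disks else ("disks", disks)

# ASSUMPTION: 100 MB/s per SATA drive is a placeholder, not a benchmark.
for gbps in (1, 10):
    part, rate = bottleneck(gbps, n_disks=10, mb_per_s_per_disk=100)
    print(f"{gbps} Gb/s link, 10 disks: limited by {part} at {rate:.0f} MB/s")
```

Plug in measured per-disk numbers for the real i/o profile; with small or random files the disk side drops a lot and the 10gb pipe matters even less.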

you also got some 10k sas drives in that server. going multiple tier storage? your 10k is going to be your higher performing disk, the md3000 with sata is long term?

Plan your raid- raid levels will reduce your total disk amount. dont plan on raw data size, plan with size after raid creation. higher-protection raid levels eat up more drives. raid 5 gets you n-1 times drive size in usable space, whereas raid 10 will eat up considerably more, since everything is mirrored before it's striped.
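
The capacity rules above can be sketched like this. It is a simplified model, assuming one raid group, no hot spares, and no filesystem overhead. The function name is made up, and the drive counts are the ones from the quote earlier in the thread.

```python
def usable_tb(drive_sizes_tb, level):
    """Usable capacity of one RAID group, in TB.

    Every member is treated as the size of the smallest drive.
    Supports RAID 5 (one drive's worth of parity) and RAID 10
    (mirrored pairs, so half the drives).
    """
    n = len(drive_sizes_tb)
    smallest = min(drive_sizes_tb)
    if level == 5:
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (n - 1) * smallest
    if level == 10:
        if n < 4 or n % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return (n // 2) * smallest
    raise ValueError("only RAID 5 and RAID 10 are handled here")

ten_1tb = [1.0] * 10
five_500gb = [0.5] * 5
print("10 x 1TB, RAID 5: ", usable_tb(ten_1tb, 5), "TB")
print("10 x 1TB, RAID 10:", usable_tb(ten_1tb, 10), "TB")
print("5 x 500GB, RAID 5:", usable_tb(five_500gb, 5), "TB")
print("all 15 mixed, RAID 5:", usable_tb(ten_1tb + five_500gb, 5), "TB")
```

Two separate raid 5 groups give 9 + 2 = 11 TB usable, while lumping all 15 drives into one group gives only 7 TB because every 1tb drive gets treated as a 500gb drive.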

that single server you added is bringing quite a bit of memory. is it going to be running calc's and apps in addition to being storage?

who's gonna be supporting the rig? if its you or someone in the lab, i'd get the simplest solution possible that meets the requirements. a tower server with as many 1-2tb drives as you can fit into it and acting as purely storage is the easiest route. that same tower can be upgraded to 2 quad core procs and a boatload of ram to also run calcs.

my suggestion is to make it simple and standalone unless you have support to take care of a more complicated setup.