
Question: 4 Petabyte SSD NAS Storage

fionaforeni

Junior Member
Hi friends!

Due to our storage needs, we need 4 petabytes (not petabits) of SSD storage (2 PB usable), with an obviously "fast" NIC / CPU attached. These will be attached to four NVIDIA DGX-A100 systems. The requirements are availability and low power consumption. This is for an academic university lab. Should we use a local server builder, or should we do it ourselves? We're risk takers and tech heavy, but we have no experience building such large storage arrays. It's cheaper in the long run for us to host our own GPUs than to rely on the cloud. This is for Toronto, Canada.

If we build it ourselves, my understanding is that we should use mirrored vdevs (no RAID). What other OS / software / tips / recommendations do you have?

Thanks!
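To put rough numbers on what mirrored vdevs imply for this build, here's a back-of-envelope sketch. The 30.72 TB drive capacity and 2-way mirror layout are assumptions of mine, not figures from this thread:

```python
import math

# Assumptions (mine, not the thread's): hypothetical 30.72 TB enterprise
# NVMe drives, arranged as 2-way mirrored vdevs (usable = raw / 2).
DRIVE_TB = 30.72          # capacity per drive, TB (decimal)
RAW_TARGET_TB = 4000      # 4 PB raw, as stated above
MIRROR_WIDTH = 2          # 2-way mirrors

# Mirror pairs needed to reach the raw target, then totals.
pairs = math.ceil(RAW_TARGET_TB / (DRIVE_TB * MIRROR_WIDTH))
drives = pairs * MIRROR_WIDTH
raw_tb = drives * DRIVE_TB
usable_tb = raw_tb / MIRROR_WIDTH

print(f"{drives} drives -> {raw_tb:.0f} TB raw, ~{usable_tb:.0f} TB usable")
```

So at those assumed capacities you're shopping for well over a hundred drives just to hit the raw target, before spares.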
 
4 PB is going to be costly....

I'm running 16 TB / RAID 10 on spinners with Linux as the host, and there's no need for RAID controllers since they're all tied to the mobo. Even using HBAs to get more ports wouldn't require a controller card.

With this sort of data / speed with SSDs, your NIC solution depends on the rest of the network's capabilities; you can go with a dual-port 100 GbE card at ~$1,100 per card.
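As a rough sense of what that dual-port 100 GbE link buys you, here's the time to fill the 2 PB of usable space at line rate. This assumes perfectly sustained line rate with no protocol overhead, so real numbers will be worse:

```python
# Assumption: two 100 GbE ports sustained at line rate, zero overhead.
link_gbps = 100 * 2               # aggregate link speed, Gb/s
throughput_gBps = link_gbps / 8   # -> 25 GB/s
usable_gb = 2_000_000             # 2 PB usable, in GB (decimal)

seconds = usable_gb / throughput_gBps
hours = seconds / 3600

print(f"~{hours:.1f} hours to fill 2 PB at line rate")
```

In other words, even the best case is the better part of a day to drain or fill the pool over the network.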

I don't know which drives you're planning on using to get into the PB range, as most flash drives I've seen top out at 32 TB, IIRC.

This type of setup is going to run very hot and will need some serious cooling to keep it running at speed and stable.
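To ballpark the heat load for the drive shelf alone, here's a quick sketch. Both the drive count and the ~20 W per-drive figure are assumptions of mine for illustration, not specs from this thread:

```python
# Assumptions (mine): ~132 drives, ~20 W per enterprise NVMe SSD under load.
drives = 132
watts_per_drive = 20

drive_watts = drives * watts_per_drive   # total draw for the drives
btu_per_hour = drive_watts * 3.412       # watts -> BTU/h conversion

print(f"~{drive_watts} W (~{btu_per_hour:.0f} BTU/h) for drives alone, "
      "before CPUs, NICs, and fans")
```

That's several kilowatts of continuous heat the room's cooling has to absorb, before the servers themselves are counted.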

It's likely to take a couple of chassis to mount all of the drives in and interconnect them.
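A quick count of chassis, assuming (my numbers, not the thread's) ~132 drives and a common 24-bay 2U NVMe chassis:

```python
import math

# Assumptions (mine): ~132 drives total, 24-bay 2U chassis.
drives = 132
bays_per_chassis = 24
rack_units_each = 2

chassis = math.ceil(drives / bays_per_chassis)
rack_units = chassis * rack_units_each

print(f"{chassis} chassis, ~{rack_units}U of rack space for drive shelves")
```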

What you're talking about doing is more along the lines of a SAN rather than a NAS due to the scale. https://hypertecdirect.com/knowledge-base/nas-vs-san/

You can structure things either way though.

 
Dude.... no way should you be hacking together a 4 PB array by yourself. That needs custom software and support to manage properly. I'd suggest Dell EMC, and I'd suggest a hybrid flash / spinning-rust storage solution. You're looking at 6 figures done properly.
 
This is for an academic university lab.
6 figures done properly.
Completely agree with this. At this point, just looking at the drive costs alone, it's not cheap by any means, but it's all relative to budget expectations, which aren't mentioned.

I think stacking clusters together is probably going to be the best option rather than a single rack, but without more info it's hard to say. I've been around those EMC/Dell units the size of a small car before, and they're impressive. Higher caliber than a student lab would be using in most cases.
 
You're looking at 6 figures done properly.
+1
And we're not talking low 6 figures, but mid to upper 6 figures.

@ OP: You're not going to get an OK on your budget / planning from the controller when you tell him you want to drop that much on parts to mess around with and possibly still not get working properly.

Even trying to source the parts through vendors will be difficult. It's not something you can pop over to Newegg or Amazon for... because if you think you're gonna find those specialized parts on Amazon, refer to my last sentence in this post.
I don't think you'll be able to source all of it even at CDW, which means you'll most likely need to go direct to the source.

OP, if you came here to ask the questions you're asking now... stop.
If you had even remote knowledge and skill to set up this machine, you would have just done it, and not come to the forums asking for advice.

I stand by pinky's statement... get it done properly through Dell / HP / Lenovo / Supermicro direct, with a full enterprise-grade warranty included.
 
For connection to DGX, I'd look at Vast, DDN, or PowerScale's F600/F900 nodes. These will all have a cost that may not be within budget, but are robust solutions backed by a support team for when stuff breaks. They're also fairly turnkey so they will have a quick time-to-value. All support 100Gb/s ethernet and/or Infiniband. Most importantly, they all have been tested and have solutions to run in conjunction with DGX.
 