Why are 15K HDDs not dead?

etherealfocus

Senior member
Jun 2, 2009
488
13
81
http://www.anandtech.com/show/10726...rprise-performance-15k-hdds-with-nand-caching

Is it just compatibility with existing datacenter infrastructure?

-2M hrs MTBF matches the 850 Pro and isn't that much higher than enthusiast SSDs like the 600p and 950 Pro. TBW numbers also seem roughly comparable.

-SSD has massive advantages in performance, heat, power consumption, and lack of vibration.

-$/GB probably favors the HDD, but 1TB SSDs just aren't that expensive anymore; a 1TB 850 Pro is only $430 on Amazon and has 10% more usable capacity than the 900GB 15k drive in the article... which I'd imagine goes for $300ish minimum. What datacenter wouldn't pay a bit extra for improved density and performance, and reduced wear and heat?
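
Quick back-of-the-envelope on those numbers (the ~$300 figure for the 15k drive is just my guess, so treat it as a placeholder):

Code:
# Rough $/GB and MTBF-to-AFR math using the figures quoted above.
# The 15k drive price (~$300) is a guess, not a quoted price.
ssd_price, ssd_gb = 430.0, 1000.0   # 1TB Samsung 850 Pro at the Amazon price above
hdd_price, hdd_gb = 300.0, 900.0    # 900GB 15k drive, estimated price

print(f"SSD: ${ssd_price / ssd_gb:.3f}/GB")   # ~$0.430/GB
print(f"HDD: ${hdd_price / hdd_gb:.3f}/GB")   # ~$0.333/GB

# 2M-hour MTBF as a rough annualized failure rate (AFR ~ hours per year / MTBF):
mtbf_hours = 2_000_000
print(f"AFR at 2M hrs MTBF: {8760 / mtbf_hours:.2%}")  # ~0.44%

So the HDD still wins on raw $/GB, just not by a huge margin.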
 
  • Like
Reactions: PliotronX

CiPHER

Senior member
Mar 5, 2015
226
1
36
Hard drives are proven devices. They are much simpler than SSDs, which essentially run their own operating system. That complexity is good for several things, but it also makes SSDs prone to obscure bugs.

Hard drives are thus good for conservative users who want reliability and compatibility. The 15k ones still do well enough for many tasks, even though SSDs go through the roof in terms of IOPS. But not everyone needs 70,000 IOPS; reliability and compatibility with existing software matter more.

For example, consumer Samsung SSDs are basically incompatible with RAID or complex storage: they can "go back in time" (roll back to an older state after power loss), which can lead to unforeseen consequences in server-like setups. With hard drives you buy something much simpler, so the system administrator actually knows what is going on. SSD firmware is opaque and too complex to know what it is doing at all times.

Other than that, all mechanical storage is dying, with only low-rpm hard drives remaining suitable for mass storage for another decade or so. The high-rpm drives are rapidly becoming a niche, mainly for conservative clients who do not want to replace their failed HDDs with SSDs and prefer 'tried and proven' technology over cutting-edge SSDs.
 

PliotronX

Diamond Member
Oct 17, 1999
8,883
107
106
The Seagate 15k drives cook themselves, so the only conclusion I'd draw is that they're proven to fail. The umpteenth-generation SSD controllers have damn good reliability now, but I get what you're saying. Just as the cold dead hands hold onto XP, some people are just so used to them. I don't see the point of them personally. A tiered storage configuration with large, reliable 7200 RPM drives will make all the drives last longer, with speed benefits the 15k cannot match.
 

Torn Mind

Lifer
Nov 25, 2012
12,035
2,763
136
Data recovery should be more reliable on spinners than on SSDs. It should also be less expensive. Before you say "well, make backups": recovery companies somehow manage to stay in business after all these years, because sometimes that busted disk really was the only disk that had the vital data.
 

dtgoodwin

Member
Jun 5, 2009
152
8
81
I think much of it has to do with how servers are operated and how storage in general is configured. Most corporations of any size are using SANs for storage aside from the OS on a given server. These SANs are already tiered with different levels of capacity/performance. I can guarantee that an SSD in those systems isn't just double or triple the price of a 15k drive, it's far more than that, and the options aren't to go to Best Buy or order a Samsung from Newegg; the drives are unique to the platform. The same is even true of local storage for servers. No admin will put in their own drives and break whatever warranty they have on them. The SSDs are purchased and supported by the vendor, again, not at a price that's anywhere close to a 15k drive. Spinning drives are plenty good enough for the OS and tasks like print server queues. Who cares if a server can boot in 20 seconds when it takes 5 minutes to POST?
 

etherealfocus

Senior member
Jun 2, 2009
488
13
81
Cipher - we wouldn't be comparing enterprise 15k HDDs to consumer SSDs. Enterprise buyers would be looking at the Intel DC P3700 or similar.

Tried and true logic may be a factor, but I'd almost cringe at the thought.

a) Vendors like Intel are more than happy to provide heaps of case studies, on-site consultations, etc to make a deal on thousands of high-margin drives.

b) SSDs are used in plenty of datacenters from Rackspace to Amazon to Granny Web Host. I get that you might need HDDs for very specialized stuff like custom enterprise databases where the system is heavily optimized for drive-specific characteristics... but aside from that?

Pliotron - agreed. There's just zero advantage aside from a bit lower cost per gig... but what enterprise-grade customer prioritizes cost per gig over density, heat, power, and performance?!

Torn - who on earth buys a stupid-expensive HDD for better chance of recovery before buying a RAID and/or offsite backups? SMBs will use the 3-2-1 rule; enterprises will do something custom or contract out to a vendor (who's likely using 7200 platters).

dtgoodwin - SAN vendors are happily selling SSD products. Take this Dell unit for example: http://www.dell.com/us/business/p/equallogic-ps6000s/pd
Note that it specifically targets companies looking to upgrade from 15k spinners.

Also I'm pretty sure precisely zero people are buying 15k drives for their print servers. And not many more for OS drives. :p

The fact that you can't buy them without calling a Seagate rep says they're meant mostly for bulk orders from midsize businesses and enterprises. Aside from replacing failing units in an existing infrastructure, I just can't see a market. Anyone needing 15k-level performance probably won't blink at coughing up a bit more for major wins in every important metric except $/GB. Remember, at the bulk level direct drive cost is less important than cost over time, density (cutting your rack count alone can save a pile of money), etc. And yeah, anyone filling racks has a backup solution. And if they don't, I'm quite sure they aren't buying spinners on the off chance they need to pull a drive and run System Recovery on it. :p
 

PliotronX

Diamond Member
Oct 17, 1999
8,883
107
106
I guess there is one benefit: in the impossible yet entirely too probable scenario where a server has failing backups and gets hit with ransomware, it would take longer to corrupt the data :p
 

imported_ats

Senior member
Mar 21, 2008
422
63
86
The market for 15k drives is pretty much legacy systems at this point, or at least it should be. For any new system, companies would be basically stupid to buy rust for T0/T1 storage. For anything I/O-bound, SSDs are already a fraction of the cost per IOPS, and for anything capacity-bound, 15k drives cannot compete with 3.5" drives.
 

Red Squirrel

No Lifer
May 24, 2003
69,998
13,484
126
www.anyf.ca
Drive life may be a big factor too: if you need a lot of IOPS, it most likely also means you are writing the drives hard, such as in a DB or VM environment. With SSDs you'll wear them out in a matter of a few years if they get constant I/O. With spinners, failure is based mostly on randomness rather than a linear path to death, so their useful life can often exceed SSDs in high-I/O applications.

On the other hand, most businesses replace all their hardware once the warranty is out, which is typically 3 years (I always found that a huge waste, but I see why they do it), so perhaps SSD wear isn't that big of an issue anyway. You can also track wear through SMART, so SSDs could really be treated like printer cartridges and replaced as they get low. With RAID it would be fairly seamless.
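
To put rough numbers on the wear-out point, here's a quick endurance estimate; the TBW rating and write rate below are made-up placeholders, so plug in your drive's rated TBW and the write rate you actually see in SMART:

Code:
# Rough years-to-wearout for an SSD under a steady write load.
# Both inputs are placeholders -- use the drive's rated TBW and your own
# measured write rate (e.g. from SMART "Total LBAs Written" deltas).
def years_to_wearout(rated_tbw, writes_tb_per_day, write_amplification=1.0):
    """Years until the rated write endurance is consumed."""
    return rated_tbw / (writes_tb_per_day * write_amplification * 365)

# Example: a drive rated for 1,200 TBW being written at 0.5 TB/day
print(round(years_to_wearout(1200, 0.5), 1))  # ~6.6 years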
 

fleshconsumed

Diamond Member
Feb 21, 2002
6,486
2,363
136
I think much of it has to do with how servers are operated and how storage in general is configured. Most corporations of any size are using SANs for storage aside from the OS on a given server. These SANs are already tiered with different levels of capacity/performance. I can guarantee that an SSD in those systems isn't just double or triple the price of a 15k drive, it's far more than that, and the options aren't to go to Best Buy or order a Samsung from Newegg; the drives are unique to the platform. The same is even true of local storage for servers. No admin will put in their own drives and break whatever warranty they have on them. The SSDs are purchased and supported by the vendor, again, not at a price that's anywhere close to a 15k drive. Spinning drives are plenty good enough for the OS and tasks like print server queues. Who cares if a server can boot in 20 seconds when it takes 5 minutes to POST?
Like this guy said, it's because the enterprise server market a) charges an arm and a leg for "enterprise" SSDs and b) refuses to warranty hardware if you use unapproved (i.e. your own) drives. We went through this a year ago. HP point-blank refused to warranty their servers if we used Intel DC S3700 drives, and their own HP-labeled approved SSDs were 4x more expensive than Intel's, pushing the price from a semi-reasonable $12-15K per server to almost $50K. Our dreams of cheap, affordable SSD-powered servers were shattered because we didn't have a $50K budget for a single server and because we did not want to lose the warranty on the rest of the hardware. The funny thing is, we did get a few of those Intel SSDs in just to try, and they worked flawlessly in the HP box. I do believe that both Dell and HP relabel these Intel drives and resell them as their own at 4x the price. Disgusting, but that's how the world works.
 

[DHT]Osiris

Lifer
Dec 15, 2015
17,016
16,147
146
Like this guy said, it's because the enterprise server market a) charges an arm and a leg for "enterprise" SSDs and b) refuses to warranty hardware if you use unapproved (i.e. your own) drives. We went through this a year ago. HP point-blank refused to warranty their servers if we used Intel DC S3700 drives, and their own HP-labeled approved SSDs were 4x more expensive than Intel's, pushing the price from a semi-reasonable $12-15K per server to almost $50K. Our dreams of cheap, affordable SSD-powered servers were shattered because we didn't have a $50K budget for a single server and because we did not want to lose the warranty on the rest of the hardware. The funny thing is, we did get a few of those Intel SSDs in just to try, and they worked flawlessly in the HP box. I do believe that both Dell and HP relabel these Intel drives and resell them as their own at 4x the price. Disgusting, but that's how the world works.

All this. We have a guy who ordered an all-SSD server (I/O-intensive purposes, plus he had money to burn); it was probably 10x as expensive as a platter version. It's all about the $$$$ right now.
 

etherealfocus

Senior member
Jun 2, 2009
488
13
81
Why the focus on HP servers over, say, Supermicro through serversdirect.com or similar? Lots of companies will be happy to service third party hardware (local companies here that do that include Prototype IT and Network Elites). If you're on the SMB side and your IT fits in a single rack, it's pretty economical to either hire a contractor or have someone handle it in-house - especially compared to HP prices. Spending a bit of the savings on some extra hardware for redundancy is a no-brainer IMHO.

I have less experience on the enterprise side. Can Supermicro etc builds not be scaled competitively?
 

dtgoodwin

Member
Jun 5, 2009
152
8
81
Why the focus on HP servers over, say, Supermicro through serversdirect.com or similar? Lots of companies will be happy to service third party hardware (local companies here that do that include Prototype IT and Network Elites). If you're on the SMB side and your IT fits in a single rack, it's pretty economical to either hire a contractor or have someone handle it in-house - especially compared to HP prices. Spending a bit of the savings on some extra hardware for redundancy is a no-brainer IMHO.

I have less experience on the enterprise side. Can Supermicro etc builds not be scaled competitively?

When I was responsible for calling for repairs on our x86 servers, we had a 2-hour initial response window with IBM, guaranteed same day visit, and a guarantee on replacement parts within 24 hours (most were on the truck with the tech). You won't find that outside of tier 1 vendors unless you are a small shop providing your own support and have plenty of replacement parts on hand.
 

fleshconsumed

Diamond Member
Feb 21, 2002
6,486
2,363
136
Why the focus on HP servers over, say, Supermicro through serversdirect.com or similar? Lots of companies will be happy to service third party hardware (local companies here that do that include Prototype IT and Network Elites). If you're on the SMB side and your IT fits in a single rack, it's pretty economical to either hire a contractor or have someone handle it in-house - especially compared to HP prices. Spending a bit of the savings on some extra hardware for redundancy is a no-brainer IMHO.

I have less experience on the enterprise side. Can Supermicro etc builds not be scaled competitively?
When I was responsible for calling for repairs on our x86 servers, we had a 2-hour initial response window with IBM, guaranteed same day visit, and a guarantee on replacement parts within 24 hours (most were on the truck with the tech). You won't find that outside of tier 1 vendors unless you are a small shop providing your own support and have plenty of replacement parts on hand.
Yeah, you absolutely can save money by doing a custom build. Most of the time you can build your own server that's faster and cheaper than HP/Dell/IBM. However, you do lose the support. That means you either have to be OK with potential downtime, or you need to be committed to keeping spare parts on hand and a first-rate IT team capable of quickly troubleshooting and fixing the problem. The bigger the company, the less willing it is to tolerate downtime; everything gets put behind 10 layers of bureaucracy, and no manager wants to be held responsible for 48 hours of downtime because they tried to save some money by going custom.
 

imported_ats

Senior member
Mar 21, 2008
422
63
86
When I was responsible for calling for repairs on our x86 servers, we had a 2-hour initial response window with IBM, guaranteed same day visit, and a guarantee on replacement parts within 24 hours (most were on the truck with the tech). You won't find that outside of tier 1 vendors unless you are a small shop providing your own support and have plenty of replacement parts on hand.

The scary part is that it is often cheaper to buy two quality whitebox/Supermicro boxes and keep one as a spare than to buy through a Tier 1 vendor and pay their support costs.

In college, it was cheaper for the university to self-warranty hardware and just buy a room of spares than to pay for actual support. The only things they actually paid for support on were super-exotic hardware like fully integrated supercomputers.
 

thecoolnessrune

Diamond Member
Jun 8, 2005
9,673
583
126
I've seen most of the gamut across our customers. With HPC customers, we see a *ton* of King Star Computers (a Supermicro systems integrator) systems. They buy the systems, throw them in the rack, and start running workloads. All they care about is a service-life warranty (3-5 years). If a box takes a dump during that time, the server is taken offline for repair by a tech.

Many of our other customers are deep in gear, often of the UCS variety: 4-hour part delivery with field engineers. Several of our NetApp / EMC / IBM storage customers have 2-hour part delivery with a field engineer.

Especially with storage, the support costs are structured so that after year 3 it's cheaper to get a new system than to re-up support. That's how they keep their income going and avoid having to support ancient systems.

I saw the invoice for one of our clients registering a new NetApp FAS cluster with 2-hour hardware and software support. The 3-year support cost was $3 million!

Enterprise storage is $$$, and can't be thought about the same way you think about getting some drives from Newegg.
 

XavierMace

Diamond Member
Apr 20, 2013
4,307
450
126
Yeah, you absolutely can save money by doing a custom build. Most of the time you can build your own server that's faster and cheaper than HP/Dell/IBM. However, you do lose the support. That means you either have to be OK with potential downtime, or you need to be committed to keeping spare parts on hand and a first-rate IT team capable of quickly troubleshooting and fixing the problem. The bigger the company, the less willing it is to tolerate downtime; everything gets put behind 10 layers of bureaucracy, and no manager wants to be held responsible for 48 hours of downtime because they tried to save some money by going custom.

Pretty much THIS. I love Supermicro; I've got 3 Supermicro boxes at the house currently. But there's not a chance in hell of my employer buying Supermicro, because they want support that will have a part onsite in 4 hours, not 4 days. We're in the banking industry; days of downtime waiting for a part to arrive isn't an option. Having spare parts onsite sounds awesome in theory; it's usually a mess in practice. We run UCS/NetApp on our side, and most of our clients run HP.
 

Red Squirrel

No Lifer
May 24, 2003
69,998
13,484
126
www.anyf.ca
I've never understood why enterprises overpay... I mean, I do understand, and I know why they do it, but I just don't get WHY, if that makes sense. Basically, I think the reasons are weak when you weigh the high costs against going whitebox/Supermicro, simply overbuilding the system, and keeping spare parts.

A typical enterprise will use enterprise-grade gear that costs $$$, won't keep spares, and will want to replace it all when the 3-5 year warranty is over! So they spend millions on something that is only going to be around for 5 years at most, then rinse and repeat. Yeah, there are perks like the 4-hour support window, etc. But they could save so much money if they simply went whitebox, in-house, with enough extra redundancy that they can afford for one box to be down for a week if it has to be, then order parts or keep spares. Basically, shift IT away from doing tickets and toward actually building and supporting systems. The silly user-ticket stuff can be shifted 100% to the help desk.

Of course, there is the whole blame game: by going with an OEM they can shift blame. But that's just a mentality thing. How about dropping the blame game, stop acting like kids, and just deal with problems as they happen and move on?

Also, the big head honchos tend to have the odd idea that enterprise gear never fails. Oh, it does. And when it does, it's not pretty: you're dead in the water waiting on someone else to hopefully figure it out. A self-built system is better understood and easier to troubleshoot.

The way I see it, everything can fail, so get the most reasonable quality/price combo you can, build in as much redundancy as you can, and keep cold spares too. That beats paying an arm and a leg for something just because it has support, while having less redundancy and no spares.

Personally, at home I tend to go Supermicro and it's been rock solid. I used to go pure whitebox with consumer parts, but I find Supermicro is the happy medium: it's "enterprisey" without paying an arm and a leg. One of my servers is a 24-bay with redundant PSUs.
 
  • Like
Reactions: PliotronX

etherealfocus

Senior member
Jun 2, 2009
488
13
81
Yeah that's pretty much where I was going. If you use commodity hardware, setting up a few machines as hot spares (even throwing one in a colo somewhere in case sh!t really goes south) is a pittance compared to an IBM contract.

I deployed a build last year using 3x minimalist servers: Xeon E3, 32GB RAM, 250GB boot/app SSD, 1TB data SSD. No RAID, no redundant PSU, no nonsense. Put two at the primary site and another at the client's other office. Cost well under $15k all told. A single machine was plenty, so they had an onsite hot spare and an offsite hot spare. Kept a 4th spare machine in the supply closet just in case.

It's certainly possible they could've all failed at once, but it most likely would've been a software issue where the client would've been on their own anyway.

In theory, I don't see why a similar philosophy couldn't be applied at the rack or blade level...?
 
  • Like
Reactions: PliotronX

Red Squirrel

No Lifer
May 24, 2003
69,998
13,484
126
www.anyf.ca
I would at least do RAID, mind you, but it also depends on how important uptime is. As long as you have spares on hand, you're good to go. A redundant PSU is nice too, but personally I reserve that for machines that are themselves single points of failure, such as the storage server: one supply plugged into the UPS and the other into the wall. In a dual-conversion setup I'd plug each into a separate inverter. It's not so much in case a PSU fails (that's probably just as rare as a mobo, RAM, etc. failing) but in case a power source fails (UPS/inverter dies, etc.).

The funny thing is I've seen super-expensive enterprise equipment paired with really crappy power infrastructure: no redundant PSUs, no redundant power paths, etc. It kinda defeats the purpose lol.

The best is people who get a redundant PSU and then feed both supplies from a single Y cable.
 

XavierMace

Diamond Member
Apr 20, 2013
4,307
450
126
Spoken like somebody who's never managed a large, geographically diverse network. :) Literally buying two (or more) of everything is going to eat through your savings in a heartbeat, and that doesn't factor in the manpower aspect. Take, for example, this scenario that happened with one of my clients a while back.

A branch calls in reporting the "network" is down. After some phone troubleshooting I'm able to narrow it down to the onsite ESXi host. iLO is either dead or non-functional, so I walk the user through bouncing the box and reading off what they see. The array is toast.

So, let's say they had spare equipment onsite (they don't). I'm personally 2,000 miles away from them. One regional engineer is 7.5 hours away doing a DR test at a different client, so he's not going to be free for at least a few more hours on top of the travel time. The other engineer for the area is out of state on vacation. If this were a whitebox build, they'd be boned. Even if they had parts onsite, there's nobody there with the technical skill set to troubleshoot and replace server equipment.

But it's an HP under contract, so they have a guy onsite with parts in 4 hours. Suddenly that vendor maintenance isn't seeming quite so expensive. We've got 120 clients spread across the US, plus some in the Caribbean, totaling over 20,000 devices at this point (just counting servers/infrastructure). It's not financially feasible to have enough in-house coverage to get a 4-hour response time across that kind of geographic area.

Even if I were to look at it on a per-client basis (meaning from their side), it still doesn't work out. We have clients with locations that are 12+ hours apart and require a plane to get to, meaning they would need dedicated engineers at each location to get same-day response. That's $60k/yr per engineer to support what's usually a single mid-tier x86 box (DL380/PowerEdge R7x0).
 

Red Squirrel

No Lifer
May 24, 2003
69,998
13,484
126
www.anyf.ca
Wouldn't you have techs at each geographic location? That seems like something that would make a lot of sense. You still need someone on site to let the IBM/HP/etc. guy into the server room. I guess it depends how lax the security is; maybe the secretary or someone can have an access fob to open the door.
 

XavierMace

Diamond Member
Apr 20, 2013
4,307
450
126
Like I said, figure $60k/yr per engineer. More like $100k/yr in places like California. Some of our clients have 50+ locations. Having people at each location isn't financially sound for any business of size.

It's been a while since I was involved in the ordering process, but back when I did presales configs, on a $3.5k HP build, 5-year 24x7 4-hour support was about the same price as the server. $3.5k for a no-questions-asked 5-year warranty, or $60k/yr (plus parts) for somebody to sit onsite in case it breaks?
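
Rough math on that trade-off, using the same round numbers (placeholders, not real quotes):

Code:
# Back-of-the-envelope: vendor support contract vs. a dedicated onsite engineer.
# Figures are the round numbers from this thread, not real quotes.
years = 5
contract_total = 3500              # ~price of 5yr 24x7 4hr support on a $3.5k box
engineer_total = 60_000 * years    # $60k/yr salary, parts not included

print(f"5yr support contract: ${contract_total:,}")
print(f"5yr onsite engineer:  ${engineer_total:,}")
print(f"Ratio: ~{engineer_total / contract_total:.0f}x")  # ~86x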

Branch manager has access, it's their building, that's not lax security.
 

fleshconsumed

Diamond Member
Feb 21, 2002
6,486
2,363
136
Yep, and ideally you'd need at least two or three engineers for every location in case one gets sick and another goes on vacation; you'd also have to make sure they can't all take vacation at the same time, that at least one of them is always on call, and deal with all the associated headaches. Building your own stuff works when you're really small and can tolerate downtime, or when you're really big like Google and building and supporting your own stuff can save major money. Otherwise, for an enterprise customer who cannot tolerate downtime, it's easier and cheaper to go with a support contract.
 

thecoolnessrune

Diamond Member
Jun 8, 2005
9,673
583
126
Yep, and ideally you'd need at least two or three engineers for every location in case one gets sick and another goes on vacation; you'd also have to make sure they can't all take vacation at the same time, that at least one of them is always on call, and deal with all the associated headaches. Building your own stuff works when you're really small and can tolerate downtime, or when you're really big like Google and building and supporting your own stuff can save major money. Otherwise, for an enterprise customer who cannot tolerate downtime, it's easier and cheaper to go with a support contract.

You're thinking logically, but these are enterprises we're talking about. We manage a $2 billion customer that staffs one guy on a site at all times. When something goes down, they have to have that one guy drive up to 3 hours to go on-site.

All the support contracts in the world are useless when the one guy with the keys to the site is out and not answering his or her phone.