qd > 1 workloads?

boogerlad

Junior Member
Aug 28, 2009
8
0
66
Hi, I'm curious what client/consumer workloads there are that have a queue depth greater than 1. Please list transfer sizes and whether the access is random or not. I know any database software with a sufficient number of connections and queries may trigger a high queue depth.

Many manufacturers quote 4K random write IOPS numbers nearing 100,000, but is that a meaningless number? Correct me if I'm wrong, but most of these are best-case rather than steady-state figures, measured at queue depths ranging from 32 to 128. Therefore, they aren't directly comparable unless the drives are tested by a common reviewer, right? At AnandTech, would this be the closest thing to verifying manufacturers' claims?
[Attached image: AnandTech 4KB random write performance graph]


Doing the math of IOPS × transfer size seems to give numbers closest to that. However, one must look at the "performance consistency" page to see the worst-case performance, right?
 

razel

Platinum Member
May 14, 2002
2,337
93
101
QD > 1 with an SSD is going to be rare for the home user. Even when you do get above 1, you wouldn't care. IOPS is not a meaningless number. I think you just misunderstand what IOPS means versus what the Anandtech graph is showing.

You are digging waaay too far looking for meaning in the benchmarks. I'd step back into reality and just enjoy the state of storage. It's amazing that we are capable of saturating SATA and could seriously, in the near future, saturate the CPU/PCH link bandwidth, which I think is roughly 20 Gb/s.

We do not process information that quickly, so the next SSD game will be latency.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
You pretty much have it, but QD on SATA maxes out at 32. The specs manufacturers quote typically come from a drive that is either fresh or has tons of free space and TRIM, with a specifically tuned workload (for SF drives, almost always ATTO). But there's not much to worry about. You get a newish-gen SSD with fast NAND, run it in AHCI or RAID, and then use your PC.

QD > 1 will not be uncommon, but only just (>4 is going to be on the rare side of things). Anything that reads or writes more than one thing at a time to non-adjacent addresses will do it, including reading fragmented files, dealing with thumbnails and general Explorer metadata indices, or listing any big directory, for instance.

Very high queue depths need high IO concurrency, which, for a single user, generally only happens when installing or removing software, or syncing lots of small files (like a source tree). Such work is so fast on even the slowest modern SSDs that you won't notice a difference.
 

boogerlad

Junior Member
Aug 28, 2009
8
0
66
IOPS is not a meaningless number. I think you just misunderstand what IOPS means versus what the Anandtech graph is showing.

What is the meaning of it then, if there are very few workloads that will come close to that number? What is the proper way of interpreting AnandTech's graph then? I'm not concerned about performance. I'm very satisfied with my current storage setup. I just want to increase my knowledge.
 

Wall Street

Senior member
Mar 28, 2012
691
44
91
What is the meaning of it then, if there are very few workloads that will come close to that number? What is the proper way of interpreting AnandTech's graph then? I'm not concerned about performance. I'm very satisfied with my current storage setup. I just want to increase my knowledge.

For it to be meaningless would mean that NO workloads come close to that number, but some still do. For example, the servers that house databases could very well require very high 4K QD32 performance, because they can generate that kind of scenario when thousands of users demand simultaneous access or when a search requires deep inspection of the data. For these databases, using SSDs is almost like adding more RAM; everything just feels faster.

It is meaningless if you are deciding what to put in your home PC. Home PCs are built to run acceptably on spinning platters that manage ~100 IOPS, and it is very hard to feel the difference between a 500x speedup and a 1,000x speedup. For example, a program that loads in 30 seconds on a 100 IOPS hard drive (3,000 random accesses) will load in 0.06 seconds at 50,000 IOPS and 0.03 seconds at 100,000 IOPS. At that point, most programs fall back on the CPU as the loading bottleneck or are fast enough to feel instant on any SSD.
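Just as a back-of-the-envelope check on those figures (the 3,000-access count is Wall Street's example; the arithmetic below simply divides accesses by IOPS and ignores CPU time and overlapping requests):

```python
# Rough load-time estimate: a program needs a fixed number of random 4K reads,
# and we ignore CPU time and request overlap (i.e. a QD=1 worst case).
ACCESSES = 3_000  # example figure from the post above

for iops in (100, 50_000, 100_000):  # HDD, mid-range SSD, faster SSD
    print(f"{iops:>7} IOPS -> {ACCESSES / iops:6.2f} s for {ACCESSES} accesses")

# Prints roughly:
#     100 IOPS ->  30.00 s ...
#   50000 IOPS ->   0.06 s ...
#  100000 IOPS ->   0.03 s ...
```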
 

boogerlad

Junior Member
Aug 28, 2009
8
0
66
I'm aware of that. It just seems to be a meaningless marketing tactic for consumer drives. Enterprise drives, on the other hand, I completely understand, but their datasheets are far more verbose.
 

razel

Platinum Member
May 14, 2002
2,337
93
101
It just seems to be a meaningless marketing tactic for consumer drives.

That is the eventual answer to most of what you asked.

Another answer: IOPS = input/output operations per second. The graph you showed is in MB/s, not IOPS. You'll need to find another review that shows IOPS; pcper does.

By the way, I just went through pcper's video review of Intel's NVMe DC P3700. Wow. At a queue depth of 1 it does more IOPS (50k) than the best SSD at a queue depth of 32 (40k). The CPU is also processing so much information that they were getting 50% CPU usage, and NVMe is more efficient software-wise than SATA. Stunning.

It's also $1.50 per GIG for the lowest enterprise model. I could be wrong about your upgrade... I think PCIe NVMe is what you'll want.
 

Deders

Platinum Member
Oct 14, 2012
2,401
1
91
I'm aware of that. It just seems to be a meaningless marketing tactic for consumer drives. Enterprise drives, on the other hand, I completely understand, but their datasheets are far more verbose.

My guess is that tens of thousands of IOPS looks far more impressive, marketing-wise, than 20MB/s for 4K reads, for instance. 20MB/s is still 100x faster than a spinning drive.
 

boogerlad

Junior Member
Aug 28, 2009
8
0
66
@razel
I posted that graph aware that it is in MB/s. The last line of my post stated that you can convert IOPS to MB/s by simply multiplying IOPS by the transfer size; in this case, IOPS × 4KB / 1024 gives MB/s.
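For the record, here's that conversion as a couple of lines of Python (the 90,000 and 100 IOPS inputs are just round illustrative figures, not values read off the graph):

```python
def iops_to_mbps(iops: float, transfer_size_kb: float = 4) -> float:
    """Convert IOPS at a given transfer size (in KB) to MB/s."""
    return iops * transfer_size_kb / 1024

print(iops_to_mbps(90_000))   # a spec-sheet-style 4K figure -> ~351.6 MB/s
print(iops_to_mbps(100))      # a typical HDD 4K random figure -> ~0.39 MB/s
```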
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
What is the meaning of it then, if there are very few workloads that will come close to that number? What is the proper way of interpreting AnandTech's graph then? I'm not concerned about performance. I'm very satisfied with my current storage setup. I just want to increase my knowledge.
IMO, they should, every few reviews, add a plain 7200 RPM HDD and one good old SSD to the graph. That would make useful interpretation a lot easier.

In all honesty, most SSDs are simply fast enough for client use, in terms of throughput, so long as they aren't filled up. I do put some to use with 2-3 VMs at a time, but even then, anything using a current-gen controller with sync NAND is about as good as any other. I go by $/GB, apparent likelihood of after-sale issues, and expected quality and speed of after-sale service, more than performance, for the most part.

Most of the meaning has gone away as SATA SSDs have improved. Even good older SSDs, like Intel's X25 and 320, or Crucial's M4, could be pushed into a corner and end up performing worse than HDDs. Now there are a handful of potentially performance-fragile models, but most have worst-case performance in the range of 1k-5k IOPS, increasing on average every generation. With small price increases for higher performance consistency, there's sometimes little reason not to stack the deck, but it's generally not worth paying a ton more for a faster SSD when the $/GB sweet spots are so fast.

Overall, the next step is reducing avg/max latency much further, which will require NVMe and then software optimization (where in the OS? Where in the firmware? Until more NVMe drives come out, that won't be known well enough, much less how to handle it). Throughput has largely surpassed any non-server user's needs, and over the next few years controllers made for NVMe will surely come out with far more bandwidth capability than we see today.
 

SSBrain

Member
Nov 16, 2012
158
0
76
Keep in mind that the faster an SSD can process random small-block data (in other words, the more IOPS it's able to handle), the shorter the queue becomes for a given workload. Under real-life usage scenarios you don't usually "have" a fixed QD32 workload, but rather one that can fill the queue up to (and beyond, at least at the OS level) that depth if the SSD can't gulp it down fast enough.
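One rough way to picture that relationship (my own back-of-the-envelope framing, with made-up round numbers, not anything from the screenshots below): the average number of outstanding requests is roughly the request arrival rate times the average service time, so a device that completes each request ten times faster carries roughly a tenth of the queue for the same workload.

```python
# Rough illustration: average queue depth ~= request rate * average service time.
# The rates and latencies below are made-up round numbers, not measurements.
def avg_queue_depth(requests_per_second: float, service_time_s: float) -> float:
    return requests_per_second * service_time_s

burst_rps = 10_000                              # a bursty stream of 4K requests
print(avg_queue_depth(burst_rps, 100e-6))       # fast SSD, ~100 us  -> QD ~1
print(avg_queue_depth(burst_rps, 1e-3))         # slow SSD,   ~1 ms  -> QD ~10
print(avg_queue_depth(burst_rps, 10e-3))        # HDD,       ~10 ms  -> QD ~100
```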

On Windows, you can check out the average read and write queue depth by using the Performance Monitor and choosing the appropriate performance indicator under "Physical Disk".
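For a command-line alternative to the Performance Monitor GUI, a sketch along these lines should log the same averages (it assumes the stock Windows PhysicalDisk counters and English-locale counter names):

```python
# Sketch: sample the average disk queue length counters with Windows' typeperf.
# Counter names assume an English locale; adjust them for localized systems.
import subprocess

counters = [
    r"\PhysicalDisk(_Total)\Avg. Disk Read Queue Length",
    r"\PhysicalDisk(_Total)\Avg. Disk Write Queue Length",
]

# One sample per second, 30 samples, printed as CSV to stdout.
subprocess.run(["typeperf", *counters, "-si", "1", "-sc", "30"], check=True)
```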

This is what happens when I boot up a Windows 8 basic install virtual machine under a VMware host. Data represent the average queue depth since the last sample. Here, 1 sample = 1 second:

[Image: Performance Monitor graph of average disk queue depth while booting the Windows 8 VM]


When booting up the VM, the average read queue depth reached 8 in one second, although instantaneous values might have been higher in this case.
I have a relatively fast SSD, a SanDisk Extreme II 480GB.

This is from a different virtual machine. I booted up Ubuntu 14.04, then made it download and apply 100MB of patches. Then I restarted the OS from the VM:

[Image: Performance Monitor graph of average disk queue depth during the Ubuntu 14.04 VM boot, update, and restart]


During normal (consumer) usage the queue depth doesn't stay very high for long, but it can reach relatively high values in bursts (not sustained). I think this is especially true if the SSD is not very quick.

However, keep in mind that native SSD performance will be better than performance under a VM, so queues will likely be shorter with actual (native) usage.
 