Hard drives aren't sexy.
That statement sums up in a nutshell the reason we have so few sources for reliable information on hard drive performance. In contrast to the latest crop of high performance video chipsets or CPUs, which have hardware reviewers swooning like teeny boppers, hard drives seem relegated to an esoteric corner of the web called StorageReview. No other site has been able or willing to match their steady output of competent reviews or their doggedly consistent methodology.
But like anyone at the top of their game, StorageReview is a little set in their ways. For one thing, their never-wavering choice of only two benchmarks to represent the entire performance spectrum of every hard drive released in the past 18 months has made their results somewhat difficult to swallow. Many hardware enthusiasts, accustomed to the reams of disparate and complementary benchmarks found in typical CPU and video reviews, find it difficult to accept that ZDNet Winbench 99 and Intel IOMeter are the final say on the performance of a modern hard drive. Further complicating matters is the fact that Winbench and IOMeter often contradict each other; the Maxtor DiamondMax 60 Plus, for instance, gets top billing on Winbench but looks fairly ordinary under IOMeter.
Actually, StorageReview's extensive performance database contains four useful measures of disk performance:
1) IOMeter indices
2) Winbench 99 Disk Winmarks
3) Access times
4) STR measurements
Of these, two are synthetic, one is real-world, and one (IOMeter) is a flexible hybrid benchmark that simulates real world usage according to user-defined access parameters. Ideally, we could assign a relative weight to each benchmark to reflect its correlation to typical disk performance. By using a standard formula to average the data, we would be close to achieving that magic bullet of performance comparisons -- the single number that tells the whole story. But which benchmarks are the most important, and which are the least? The following excerpts from StorageReview's Windows 2000 Testbed Disclosure Statement are probably our best guide:
(Please excuse the length -- skip it if you're anxious for the good stuff.)
"The best tests, of course, are common operations performed in a given application set. Doing so, again with human reflex/stopwatch is not as easily done as folks initially perceive. . . Macroing a standardized set of applications with a competent program for playback in the same sequence under the same conditions would be the best way to measure total system performance. . . while far from perfect, WinBench 99's Disk WinMarks are among the best-available scientific, standardized approaches to drive testing. . .
Over the last two years, we've used over 90 ATA and SCSI drives in our personal systems. We can attest through this sheer experience that performance and responsiveness as a whole correlate much more to WinBench than it does to file copies or other so-called "real world" measures.
That said, there are some legitimate concerns raised that WinBench, while accurate for testing the application workload it purports to measure, provides too light of a load on the system to represent performance on a more general basis. Further, as a given release of WinBench ages, manufacturers become better at "tuning" drive performance to reflect high numbers without a corresponding increase in actual performance. . . WinBench 99 is our "old faithful," but is getting a bit gray behind the ears. We've noticed certain instances (say, the Maxtor DiamondMax Plus 40 vs the Seagate Cheetah 36LP) where scores reported don't seem to correlate to our personal experiences. Hence the need for another good, corroborative benchmark: Enter IOMeter. . .
IOMeter isn't the user-friendliest program around, but it's definitely the most flexible. Unlike WinBench 99, which uses access patterns created by real applications, IOMeter's patterns are entirely synthetic. Such synthetic nature, however, yields incredible flexibility. . .
IOMeter can spawn multiple workers; Intel recommends one worker per CPU. Each worker, in turn, can tax target(s) consisting of either an unpartitioned "physical disk" or partition(s) within a disk. Each worker must also be assigned a specific "access pattern," a series of parameters that guide the worker's access of a given target. . .
Generally speaking, it's evident that random access dominates typical workstation usage (and is thus the principal reason ThreadMark has been jettisoned). Even WinBench corroborates when pressed with the issue. The question is, how much randomness is sufficient?
Though the loading of executables, DLLs, and other libraries are at first a sequential process, subsequent accesses are random in nature. Though the files themselves might be relatively large, parts of them are constantly being sent to and retrieved from the swapfile. Swapfile accesses, terribly fragmented in nature, are quite random. Executables call other necessary files such as images, sounds, etc. These files, though they may represent large sequential accesses, consist of a very small percentage of access when compared to the constant swapping that occurs with most system files. Combined with the natural fragmentation that plagues the disks of all but the most dedicated defragmenters, these factors clearly indicate that erring on the side of randomness would be preferred. . .
Through recent tests in WinBench 99, however, empirical results indicated that STR had relatively little effect upon overall drive performance. Today, it should be clear that steadily-increasing transfer rates have in effect "written themselves out" of the performance equation. . . Judging from the above examples, it should be clear that random access time is vastly more important than sequential transfer rate when it comes to typical disk performance. Thus, the reordered "hierarchy" of important quantifiable specs would read:
1) Seek Time
2) Spindle Speed
3) Buffer Size
4) Data Density
Note that an important yet quantitatively-immeasurable factor is the nebulous drive electronics/firmware/algorithms package. Both of our benchmarks (WinBench 99 and IOMeter) are strongly influenced by these factors. . .
The often-contrary nature between StorageReview.com's new set of benchmarks begs the question: What's more important, WinBench 99 or IOMeter? In the past we've admitted how we use each drive we benchmark in our own personal systems to catch any cases where manufacturers are obviously padding WinBench results. In general, however, it's safe to say that we've been satisfied with the general correlation between WB99 results and perceived performance under normal, everyday use (something that would fall in between Light and Moderate Loads in our current IOMeter suite). . .
Another benefit in IOMeter's nature is its rather low-level nature. The fact that it doesn't possess a scripted access pattern to be played back (rather, it creates its own loads) makes it much more difficult for manufacturers to "tune" a drive for better scores without actually resulting in a proportional increase in actual usage. . .
The net result is that StorageReview.com from this point forwards will weigh IOMeter results, particularly of the Workstation Access Pattern, more highly than WinBench 99. This isn't to say that WinBench 99 is useless. . .
Again we come to the question of how to weight the four important benchmarks in StorageReview's database (IOMeter, Winbench 99, access time and STR). From the above excerpts, it is clear that IOMeter is a stronger benchmark than Winbench 99 for representing real world drive performance. And random access times are now considered much more important than STR. A glance at the access patterns StorageReview employs in their IOMeter tests will confirm what we already know from its results -- IOMeter is heavily access-time-dependent. Similarly, we find that Winbench 99 Winmarks scale much better with higher STRs than IOMeter indices do.
After much soul searching and personal meditation, I've decided that the following weighting system would be most indicative of real world hard drive performance:
IOMeter Indices: 45%
Winbench 99 Disk Winmarks: 25%
Access time: 20%
STR: 10%
This system follows two important guidelines: it conforms to the spirit of StorageReview's evaluation of important performance factors, and it stresses real world or hybrid benchmarks over purely synthetic measures.
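For readers who prefer to see the scheme as arithmetic rather than prose, here is the combination written out as a minimal sketch. The function and variable names are mine, not StorageReview's; the inputs are the normalized percentage indices described in the procedure below.

```python
# Weights from the scheme above, expressed as fractions of 1.0.
WEIGHTS = {"iometer": 0.45, "winbench": 0.25, "access": 0.20, "str": 0.10}

def weighted_score(iometer, winbench, access, str_index):
    """Combine four normalized indices (each on a 0-100 scale).

    Note: the tables later in the article average the four weighted
    terms (i.e. divide this sum by 4). That only rescales the result
    and has no effect on how the drives rank.
    """
    return (WEIGHTS["iometer"] * iometer + WEIGHTS["winbench"] * winbench
            + WEIGHTS["access"] * access + WEIGHTS["str"] * str_index)
```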
Applying this weighting system to StorageReview's database takes only a calculator and a little patience. Perhaps one day they and other hardware review sites will feature databases that allow one to customize the weighting of each benchmark into an aggregate index. Until then, we'll have to do it the old-fashioned way.
The procedure to follow is this:
Go to StorageReview's Comparison Page.
Under "Compare Drive Performance of up to Six Drives", leave everything at its default and select up to six drives for comparison, clicking "Compare IOMeter Indicies" to continue. Scroll down to the three horizontal graphs. The normalized results are seen in your browser's status bar when you hover over a drive's bar. Average each drive's normalized results for the three IOMeter access patterns. This is our final IOMeter index for each drive.
Click Back in your browser and scroll down to "Compare Winbench and Threadmark 2.0 Results". Select up to six drives for comparison, clicking "Let's Compare!" to continue. For each drive, average its two normalized Winmark scores. This will be our final Winbench 99 index.
For access time, normalize all the drives in the comparison by finding the lowest access time (e.g. 11.5ms) and dividing it by each drive's access time: 11.5ms / 13.1ms = 0.878, which gives an 88% normalized access time for the 13.1ms drive and 100% for the 11.5ms drive.
For STR, average the beginning and end scores for each drive (this assumes the STR decreases fairly linearly over the platter surface, which is more or less true). Then normalize the results, this time dividing each drive's average by the highest one. E.g. if the highest average is 31000, then 28000 / 31000 = 0.903, or 90% for the drive averaging 28000 and 100% for the one averaging 31000.
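The normalization in the last two steps is just a pair of ratios, so it can be scripted instead of punched into a calculator one drive at a time. The sketch below uses the example figures from the text; the beginning/end STR pairs are made up purely so that they average to 31000 and 28000.

```python
# Normalization as described above: for access time, lower is better,
# so the best (lowest) time becomes 100% and slower drives score less.
def normalize_access_times(times_ms):
    best = min(times_ms)
    return [round(100 * best / t) for t in times_ms]

# For STR, average each drive's beginning and end rates, then scale so
# the highest average becomes 100%.
def normalize_str(begin_end_pairs):
    averages = [(begin + end) / 2 for begin, end in begin_end_pairs]
    best = max(averages)
    return [round(100 * avg / best) for avg in averages]

print(normalize_access_times([11.5, 13.1]))              # [100, 88]
print(normalize_str([(36000, 26000), (33000, 23000)]))   # [100, 90]
```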
Once we have the four final benchmark indices for each drive, we simply multiply each by its corresponding weight and average the four weighted values (a weighted sum scaled by one quarter, which leaves the rankings untouched) to arrive at a Total Performance Index for each drive. To start us off, I've done this for six of the most popular ATA/100 hard drives available today:
Drive            IOMeter Index   Winbench 99 Index   Access Time Index   STR Index
Quantum LM       98.3            78.5                100                 77
IBM 75GXP        99              92.5                93                  93
WD 400BB         88              99                  83                  93
Quantum AS       86.3            91                  87                  91
Maxtor 60Plus    84              98.5                88                  100
Maxtor 45Plus    84.3            92                  89                  89
Ranking of Total Performance Indices with corresponding PriceWatch quotes for 30 GB drives:
(according to 45% IOMeter, 25% Winbench, 20% Access Time and 10% STR)
1) IBM Deskstar 75GXP              23.89    $119
2) Quantum Fireball LM Plus        22.89    $106
3) Western Digital Caviar 400BB    22.56    $138 (40 GB only)
4) Maxtor DiamondMax 60 Plus       22.51    $119
5) Quantum Fireball AS Plus        22.02    $115
6) Maxtor DiamondMax 45 Plus       21.91    n/a
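For anyone who wants to check the arithmetic or try a different set of weights, the ranking above can be reproduced from the per-benchmark indices in a few lines. This is only a convenience script: the data is copied from the table above, and the index follows the convention already described (weight each score, then average the four weighted terms), which is how the 75GXP comes out at 23.89.

```python
# Reproduce the Total Performance Index ranking from the table above.
# Score order per drive: IOMeter, Winbench 99, access time, STR.
WEIGHTS = (0.45, 0.25, 0.20, 0.10)

DRIVES = {
    "IBM Deskstar 75GXP":           (99.0, 92.5, 93, 93),
    "Quantum Fireball LM Plus":     (98.3, 78.5, 100, 77),
    "Western Digital Caviar 400BB": (88.0, 99.0, 83, 93),
    "Maxtor DiamondMax 60 Plus":    (84.0, 98.5, 88, 100),
    "Quantum Fireball AS Plus":     (86.3, 91.0, 87, 91),
    "Maxtor DiamondMax 45 Plus":    (84.3, 92.0, 89, 89),
}

def total_performance_index(scores):
    # Weight each normalized score, then take the mean of the four
    # weighted terms (the same convention used for the table above).
    weighted = [w * s for w, s in zip(WEIGHTS, scores)]
    return sum(weighted) / len(weighted)

for name, scores in sorted(DRIVES.items(),
                           key=lambda item: total_performance_index(item[1]),
                           reverse=True):
    print(f"{name:30} {total_performance_index(scores):.2f}")
```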
What information can we glean from this kind of holistic performance comparison of currently competing hard drives? Two things stand out. First, StorageReview's "gut" feelings about the drives are extremely accurate. Their leaderboard featured, and continues to feature, the IBM 75GXP, which has the highest performance index under this system. The runner-up was previously the Quantum LM+, which had the second-highest index, but these days the LM+ is hard to find, so StorageReview fell back to the next drive in line, the WD 400BB.
Second, and more importantly, modern hard drives of the same generation are remarkably similar performers. This performance index shows only a 9% difference between the slowest drive, the DiamondMax 45+, and the leader of the pack, the 75GXP. This leads to a buying decision focused mainly on price and availability. Assuming one can still find the Quantum LM+ for its PriceWatch list price, it is by far the best value. If not, the IBM 75GXP is, pleasantly, the best buy and the best performer.
Modus