Want to be part of AnandTech Storage Suite 2013? Read this

Hellhammer

AnandTech Emeritus
Apr 25, 2011
701
4
81
I'm working on a new trace for AT Storage Bench 2013 and would appreciate some help. My idea is to collect IO traces from as many sources as possible, and I think reaching out to our readers will provide a variety of different traces from different users, which is ultimately the best scenario. There are as many workloads as there are users, so the more traces we have, the better picture we'll have of how people use their computers. I would specifically appreciate help from video and audio professionals and others who have more unique workloads (I can't recreate those on my own, but ALL help is appreciated, regardless of your workload).

We can incorporate the data you send us into our tests, for example by learning more about the IO size and queue depth distribution of real workloads. We can also run the trace you send us on any SSD, or combine multiple traces into one massive trace.

Interested? Read the instructions below!

1. Install the Windows Performance Toolkit. It may already be on your system (type "performance" into the Start menu search and see what it finds), but if it is not, just install it from the Windows 7 SDK:

http://www.microsoft.com/en-us/download/details.aspx?id=3138

You don't have to install the whole SDK; just select the Windows Performance Toolkit when the installer lets you customize the installation.
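
If you want to check first whether the toolkit is already installed (just a convenience check, not a required step), open a Command Prompt and see whether the xperf executable can be found:

Code:
where xperf

If it prints a path, xperf is already installed and on your PATH, and you can skip the SDK download.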

2. Reboot.

3. Run Command Prompt as an administrator.

4. To start tracing, use the following command:

Code:
xperf -on diageasy+file_io

You can put the system to sleep and the trace will continue running when it resumes. You can also run the trace for as long as you want (I recommend at least a couple of days unless you're recording a specific workload).

5. To stop the trace, use the command:

Code:
xperf -d tracename.etl

The tracename in the command can be whatever you like, although I suggest keeping it short and simple (trace1, trace2, etc.).

6. Convert the trace you just created to a CSV file by using the command:

Code:
xperf -i tracename.etl > tracename.csv

Where tracename is the name of the file you gave in step 5 (keep it the same throughout this process).
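
To put steps 4-6 together, a complete session looks like this (using trace1 as the name, which you can of course change):

Code:
xperf -on diageasy+file_io
rem ...use the computer normally for a few days...
xperf -d trace1.etl
xperf -i trace1.etl > trace1.csv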

7. The trace file should be in your own folder under C:\Users. Trace files can be huge (our Light trace generated a 5.5GB CSV file...), so use your favorite compression tool to compress the file (the data is highly compressible; the 5.5GB CSV file shrank to 250MB after WinRAR treatment). I prefer ZIP or RAR files, but I'm open to suggestions if there are better compression formats.
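
If you'd rather compress from the same Command Prompt, something along these lines should work, assuming WinRAR is installed in its default location (Rar.exe is the console tool that ships with WinRAR):

Code:
rem add the CSV to a new RAR archive
"C:\Program Files\WinRAR\Rar.exe" a trace1.rar trace1.csv

Any GUI compression tool that produces a ZIP or RAR file is equally fine.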

8. Once you have compressed the CSV file, upload it somewhere and send the download link to me at kristian@anandtech.com. Describe your usage briefly so I can better understand the workload we are dealing with. If you use multiple disks, please list all disks (use Windows Disk Management to see the disk numbers) and their main purpose (boot, media storage, etc.).
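
If you'd rather not open Disk Management, one quick way to list the disk numbers from the Command Prompt (this relies on WMI, which should be available on any stock Windows install) is:

Code:
wmic diskdrive get index,model,size

The Index column corresponds to the disk numbers that Disk Management shows.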

NOTE: Be aware that your trace may contain personal information. We will NOT share the data you send us with anyone, and we only use it for analysis purposes.

NOTE2: The trace files can end up being multiple GBs each, so make sure you have enough space on your C drive before you start tracing.
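
A quick way to check free space from the same elevated Command Prompt (just a convenience; checking in Explorer works too):

Code:
fsutil volume diskfree c: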

I think this is all. If you have any questions or problems, just fire away and I'll try to help.
 

Ao1

Member
Apr 15, 2012
122
0
0
Traces are flawed in my opinion. IPEAK splits I/Os above 128K during the trace and then replays the split I/O at a higher level within the I/O stack than would normally occur. This error is further compounded by reviewers using Service MB/s as an output metric from IPEAK, when the metric means nothing to end users (which is why Intel recommends not using Service MB/s for client workload analysis).

Regardless of the trace playback method I have been unable to find a trace that can be played back to faithfully reproduce the original I/O. Queue depths get distorted, especially if they are played back with idle times removed.

Why not ask SNIA for the analysis results of their I/O capture program?

http://snia.org/forums/sssi/wiocp

For client use, response times are what need to be benchmarked.
 

JellyRoll

Member
Nov 30, 2012
64
0
0
Can you provide instructions for those of us who wish to replay the trace ourselves?
 

Hellhammer

AnandTech Emeritus
Apr 25, 2011
701
4
81
Traces are flawed in my opinion.

There is no perfect way to test SSDs. Synthetic tests show how the drive performs in a certain situation, but real-life workloads often consist of a mix of various IO sizes and queue depths. Sure, running read/write tests at various IO sizes and QDs gives a decent idea of how the drive performs, but it doesn't show how the drive behaves when it's subjected to both reads and writes at multiple IO sizes and QDs simultaneously.

I know traces have their limitations, which is quite obvious because you're trying to mimic possibly weeks of IO activity within a short period of time. However, playing the trace back in real time would be far too slow for review purposes, as we often have less than a week to review the drive before the NDA is lifted.

I have played around with hIOmon but I haven't really found a way to utilize it for benchmarking. It could be used to trace the IO for specific tasks (like loading games), but we would be dealing with a limited number of scenarios, which might not build a big picture of the performance. There is also the OS variable added to the mix. I know some of you have asked us to do more tests with a file system in use, but it's tough because the OS really complicates things (IOs can be subject to caching, for example, so interpreting the results would be far more difficult because we can't know if something was due to the OS or due to the SSD). Quite a few IO tools only support testing at the device level as well.

Tests also need to be easily and accurately reproducible, so using hIOmon to monitor IO activity for, let's say, a week for each review would not be accurate, as my usage might be heavier one week and lighter the next.

My main goal with the traces is to study how people use their machines, so we can place more emphasis on certain IO sizes and queue depths in our tests. I'm not aiming to build a massive trace and say that it tells you everything about SSD performance.

I'm always open to suggestions on how to improve our tests, so your feedback is very welcome.

Can you provide instructions for those of us who wish to replay the trace ourselves?

Unfortunately, I'm not aware of a commercially available tool that would play back the ETL trace created by XPerf. The tool we use is under very strict NDA, so I can't share it.
 

JellyRoll

Member
Nov 30, 2012
64
0
0
I'm sure it is IPEAK, so there will be no TRIM during playback. There is some fuzziness as to the worth of these types of utilities that record and then play back without a filesystem.
 

Rubycon

Madame President
Aug 10, 2005
17,768
485
126
What about those of us using complex and not-so-typical storage scenarios?
I'm sure the smart host and InfiniBand-wired RAM SAN containers would produce "bursting at the seams" numbers. :biggrin:
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
Traces are flawed in my opinion.

I agree. For comparing similar SSDs, traces are almost useless, since the results depend critically on the peculiarities of the program used to record and playback the traces.

For example, both anandtech.com and techreport.com used traces to test the Samsung 840 Pro and OCZ Vector SSDs. If you look at the statistics for read/write mix, block size, and QD, both traces are similar. But anandtech.com has the 840Pro beating the Vector, and techreport.com has the Vector beating the 840Pro. Neither review compared their results on the trace to other reviews. When asked, techreport.com people were unable to explain why their result differs from anandtech.com, and I suspect anandtech.com people will also be unable to explain the difference.

Far superior to traces is to pick some common tasks that SSDs will be likely to be used for, and then simply run the tasks (in realtime) while monitoring the speed with an IO monitoring program (and probably a stopwatch as well, until you have enough experience with the IO monitoring program that you are certain it is accurate and dependable).

If you feel you must use traces, then I suggest using a sampling of several short and simple realtime traces. Similar to the previous suggestion, just pick some common tasks, and record traces of them, then play them back. Since they are realtime, it eliminates the flaw of playing back hours or days of activity in minutes (a flaw which allows the SSD less time for GC, and probably unrealistically increases the QD depending on the playback software). Since the traces are short, you will have time to play them back in realtime during the review period. Since the traces are simple, it will be easier for everyone to interpret the results, and people can pick and choose the traces that are most important to them when evaluating SSDs.

But I still think actually running a number of common tasks and recording the performance is the way to go. That way you are not held hostage to the flaws of your trace recording and playback software.
 

JellyRoll

Member
Nov 30, 2012
64
0
0
I agree, there are just so many uncontrollable aspects.
None of these tools, such as IPEAK or SWAT, are under any type of NDA, and by using 'secret' programs readers cannot check whether the results are accurate. This is probably just an attempt to keep other sites from copying testing procedures, but in reality, if readers could test for themselves, they would merely see that these tests aren't reproducible.
There are too many factors, especially idle times, etc.
Testing without a filesystem is also removing TRIM from the equation. This is one of the most important aspects of SSD performance, yet it is not included in the testing.
Somewhere along the line someone needs to stop and think "Hey, recording two weeks of workload and playing it back in 8 minutes doesn't make sense."
 

Ao1

Member
Apr 15, 2012
122
0
0
My main goal with the traces is to study how people use their machines, so we can place more emphasis on certain IO sizes and queue depths in our tests.

That is why I suggested contacting SNIA. They are working on a white paper to cover just that.

IPEAK makes some very strong recommendations on how to analyse the trace and report results.

In the data server scenario, the important fact is that the benefit to the user is the amount of I/O activity that can be performed in a given amount of time. Time is fixed and amount of work done is variable.

The opposite occurs with client activity. A client user who upgrades to a drive that is 10 times faster in MB/s will not read 10 times more e-mail, open 10 times as many applications, or listen to 10 times as much music. We have measured I/O usage for systems with HDDs and SSDs and seen little difference. Users are thrilled by switching to an SSD, but that's because they experience fewer delays, not because they perform proportionately more I/O activity. For clients, amount of work is fixed and the time is variable, not the other way around. And that reversal means that it is the time metric that is proportional and the MB/s metric which is distorted.

Excluding the fact that IPEAK alters I/O whilst recording, it is response times, not service time MB/s, that should be the focus of the trace results if you want to show a client-based reader which SSD is fastest.

It is possible (and quite easy) to isolate specific tasks and benchmark performance with hIOmon.

I guess it depends on what you want to achieve: demonstrate which SSD is technically the fastest, or demonstrate which is fastest for real-world I/O.
 

Hellhammer

AnandTech Emeritus
Apr 25, 2011
701
4
81
But I still think actually running a number of common tasks and recording the performance is the way to go. That way you are not held hostage to the flaws of your trace recording and playback software.

I definitely agree, real world tasks tell the most about how the drive really performs. The only problem I have is coming up with real world tests that are easily reproducible and intensive enough to show differences between SSDs. Application install time (something big like a game) and an anti-virus scan are the first that come to my mind. Maybe even application launch time as well, though that would have to be several apps and probably traced one way or the other as well (launch apps in real time but then use the trace to look at performance, no trace playback involved).

I don't think tests like Windows boot time or file copying will be useful, though I'm guessing we can all agree on that.

If there are any specific tasks you would like to be tested, fire away. I'm fairly limited to my own usage, which to be honest isn't exactly heavy. Especially something involving video/photo/audio would be great as personally I have zero experience with the software used in those fields. I don't think we'll totally get rid of trace based tests but real world tests would of course be a great addition.
 

Ao1

Member
Apr 15, 2012
122
0
0
Here is an example to compare the results from the Anandtech Heavy Workload 2011 - Avg Read Speed from one of Anandtech's reviews. The Samsung 840Pro comes out with a score of 154 and the Crucial M4 comes out with a score of 101.4. The 840Pro has a score that is 51.84% higher than the Crucial M4. You could be forgiven for thinking that the M4 read performance sucks if you went by those results.


[Image: AnandTech Heavy Workload 2011 - Avg Read Speed chart]

Let's now compare results from Crysis 2 with hIOmon using an onboard LSI controller. The Samsung 840Pro has a DXTI score of 285.878. The Crucial M4 has a score of 270.392. The 840Pro has a score that is 5.72% higher than the Crucial M4.

The DTS metrics are focused on burst I/O activity. Notice that the DTSs above 128 KiB are identical. Even if I use RAID 0 x 8, the max MB/s for 1 MiB & 2 MiB DTS remains the same.

Those burst I/O DTS results are based on sequential read I/O and the M4 really shines with that workload.

[Image: Crucial M4 hIOmon results]

[Image: Samsung 840 Pro hIOmon results]
 

Hellhammer

AnandTech Emeritus
Apr 25, 2011
701
4
81
Here is an example to compare the results from the Anandtech Heavy Workload 2011 - Avg Read Speed from one of Anandtech's reviews. The Samsung 840Pro comes out with a score of 154 and the Crucial M4 comes out with a score of 101.4. The 840Pro has a score that is 51.84% higher than the Crucial M4. You could be forgiven for thinking that the M4 read performance sucks if you went by those results.

Keep in mind that the Heavy Workload isn't just sequential reads and writes. Around 30% of the read IOs are sequential; the rest vary (some are more random than others). The 840 Pro is over 70% faster at 4KB random reads (QD3), which is part of the reason for its high average read speed in the Heavy suite:

[Image: 4KB random read (QD3) comparison chart]


The m4 has fairly good sequential read performance, but its random IO performance is lacking.
 

Ao1

Member
Apr 15, 2012
122
0
0
And how important are 4K random reads for read-intensive tasks? ;) (And how often are writes/reads aligned in real-world I/O?)

There is one other important issue about trace results to add to the list - compression.

Below are Crysis results for a HyperX drive (same capacity as the 840Pro and M4). Real-world I/O performance sucks for SF-based drives, and it is not much better for writes, which can rarely be compressed in client-generated I/O patterns; yet an SF-based drive comes in second in the chart I linked earlier.

So major issues with traces include:

- The trace modifying the I/O workload
- The app used to play back the trace modifying the I/O workload, resulting in:
  - No idle time to allow GC to work (assuming idle time periods are removed)
  - No TRIM
- Inflated queue depths
- A lack of clarity on how the playback impacts compressibility
- A lack of clarity on the workload used for the trace, and the difficulty in separating burst periods from low-use periods

[Image: HyperX Crysis 2 hIOmon results]
 

Hulk

Diamond Member
Oct 9, 1999
5,155
3,769
136
This discussion is very interesting and honestly somewhat over my head. As far as real world situations that stress the drive I can only think of a few situations that seem to stress my SSD:

- Opening 20 or more large TIFF files in Photoshop. That's something I often do after converting them from RAW to TIFF, and it always takes quite a while.
- Performing an incremental backup using FreeFileSync. This is often a mix of very small and very large files. Actually just backing up the contents of a hard drive filled with many large and small files.
- After loading a video file into Sony Vegas Pro, waiting for the waveform to build.
- Building an m2ts file using ClownBD from a BD which has been ripped.
- Opening applications such as CorelDraw, Sony Vegas Pro, Presonus Studio One, and starting Windows. I know it has been stated they aren't good metrics but they are things that I constantly wait on, even with an SSD.

As I wrote above, I'm not a storage expert, and when I added an SSD to my system most of my drive-based delays simply "went away." These are the ones which have remained. I do quite a bit of audio and video work using Presonus Studio One 2 and Sony Vegas Pro 10, and honestly, besides building that audio waveform in Vegas, these applications aren't drive bottlenecked. Preview and rendering performance is what is needed, and that is only helped with more processing power.
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
I definitely agree, real world tasks tell the most about how the drive really performs. The only problem I have is coming up with real world tests that are easily reproducible and intensive enough to show differences between SSDs. Application install time (something big like a game) and an anti-virus scan are the first that come to my mind. Maybe even application launch time as well, though that would have to be several apps and probably traced one way or the other as well (launch apps in real time but then use the trace to look at performance, no trace playback involved).

I don't think tests like Windows boot time or file copying will be useful, though I'm guessing we can all agree on that.

Actually, I do not agree with that last statement. I see no urgency to choose only tasks that will differentiate SSDs. Sure, it is helpful to have some tasks that differentiate SSDs. But most important is to have a large selection of tasks that are of interest to people reading the review. If some of those tasks show that most SSDs are the same speed, then that is actually a useful result -- people will know that if those tasks are important to them, then they should select an SSD using criteria other than performance (quality, price, etc.)

I think that it would be useful to just go through a common SSD computer install with a stopwatch.

Install Windows on the SSD and record the time taken. Install the applications that you are going to use, recording the time taken. Copy over some data that you are going to use, recording the time taken (you could have a machine with 32GB of RAM and copy the data from a RAM drive, so this could be one of your differentiating tasks, even if it is not especially realistic for most users). Always copy enough data to fill the SSD up to some fixed amount, say 75% (probably a common amount for many SSD users).
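
As a rough sketch of how the data-copy step could be timed from a batch file (the drive letters and folder names are placeholders; robocopy and the %time% variable are standard on Windows 7):

Code:
rem copy the test data set and note wall-clock time before and after
echo Start: %time%
robocopy R:\testdata D:\testdata /E /NJH /NJS /NDL /NFL
echo End: %time%

Subtracting the two timestamps gives the elapsed time, though a stopwatch works just as well.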

Boot the computer and record the time taken with a stopwatch (of course, do it several times, I'd say 5 times, and give the minimum, average, and maximum times). I know there are programs like bootracer that do this automatically, but I'm not sure I trust the accuracy of those programs. A stopwatch may be less precise, but anyone with decent reflexes can get within a fraction of a second, and that is more than good enough when measuring boot times (and you'll be showing min, avg, max so it will be apparent how much variation there is).

Have a script that launches several applications in parallel, some with documents to load, and record the time taken.
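
A minimal sketch of such a launcher script, with the application paths and document names made up purely for illustration (start returns immediately, so the applications load in parallel):

Code:
rem launch several applications in parallel; "start" does not wait for them to exit
echo Start: %time%
start "" "C:\Program Files\Adobe\Photoshop\Photoshop.exe" "D:\docs\big.tif"
start "" "C:\Program Files\Microsoft Office\Office14\WINWORD.EXE" "D:\docs\report.docx"
start "" "C:\Program Files\Mozilla Firefox\firefox.exe"
echo Launch commands issued: %time%

The tricky part is deciding when everything is "done"; you would still need a stopwatch or the IO monitor to tell when the last application has finished loading its document.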

Similar to booting, but hibernation. Have several programs in memory (with documents loaded) and time how long it takes to hibernate the computer, and then how long it takes to return from hibernation.

Record starting times and level loading times for several games (I'm not a gamer, so I cannot suggest specifics, but it should not be hard to find people to give suggestions for those tasks).

For video processing, just try some basic storage-bound tasks. For example, use MKVExtract and MKVMerge (part of MKVToolnix) to split the parts of a big video file. Then merge the parts back together. If you don't have a convenient MKV file, just use MakeMKV to rip a blu-ray to get your MKV file (probably you should have a couple sizes of MKV files, a smaller one for 128GB SSDs, maybe a DVD rip, and a larger one, bluray rip, for 256GB and 512GB SSDs).
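
For what it's worth, the split and re-merge can be scripted with mkvmerge alone; the file names below are placeholders and the number of parts depends on the source file, so treat this as a sketch rather than the exact commands:

Code:
rem split a big MKV into roughly 1GB pieces
mkvmerge -o part.mkv --split size:1G big.mkv
rem append the pieces back together into one file
mkvmerge -o rejoined.mkv part-001.mkv + part-002.mkv + part-003.mkv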

Perhaps someone can come up with a common database task. I don't use Outlook, but maybe someone can suggest a task that takes a while, something involving processing a lot of Outlook data (email, appointments, etc.) and is storage-bound.

I've never tried to do a Windows system image saved to the same disk that you are imaging, but if Windows will let you do that, it might be a decent test. More realistic would be to system image a separate disk to the SSD under test, but then you will need to keep a separate system disk around to boot from and to serve as the source for the test. Could also use the SSD under test as the source for a system image, and save to a fast HDD like a Velociraptor.
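
If the Windows-image route works out, the built-in wbadmin tool can be run from an elevated Command Prompt. Something along these lines should do it on Windows 7 (D: is a placeholder for the target disk, and I haven't verified every switch):

Code:
rem image the system volume (C:) to a separate target disk (D:)
wbadmin start backup -backupTarget:D: -include:C: -allCritical -quiet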

I often find myself staring at the screen when I manually run a Windows update, after the files have been downloaded. Windows seems to take a long time to actually install the updates. I'm not sure if it is storage-bound, CPU-bound (lots of decompression?), or just software-bound (lots of pauses built-in to the program). I'm having a hard time thinking of a way you could make a repeatable test out of Windows update, though. I guess you could figure out where it saves its update files after it downloads them, and then keep a copy of some to use. But then you would be stuck with that version of Windows for the benchmark. Maybe this is one of the cases where a trace might be useful. After installing Windows from DVD, record a trace of Windows update running, then you can play that trace back as a benchmark for SSDs.
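
A sketch of how that capture could look, reusing the xperf commands from the first post (the .msu file name is just a placeholder; wusa is the built-in standalone update installer):

Code:
rem start the kernel trace (same flags as in the first post)
xperf -on diageasy+file_io
rem install a standalone update package silently
wusa.exe C:\updates\example-update.msu /quiet /norestart
rem stop the trace and write it out
xperf -d windowsupdate.etl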

As you say, run a virus scan. Probably one of the last tasks you should run, so that the SSD is in a used state. Again, run it several times and report min, avg, max (since OS caches may speed things up on successive runs...maybe reboot between runs to minimize that effect).
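
If Microsoft Security Essentials happens to be the scanner, its command-line front end makes the run scriptable; the path below is the usual default install location, but it may differ on your system:

Code:
rem full scan (ScanType 2) via the MSE command-line tool
"C:\Program Files\Microsoft Security Client\MpCmdRun.exe" -Scan -ScanType 2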
 

Coup27

Platinum Member
Jul 17, 2010
2,140
3
81
Importing and exporting mailboxes is probably the most taxing thing you can do in Outlook. My boss' mailbox is about 15GB and before we moved from POP3 to Exchange it used to take about an hour to transfer his mailbox and import it into his new Outlook. I can probably provide a ~15GB Outlook PST file for you to play with if you wanted to go down that route.
 

Hulk

Diamond Member
Oct 9, 1999
5,155
3,769
136
jwilliams4200,

I like your suggestions a lot. It would be nice to get away from purely synthetic tests, or synthetic tests that try to simulate real-world situations, and just physically time the real-world situations that actually cause computer delays. Just pick 10 or so different tasks and get times on those, figure out a weighting, and create an overall score. Of course, provide the actual times for each task as well. We can piece together the rest of the puzzle by using AS SSD or some other such benchmark.
Let's not overcomplicate this.

Remember the old Winstone test where it actually loaded and ran application scripts? It was THE benchmark for computer speed for many years, and for good reason: it used actual applications. We need something like Winstone that actually times boot-up time, loading applications, running an application script if it can be created to be storage-based (otherwise just open and close the app), and then finally shutting down the computer and saving the results in a log file.
 

Ao1

Member
Apr 15, 2012
122
0
0
From an IBM paper, which is useful in that it separates throughput tasks and I/O-intensive tasks.

[Image: table from an IBM paper separating throughput tasks and I/O-intensive tasks]

It's not so easy to find heavy storage I/O tasks that do not end up being CPU- or GPU-bound. Bottom line is that storage is no longer the bottleneck.

Exchange Server
http://technet.microsoft.com/en-us/library/bb125019(v=EXCHG.65).aspx
Estimated IOPS per User for User Type × Number of Users
In this example, 0.75 IOPS × 2,000 mailboxes = 1,500 IOPS.
Using a conservative ratio of two reads for every write (66 percent reads to 33 percent writes), you would plan for 1,000 read I/O and 500 write I/O requests per second for your database volume. Every write request is first written to the transaction log file and then written to the database. Approximately 10 percent of the total 1,500 IOPS seen on the database volume will be seen on the transaction log volume (10 percent of 1,500 is 150 IOPS); 500 write I/O requests will be written to the database.
 

Ao1

Member
Apr 15, 2012
122
0
0
jwilliams4200,

I like your suggestions a lot. <Snip>

Remember the old Winstone test where it actually loaded and ran application scripts? It was THE benchmark for computer speed for many years, and for good reason: it used actual applications. We need something like Winstone that actually times boot-up time, loading applications, running an application script if it can be created to be storage-based (otherwise just open and close the app), and then finally shutting down the computer and saving the results in a log file.

+one
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
I definitely agree, real world tasks tell the most about how the drive really performs. The only problem I have is coming up with real world tests that are easily reproducible and intensive enough to show differences between SSDs

The corollary of this statement is that in the majority of real-world tasks there isn't a speed difference between SSDs.

Therefore perhaps benchmarking desktop SSD speed is now an obsolete measurement.

Remember when web sites used to benchmark USB speed?
 

Hulk

Diamond Member
Oct 9, 1999
5,155
3,769
136
The corollary of this statement is that in the majority of real-world tasks there isn't a speed difference between SSDs.

Therefore perhaps benchmarking desktop SSD speed is now an obsolete measurement.

Remember when web sites used to benchmark USB speed?


There is definitely something to be said for the fact that most SSDs are so much faster than mechanical drives that they basically remove one of the last bottlenecks from the computer. But as with just about anything, there will always be competition among manufacturers to provide a faster, cheaper, more reliable product than their competitors. And to determine which is indeed better, we need some type of testing.

As many tasks that used to take seconds have now been reduced to fractions of a second, it's more about finding those applications and tasks that are most demanding on the disk subsystem, and finding appropriate performance metrics.

As I wrote above, I would prefer a Winstone-type test that begins with the computer booting, opens and closes applications, runs disk-intensive scripts in those applications which support them, and finally shuts down the PC. Sure, manufacturers might begin to "tune" their drives to the test, but if it's a good test and representative of the workflow of many users, then that's a good thing.

In addition, since we would have times or MB/sec ratings for each part of this test, it would be relatively easy for end users to select drives which perform well in the tasks/applications they use most often.
 

Hellhammer

AnandTech Emeritus
Apr 25, 2011
701
4
81
First off, thanks everyone for the input! I was out of town for a few days and didn't have proper Internet access, hence the delay in my response.

Actually, I do not agree with that last statement. I see no urgency to choose only tasks that will differentiate SSDs. Sure, it is helpful to have some tasks that differentiate SSDs. But most important is to have a large selection of tasks that are of interest to people reading the review. If some of those tasks show that most SSDs are the same speed, then that is actually a useful result -- people will know that if those tasks are important to them, then they should select an SSD using criteria other than performance (quality, price, etc.)

I understand what you're saying, but I still think testing boot time is a bit redundant, though on the other hand it's very simple to test. The biggest issue I see, however, is the fact that Anand and I have slightly different test setups. Drive performance is only one part of the total boot time (BIOS load time, number of devices connected, etc. have an impact), so I don't think we could produce any reliable results.

I know the situation is not exactly ideal and having systems that are a perfect match would obviously be the best way. I hope to be able to solve this in the future when we upgrade our testbeds though (currently my testbed is also my main machine, definitely not ideal).

I think that it would be useful to just go through a common SSD computer install with a stopwatch.

Not a bad idea, but I don't think it's really needed because usually people do this only once. I'm also fairly certain that the ODD would be the biggest bottleneck, resulting in very similar scores for all SSDs. I wouldn't have a problem recording the install time if it weren't so slow; cloning is just so much faster (again, I'm trying to create tests that are rather quick to run because these would most likely be in addition to our existing tests).

Have a script that launches several applications in parallel, some with documents to load, and record the time taken

Yeah, this is what I had in my mind too.

Record starting times and level loading times for several games (I'm not a gamer, so I cannot suggest specifics, but it should not be hard to find people to give suggestions for those tasks).

Also had this in my mind.

For video processing, just try some basic storage-bound tasks. For example, use MKVExtract and MKVMerge (part of MKVToolnix) to split the parts of a big video file. Then merge the parts back together. If you don't have a convenient MKV file, just use MakeMKV to rip a blu-ray to get your MKV file (probably you should have a couple sizes of MKV files, a smaller one for 128GB SSDs, maybe a DVD rip, and a larger one, bluray rip, for 256GB and 512GB SSDs).

Thanks for this one. I'll play around with it and see how it behaves.

I've never tried to do a Windows system image saved to the same disk that you are imaging, but if Windows will let you do that, it might be a decent test. More realistic would be to system image a separate disk to the SSD under test, but then you will need to keep a separate system disk around to boot from and to serve as the source for the test. Could also use the SSD under test as the source for a system image, and save to a fast HDD like a Velociraptor.

A separate system disk isn't a problem. I'll look into this.

I often find myself staring at the screen when I manually run a Windows update, after the files have been downloaded. Windows seems to take a long time to actually install the updates. I'm not sure if it is storage-bound, CPU-bound (lots of decompression?), or just software-bound (lots of pauses built-in to the program). I'm having a hard time thinking of a way you could make a repeatable test out of Windows update, though. I guess you could figure out where it saves its update files after it downloads them, and then keep a copy of some to use. But then you would be stuck with that version of Windows for the benchmark. Maybe this is one of the cases where a trace might be useful. After installing Windows from DVD, record a trace of Windows update running, then you can play that trace back as a benchmark for SSDs.

Windows updates are a big mystery to me as well; sometimes it feels like they take forever to complete. I think I'll need to use the same version of Windows anyway for the sake of accuracy (hence cloning would also be much easier). What I could do is download Service Pack 1 for Windows 7 and use it for benchmarking; that should work.

As you say, run a virus scan. Probably one of the last tasks you should run, so that the SSD is in a used state. Again, run it several times and report min, avg, max (since OS caches may speed things up on successive runs...maybe reboot between runs to minimize that effect).

That's what I thought too. There will also be more to scan after installation of all the apps and updates.

The corollary of this statement is that in the majority of real-world tasks there isn't a speed difference between SSDs.

Therefore perhaps benchmarking desktop SSD speed is now an obsolete measurement.

Well, you could say that about all benchmarks. CPU and GPU tests are also very specific, because the fact is that the majority of tasks are limited by something that cannot be made any faster: user input speed. No SSD, CPU or GPU will make you type or use the mouse/trackpad any faster.
 

jwilliams4200

Senior member
Apr 10, 2009
532
0
0
I'm also fairly certain that the ODD would be the biggest bottleneck, resulting in very similar scores for all SSDs.

Spoken like a philosopher. :|

The purpose of testing SSDs is not to assume how they will behave, but to test how they actually behave.

If all SSDs come out having similar times, then (as I said before) that is a useful result. People will know that for that task (and similar ones), SSD performance is not important.

But it needs to be tested. And Windows can be installed from DVD in significantly less than an hour.

You definitely need to be consistent with the computer that is used for testing SSDs. A number of interesting tasks also depend on the CPU (especially decompression), so if you test SSDs on different systems, then you should only compare results among SSDs tested on the same system.
 

Hellhammer

AnandTech Emeritus
Apr 25, 2011
701
4
81
If all SSDs come out having similar times, then (as I said before) that is a useful result. People will know that for that task (and similar ones), SSD performance is not important.

I don't see the point in doing tests where SSD performance is not important because it's just added workload for us that serves no real purpose. If SSD performance is not important in the task, then why should we use that task to test every SSD when in fact we would be testing some other component in our system?

We used to have tests like boot time, but they become irrelevant when all SSDs perform about the same. The drive is no longer the bottleneck, so testing it won't be useful because the next SSD won't be any faster at that task.