A tool for rigorously testing an SSD

avinatbezeq

Junior Member
Aug 4, 2009
2
0
66
I have a Samsung 960 Pro 1TB SSD that caused problems when running VMware Workstation Pro: the guest would occasionally freeze (rarely at first, then about twice a day). Replacing the 960 with a 970 Pro solved the problem.

Now the supplier has run some tests on the SSD, claims it's fine, and won't honor the warranty.

I'm looking for a tool that can thoroughly test the SSD and support my claim that it is faulty. Any ideas?
 

fralexandr

Platinum Member
Apr 26, 2007
2,244
188
106
www.flickr.com
There are various tools that give you detailed drive stats, including wear data, errors, and projected reliability/lifespan. I've personally used Speccy and SpeedFan.


If you're running Windows, you can also run chkdsk from a command prompt to see if there are any corrupted sectors/files.
 
Feb 25, 2011
16,788
1,468
126
The drives will keep track of things like I/O errors in their SMART data, and Windows keeps track of I/O problems in the Event Viewer logs. That's the data you'd need to establish that the drive was messing up, and it should be sufficient to get a warranty claim approved. But I/O timeouts are also very commonly associated with other things (like bad cables). So maybe the SSD wasn't the problem.

If you replaced the boot drive in your system, you presumably reimaged or re-set up the OS, reinstalled VMware, installed new drivers, etc. There are a lot of other potential issues that could have been resolved along the way.

If you just unload the parts cannon, you'll often accidentally fix problems without really knowing the root cause; that doesn't obligate vendors to subsidize the parts cannon.
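As a rough illustration of pulling that SMART data programmatically: recent smartmontools releases can emit JSON (`smartctl -j -a <device>`), and for NVMe drives the health log shows up under `nvme_smart_health_information_log`. Below is a minimal Python sketch, assuming that JSON layout. The function name `nvme_health_flags` and the choice of which fields to flag are made up for illustration, and a non-zero error-log count is only a hint, not proof of a bad drive.

```python
import json

def nvme_health_flags(report):
    """Return a list of warnings from a parsed `smartctl -j -a` report.

    Assumes the NVMe health log sits under the key
    'nvme_smart_health_information_log'. Missing fields are skipped.
    """
    log = report.get("nvme_smart_health_information_log", {})
    flags = []

    # Any set bit here means the controller itself is reporting trouble.
    warning = log.get("critical_warning", 0)
    if warning != 0:
        flags.append(f"critical_warning bits set: {warning:#04x}")

    # Unrecovered data integrity errors seen by the controller.
    media_errors = log.get("media_errors", 0)
    if media_errors > 0:
        flags.append(f"media_errors: {media_errors}")

    # Error-log entries can be benign; treat them as something to read, not a verdict.
    err_entries = log.get("num_err_log_entries", 0)
    if err_entries > 0:
        flags.append(f"num_err_log_entries: {err_entries} (inspect the error log)")

    # Spare capacity at or below the drive's own threshold.
    spare = log.get("available_spare")
    threshold = log.get("available_spare_threshold")
    if spare is not None and threshold is not None and spare <= threshold:
        flags.append(f"available_spare {spare}% <= threshold {threshold}%")

    # Vendor's own estimate of consumed endurance.
    used = log.get("percentage_used", 0)
    if used >= 100:
        flags.append(f"percentage_used: {used}%")

    return flags

# Example with a hand-written report (hypothetical values, not real drive output).
sample = json.loads(
    '{"nvme_smart_health_information_log": {"critical_warning": 0, "media_errors": 3}}'
)
for line in nvme_health_flags(sample):
    print(line)
```

To use it on a real drive, save the smartctl JSON output to a file, load it with `json.load`, and pass the result to the function. An empty list only means none of these particular counters tripped.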
 

avinatbezeq

Junior Member
Aug 4, 2009
2
0
66
Good thinking.

However, this was not my boot drive. I had upgraded VMware as well as the drivers and Samsung Magician PRIOR to replacing the SSD, while trying to locate the problem. Nothing helped. Replacing the SSD (using, AFAIK, the same cable) solved the problem.

BTW, English is not my first language. What does "unload the parts cannon" mean?
 
Feb 25, 2011
16,788
1,468
126
Good thinking.

However, this was not my boot drive. I had upgraded VMware as well as the drivers and Samsung Magician PRIOR to replacing the SSD, while trying to locate the problem. Nothing helped. Replacing the SSD (using, AFAIK, the same cable) solved the problem.

That does make the drive a pretty likely culprit. I'd still be curious what, if anything, Event Viewer showed while VMware was running w/ the old drive, though.

BTW, English is not my first language. What does "unload the parts cannon" mean?

It's a phrase common in auto repair - basically means replacing a bunch of stuff that might be the problem without being sure what actually is the problem, in the hopes that the problem will go away anyway.

There are pros and cons to the approach: if your time is expensive and your parts aren't (as is often the case with computers), it makes good sense. But you never get the satisfaction of actually knowing beyond a shadow of a doubt what the problem really was. :)

It's frowned upon in auto repair (really a bit of an epithet) because car parts are frequently expensive, and mechanics that rely on it are assumed to not actually know how to diagnose.
 

aigomorla

CPU, Cases&Cooling Mod PC Gaming Mod Elite Member
Super Moderator
Sep 28, 2005
20,841
3,189
126
Dave, the 960 and 970 are NVMe drives.
They run on PCI-E.
There is no real way he can check the health of said drives except possibly CrystalDisk, and even then it's not 100% accurate.
Also, if an NVMe drive has issues it's very, very difficult to debug, unless it's a major failure like the drive not even registering, as it can be a multitude of issues.

1. CPU
2. Board
3. PCI-E interface on either the board or on the NVMe itself.
4. RAM
5. NVMe controller...

Any one of these will shoot out errors, and it might not actually be the NVMe drive itself.

I've even had NVMe drives show slight issues because of a bent CPU pin in the socket, or a badly aligned CPU that required remounting.
These things are about as nitpicky as RAM sometimes.

lol....

The only way you're going to prove something is wrong with the SSD is by finding something reproducible, recording it on video, and submitting that with the RMA.

This isn't a magnetic drive which will shoot out unreads and uncorrectables.
 

Billy Tallis

Senior member
Aug 4, 2015
293
146
116
There is no real way he can check the health of said drives except possibly CrystalDisk, and even then it's not 100% accurate.
The only way you're going to prove something is wrong with the SSD is by finding something reproducible, recording it on video, and submitting that with the RMA.

There are fewer things that can go wrong with NVMe drives than with SATA drives, and fewer layers of abstraction involved. Your post is impressively wrong. As long as you aren't trying to debug issues with a NVMe drive using software that's 5+ years old and hasn't been updated to support anything other than SATA, it's not going to be any harder to figure out what's wrong with a NVMe drive than to do the same with a SATA drive.
 

aigomorla

CPU, Cases&Cooling Mod PC Gaming Mod Elite Member
Super Moderator
Sep 28, 2005
20,841
3,189
126
This is so false.
Again, since an NVMe drive uses PCI-E, you have more things that can go wrong than with a SATA drive, which uses just the SATA bus.

A CPU will not corrupt a SATA drive, while it can corrupt an NVMe drive by corrupting the PCI-E lane.
Motherboard BIOSes can also fubar an NVMe drive, which has been common in some cases, as the board needs to support M.2.

A SATA SSD either works or it doesn't; it's that simple.
Either the NAND is dead, or the controller is dead, or all your SATA ports are dead.

There is no debugging the board's PCI-E, reseating the CPU, or FLASHING a new BIOS, all of which can come up with an NVMe drive.
There is also no making sure you use the proper M.2, as if your cpu can not handle the lanes so your PCI-E slots are deactivated or run at which slots are deactivated when you run a PCI-E adapter.

PCI-E lanes are limited to begin with, and you can only debug them by moving said NVMe drive to another M.2 slot or getting a PCI-E adapter for it, vs. a SATA drive, which you can just move to another port, as all the ports are linked in most cases.

NVMe drives are also prone to overheating compared with a SATA SSD.
960 PROs are notorious for overheating and thermal throttling.

And there is no software other than CrystalMark, and even that you should take with a very big grain of salt, that reports on the condition of either a SATA SSD or an NVMe drive, because you can't test sectors like you could on a spinner, as you will eat up your writes.

Even vendor diagnostic hardware is flawed because, again, if your problem lies at the root, meaning your PCI-E lanes, it will spit out errors.

A SATA SSD is far easier to diagnose, as it's far easier to move to another system for testing than an NVMe drive.

If you think the software is accurate, why not share which tools the OP should use, so he can see whether they show anything wrong with his 960?
 

Billy Tallis

Senior member
Aug 4, 2015
293
146
116
Again, since an NVMe drive uses PCI-E, you have more things that can go wrong than with a SATA drive, which uses just the SATA bus.

SATA drives use a SATA link to an HBA (usually in the chipset) which then uses PCIe to communicate up to the CPU.

A CPU will not corrupt a SATA drive, while it can corrupt an NVMe drive by corrupting the PCI-E lane.

Any issue with PCIe is just as likely to affect the lanes going to your NVMe SSD as the lanes going to your SATA controller.
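To look at the PCIe side on its own, Linux exposes the negotiated link speed and width of each PCI device in sysfs (`current_link_speed`, `current_link_width`, `max_link_speed`, `max_link_width`). Here is a minimal Python sketch, assuming a Linux install or live USB and assuming that `/sys/class/nvme/nvme0/device` points at the drive in question; `link_status` is a hypothetical helper name. A mismatch is only a hint that the slot, lanes, or seating deserve a look, not a verdict on the drive.

```python
from pathlib import Path

def _read_attr(dev, name):
    """Read one sysfs attribute as a stripped string, or None if unavailable."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return None

def link_status(dev_dir):
    """Compare the negotiated PCIe link against what the device advertises.

    dev_dir is the PCI device directory in sysfs, for example
    /sys/class/nvme/nvme0/device (an assumed path; adjust for your system).
    """
    dev = Path(dev_dir)
    cur_speed = _read_attr(dev, "current_link_speed")
    max_speed = _read_attr(dev, "max_link_speed")
    cur_width = _read_attr(dev, "current_link_width")
    max_width = _read_attr(dev, "max_link_width")

    values = (cur_speed, max_speed, cur_width, max_width)
    known = all(v is not None for v in values)
    # Only call the link downgraded when every attribute could be read.
    downgraded = known and (cur_speed != max_speed or cur_width != max_width)

    return {
        "speed": (cur_speed, max_speed),
        "width": (cur_width, max_width),
        "known": known,
        "downgraded": downgraded,
    }

if __name__ == "__main__":
    status = link_status("/sys/class/nvme/nvme0/device")
    print(status)
```

The same directory layout applies to a SATA controller's PCI device, so the check can be pointed at either one.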

Motherboard BIOSes can also fubar an NVMe drive, which has been common in some cases, as the board needs to support M.2.

I'm not sure what you mean here. If a motherboard's firmware doesn't include NVMe support, it will simply not boot off a NVMe drive, but it won't damage the drive or any data stored on it.

There is no debugging the board's PCI-E, reseating the CPU, or FLASHING a new BIOS, all of which can come up with an NVMe drive.
There is also no making sure you use the proper M.2, as if your cpu can not handle the lanes so your PCI-E slots are deactivated or run at which slots are deactivated when you run a PCI-E adapter.

This bit's rather incoherent, but I think the only real issue you're referring to can be solved by reading the motherboard manual to see which PCIe lanes are routed to which slots.

PCI-E lanes are limited to begin with, and you can only debug them by moving said NVMe drive to another M.2 slot or getting a PCI-E adapter for it, vs. a SATA drive, which you can just move to another port, as all the ports are linked in most cases.

I'm not seeing a difference here. If a drive doesn't work with one connector on the motherboard, you try a different one. That's just as true of NVMe as SATA. The only complication is that you might end up buying a cheap passive PCIe to M.2 riser so that you can test a NVMe drive in the slot that you know works because you usually use it for your GPU.

NVMe drives are also prone to overheating compared with a SATA SSD.
960 PROs are notorious for overheating and thermal throttling.

Most NVMe SSDs when thermally throttling are still faster than SATA SSDs, and it takes very contrived circumstances to get a NVMe SSD to overheat enough for it to shut down entirely.

And there is no software other than CrystalMark, and even that you should take with a very big grain of salt, that reports on the condition of either a SATA SSD or an NVMe drive, because you can't test sectors like you could on a spinner, as you will eat up your writes.

CrystalDiskMark is a shitty tool for people who want quick and dirty numbers to compare without understanding what they're doing. It's basically at the bottom of the list of software tools I'd recommend for any kind of troubleshooting. And again, it sucks equally as a troubleshooting tool for SATA and NVMe SSDs.
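A benchmark measures speed, not correctness. What the OP is asking for is closer to a write-then-verify pass: write known data, read it back, and compare checksums. A minimal Python sketch of that idea follows. It is not any tool named in this thread, the function names `write_blocks` and `verify_blocks` are hypothetical, and it has a real limitation: unless the test file is larger than RAM or the OS caches are dropped between the two steps, the read-back may be served from the page cache rather than from the flash.

```python
import hashlib
import os

def write_blocks(path, blocks=64, block_size=1 << 20):
    """Write random blocks to path and return the SHA-256 digest of each block."""
    digests = []
    with open(path, "wb") as f:
        for _ in range(blocks):
            data = os.urandom(block_size)
            digests.append(hashlib.sha256(data).digest())
            f.write(data)
        # Push the data out of Python's buffer and ask the OS to flush it to the device.
        f.flush()
        os.fsync(f.fileno())
    return digests

def verify_blocks(path, digests, block_size=1 << 20):
    """Re-read path block by block and return the indexes of blocks that do not match."""
    bad = []
    with open(path, "rb") as f:
        for index, expected in enumerate(digests):
            data = f.read(block_size)
            if hashlib.sha256(data).digest() != expected:
                bad.append(index)
    return bad

if __name__ == "__main__":
    # Hypothetical file name; put it on the drive under test.
    target = "ssd_verify_test.bin"
    recorded = write_blocks(target)
    mismatches = verify_blocks(target, recorded)
    print("mismatched blocks:", mismatches)
    os.remove(target)
```

The defaults write only 64 MiB, which keeps wear negligible but also makes a cached read-back likely; a larger file tests more of the drive at the cost of more writes.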