I may need to level up on RAM troubleshooting

mikeymikec

In my experience, RAM problems tend to manifest in either:

1 - Nice and simple "refuses to boot / POST"
2 - Random crashes/freezes (Windows, apps, whatever)

A pass (or several) of say memtest86/+ is usually enough to narrow down the problem to a particular module.

On my own computer, I've had lingering stability issues for some time. One big cause has been identified as my old GeForce 750Ti (the main problem was intermittent issues coming out of S3 sleep, narrowed down by swapping it out for an older 6670 for a few months). However, when I wanted to play XCOM2 without wanting to stab myself in the eye repeatedly, I decided to upgrade to a 380X.

I've experienced a few crashes with XCOM2, most notably on the Avenger defence mission (it was the only mission where I could expect a crash and had to remember to save often). My thinking was that if it consistently happens on a particular mission then it stands to reason that it's the game, not my hardware. I've completed XCOM2 a couple of times and moved on to The Witcher 3. I bought it in April and have had seven 'crashes' by the looks of things (the game app and audio continue to run but Windows drops back to the desktop and nothing happens if I try to alt-tab into the game), and the entry in Reliability Monitor on each occasion has been 'video hardware error'. I've updated graphics drivers a few times during the course of playing XCOM2 and W3.

My thinking came back around to the memory because (as a result of an unrelated issue) I noticed once again that the BIOS reckons that two of my modules are DDR3-1333 and the other two are 1600. The older set (the ones it reckons are 1333) is in fact DDR3-1600 (and I set the BIOS to run them at 1600 myself). They're all Kingston HyperX 1600 modules, and I don't have any stability issues outside of games, ever (ever since getting rid of the 750Ti anyway). The '1333' set are identified on the modules as 1600, however they were sold as 1.65v modules (on the module itself it reckons 1.7 - 1.9v!). I'm just really surprised that the only time I've had any problems with these modules at 1.5v is in modern games. I've removed those modules as I can do without them (the remaining two give me 8GB RAM, the older modules gave me a total of 4GB). I removed the older modules on the morning of the 5th of May and I haven't had any issues in W3 since. CPU-Z reckoned that all the modules could be run at 1.5v. Confusion reigns.

However, as a result of all the troubleshooting leading to the removal of the 750Ti, all the modules have seen a lot of memtest86 passes, prime95 work, etc. I had also done a lot of testing of that card without the '1333' modules.

I just feel that I need a better memory testing method than "does a particular game crash", because unless I have a tonne of testimonies to say "I've played this game for a hundred hours with the 380X and it hasn't crashed once", then how am I supposed to have faith that it's not the game's fault?
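One rough way to put a number on "how long without a crash before I believe it's fixed" is sketched below. It assumes crashes arrive at random at a constant rate per hour of play (a Poisson model, which real crashes may well not follow), and the 70 hours of play time is a made-up placeholder, not a figure from my own logs. The function name is just for illustration.

```python
import math


def crash_free_hours_needed(crashes, hours_played, confidence=0.95):
    """Hours of crash-free play needed before a clean run is unlikely to be luck.

    Assumes a constant random crash rate (Poisson model). Under the old
    rate, the chance of seeing zero crashes in T hours is exp(-rate * T),
    so we solve exp(-rate * T) = 1 - confidence for T.
    """
    rate = crashes / hours_played  # observed crashes per hour
    return -math.log(1.0 - confidence) / rate


if __name__ == "__main__":
    # Seven crashes is from the post; 70 hours is a hypothetical placeholder.
    hours = crash_free_hours_needed(crashes=7, hours_played=70)
    print(f"Crash-free hours needed for 95% confidence: {hours:.1f}")
```

The point is only that the rarer the crash was to begin with, the longer a clean run has to be before it means anything.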
 

Sheep221

You won't develop a better testing method.
I once had an AMD-based Gigabyte motherboard [GA-MA790X-DS4] which always caused WoW to crash and restart the PC when specific graphics effects were enabled; only replacing the entire board with a completely different one with a different chipset fixed the issue. Some things just do not mix together, especially in gaming. Whether it's purely at the software level, the hardware level or a combination of the two, the chance of getting hit by compatibility issues is always there.
 

Dasa2

it seems to me that memtest is fairly good at finding ram errors but it has more difficulty with memory controller errors
sometimes games will just find the problem quicker than any stress tests
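for anyone wondering what a pattern test basically does, here is a minimal sketch of the write-then-read-back idea. it is not how memtest86 is actually implemented, and a user-space Python script goes through the OS and CPU caches, so it can't stand in for a real memory tester. the function names and the choice of patterns are just assumptions for illustration.

```python
def find_mismatches(buf, pattern):
    """Return the offsets where a byte in buf differs from the expected pattern."""
    return [offset for offset, value in enumerate(buf) if value != pattern]


def pattern_test(buf, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Fill buf with each pattern in turn, read it back, and collect mismatches.

    Returns a list of (pattern, offset) tuples; an empty list means every
    byte read back as written.
    """
    errors = []
    size = len(buf)
    for pattern in patterns:
        buf[:] = bytes([pattern]) * size  # write phase
        for offset in find_mismatches(buf, pattern):  # read-back phase
            errors.append((pattern, offset))
    return errors


if __name__ == "__main__":
    buffer = bytearray(1024 * 1024)  # 1 MiB scratch buffer
    print("mismatches found:", len(pattern_test(buffer)))
```

a loop like this hammers the cells in a very regular way, which is nothing like the access pattern of a game with the GPU and CPU both loaded.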
 

BonzaiDuck

I've seen the havoc wrought by faulty RAM sticks. I buy G.SKILL, might consider Corsair because a 16GB 4x4 kit came into my possession. G.SKILL kits always seem to configure easily to their spec, and they can be set to run at CMD=1 with at most tweaking the VCCIO/IMC by a few millivolts -- literally -- if needed at all. You just leave the vDIMM voltage set to spec, and use the spec timings under an XMP profile.

I've had G.SKILL sticks go bad; I've had many more Crucial Ballistix fail.

So in an era where many feel compelled to build systems with 32GB of RAM, my preferred first choice is to get 16GB 2x8 into two slots. Second choice would be 16GB 4x4 in four slots. But now I want at least another 2x2GB kit for the remaining two slots of four -- making at least 20GB RAM overall. I chose to add a 2x4GB kit instead.

The problem we face with this -- wouldn't know how speed attenuates the inconvenience with Haswell or Skylake systems -- is the time it takes to do a thorough test.

I've got two choices: (a) remove the 16GB kit and test the 8GB kit separately, or (b) add the 8GB kit and test all 24GB together. (a) is troublesome because I have a broken latch on a motherboard DIMM socket and need to leverage the RAM out of the slot with a tool -- like a screwdriver. (b) is preferred for not having to remove any RAM.

(b) is still a bitch because a 1000% coverage test with HCI Memtest 64 will take maybe 4 days with 16GB, and longer with 24GB.

What I do in a situation like this is to set my final choice of timings (mostly just CMD=1) from the beginning. If I tested the 16GB for 500% coverage initially, I can do OK with a 300% coverage test on 24GB.
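To get a rough idea of the run times involved, here is a back-of-envelope sketch. It assumes test time scales linearly with both the amount of RAM and the coverage percentage, and it uses my estimate of about 4 days for 1000% on 16GB as the reference point. The linear scaling is an assumption, not something measured, and the function name is only for illustration.

```python
def estimate_days(ram_gb, coverage_pct,
                  ref_ram_gb=16.0, ref_coverage_pct=1000.0, ref_days=4.0):
    """Estimate test duration in days by scaling a reference run linearly.

    Work is taken as (GB of RAM) x (coverage %); the reference run is the
    rough "1000% on 16GB takes about 4 days" figure, which is an estimate.
    """
    ref_work = ref_ram_gb * ref_coverage_pct
    work = ram_gb * coverage_pct
    return ref_days * work / ref_work


if __name__ == "__main__":
    for ram_gb, coverage in [(16, 1000), (24, 1000), (24, 300)]:
        days = estimate_days(ram_gb, coverage)
        print(f"{ram_gb}GB at {coverage}% coverage: about {days:.1f} days")
```

Under those assumptions, 1000% on 24GB comes out at about 6 days, while 300% on 24GB is a bit under 2.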

It's still going to take a couple of days. OR -- you run the HCI Memtest 64 program's Windows implementation under the OS while you do your "other work." I'd rather just run the boot-CD/DVD of HCI Memtest outside of Windows before feeling comfortable running Windows at all.