bntran02

Member
Jun 7, 2011
87
1
66
So...lately my computer has started to crash. The symptom is a blue screen reporting an unrecoverable error, and it first shows up at random. Once it starts, it happens every time I reboot. It is the same or a similar screen to the one I get when I overclock the memory or CPU too much.

My System:
ASRock X99 Extreme 4
16GB x 4 Corsair DDR4 Dominator ram sticks, stock speeds
Intel 5820K CPU @ 1.275V, 4.3GHz
Corsair HX something watercooling
Other stuff...

So, obviously, overclocking will cause the issue but it still occurs if I run everything at stock.

The issue seems to resolve itself temporarily (for a few months) if I raise my DRAM voltage. I raised it from 1.2V to 1.3V and it worked. Eventually, it stops working and I have to raise it again. At this point, it is at 1.38V and still not stable. It crashes immediately after Windows boots and I log in.

If I remove a single RAM stick then everything works fine again (at least for now, not sure for how long). It doesn't matter which stick I remove or which slot I remove it from. Re-ordering the RAM sticks also doesn't matter. It also doesn't matter which voltage I run it at, 1.2V or 1.38V. If I am using 1-3 sticks then it works fine. I do not have other brands of RAM sticks to try.

So at this point, the theory is that the motherboard is the issue and a component has slowly failed that controls the DRAM voltage. Having more ram sticks would increase the load preventing some voltage regulator from working. Raising the voltage would temporarily fix the issue until it "fails harder".

Anything else I can try? At this point I have not spent money trying to fix it but think a new motherboard is the only solution. Of course, I can just call it a first world problem and run with 48GB of ram and see if it ever fails. That's way more than I typically use anyways.
 

bntran02

Member
Jun 7, 2011
87
1
66
Listen to XM. You're trying to isolate the issue. Download and run MemTest. Test one stick of ram at a time.

I did the inverse of that. Instead of one stick at a time, it was 3 sticks at a time. Each memory stick took turns being "out". Then a memtest was run for hours on each combination. Every combination was stable except when all 4 sticks of RAM were used.
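
For anyone repeating this, the "each stick takes a turn being out" rotation can be written down as a short checklist. This is only a sketch: the stick labels are hypothetical names for the four modules, and the script only prints the plan, it does not run any memory test.

```python
from itertools import combinations

# Hypothetical labels for the four modules; mark the real sticks however you like.
STICKS = ["stick1", "stick2", "stick3", "stick4"]

def leave_one_out(sticks):
    """Return (installed, removed) pairs, one for each stick left out."""
    plans = []
    for installed in combinations(sticks, len(sticks) - 1):
        # Exactly one stick is missing from each 3-stick combination.
        removed = [s for s in sticks if s not in installed][0]
        plans.append((list(installed), removed))
    return plans

if __name__ == "__main__":
    for installed, removed in leave_one_out(STICKS):
        print(f"install {installed}, leave {removed} out, then run the memory test")
```

If all four 3-stick runs pass and only the full 4-stick run fails, no single module is implicated on its own, which matches what is described above.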
 

ch33zw1z

Lifer
Nov 4, 2004
37,759
18,039
146
Drop the overclock, then memtest86+ each stick by itself for 24 hours. Make sure your BIOS is on the latest version. Check the CPU socket for bent pins and inspect each memory slot for bent pins.
 

bntran02

Member
Jun 7, 2011
87
1
66
When you say stock speeds, you mean you're running your memory at 2133MHz, correct?
Yes, at 2133MHz. It's actually rated for 2666MHz so it's already downclocked. I tried downclocking further and loosening the latency numbers. No difference.

Drop the overclock, then memtest86+ each stick by itself for 24 hours. Make sure your BIOS is on the latest version. Check the CPU socket for bent pins and inspect each memory slot for bent pins.
  • Dropping the overclock on the CPU was the first thing I did. No difference.
  • BIOS is the latest. That was the second thing I did. No difference.
  • Removing memory sticks was the third thing I did. I removed one stick at a time. The only thing I have not done is a 24-hour test of each memory stick by itself. I ran several memtests on several combinations of 1-4 sticks, about 10 different combinations. In trying to isolate the problem, the only pattern I can make out is that the system crashes immediately after the login screen once a fourth stick is inserted. I also tried different RAM slots. Each test ran for several hours, not 24.
  • I have not checked whether the CPU socket pins are bent, but isn't that extremely unlikely? It's been running for years without the CPU being removed. I don't see how a pin can become bent while under the cooling block.
  • I also physically checked the memory. Looks good.
 

bigboxes

Lifer
Apr 6, 2002
38,574
11,968
146
Although you are probably going in the right direction, you want to test each module separately just as SOP in the troubleshooting process. Sounds like a motherboard issue or possibly the IMC of the CPU. I am leaning toward a bad mobo. When I built the rig in my sig, I found (through troubleshooting) that I had some bad ram slots. After I exchanged my mobo, I found that I had a couple of bad memory modules. Talk about bad luck. Unfortunately, it was past 30 days and I couldn't exchange the ram. Luckily, I was able to RMA to G.Skill without issue. Let us know how this all turns out for you.
 

bntran02

Member
Jun 7, 2011
87
1
66
As a PC builder, you will end up with spare parts. It's inevitable.
After ~20 years of building PCs, at some point there is a conflict of interest between your wife's desire to declutter and your own desire to hold on to what is dear to you. Happy wife, happy life. So....
 

ch33zw1z

Lifer
Nov 4, 2004
37,759
18,039
146
No doubt lol....I've been with my Mrs for 15 years, and I definitely don't bring stuff home or buy stuff I don't need, I know the calamity that ensues.

I do however keep spare parts. I have a rack of storage. Used to be 2...but you guys know how that goes.
 

bntran02

Member
Jun 7, 2011
87
1
66
So I ran a bunch of memtest86+ tests...The full test was run until completion. 0 errors. Slot names below are based on the manual's labeling for the 8 RAM slots.

1 stick, slot A1
1 stick, slot B1
1 stick, slot C1
1 stick, slot D1
4 sticks, slots A1-D1

All tests pass. But Windows crashes only with the 4-stick configuration.

All other RAM slots cause 3 BIOS beeps, repeating. I haven't looked into the meaning yet. But really strange. This is the first board I've had with 8 slots. Maybe it's a configuration issue, or slots A1-D1 need to be populated first? I'm looking into it...

The scenarios tried in those other slots were:
4 sticks, slot A2-D2
3 sticks, slots B2-D2
2 sticks, slots C2-D2
1 stick, slot D2
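
To make the pattern easier to see, the results above can be laid out as data and grouped by outcome. This is a rough sketch: the outcomes are transcribed from this post, while the key names and the labels "pass", "windows_crash" and "beep" are my own. The check at the end only tests whether the beeps line up with configurations that use nothing but the "2" slots; it does not prove that A1-D1 must be populated first.

```python
from collections import defaultdict

# Outcomes transcribed from the post above; keys and labels are my own naming.
# "pass"          = memtest86+ completed with 0 errors and Windows did not crash
# "windows_crash" = memtest86+ completed with 0 errors but Windows crashes
# "beep"          = 3 repeating BIOS beeps
RESULTS = [
    {"slots": ["A1"], "outcome": "pass"},
    {"slots": ["B1"], "outcome": "pass"},
    {"slots": ["C1"], "outcome": "pass"},
    {"slots": ["D1"], "outcome": "pass"},
    {"slots": ["A1", "B1", "C1", "D1"], "outcome": "windows_crash"},
    {"slots": ["A2", "B2", "C2", "D2"], "outcome": "beep"},
    {"slots": ["B2", "C2", "D2"], "outcome": "beep"},
    {"slots": ["C2", "D2"], "outcome": "beep"},
    {"slots": ["D2"], "outcome": "beep"},
]

def group_by_outcome(results):
    """Map each outcome label to the list of slot configurations that produced it."""
    groups = defaultdict(list)
    for result in results:
        groups[result["outcome"]].append("+".join(result["slots"]))
    return dict(groups)

def only_second_slots(result):
    """True when every populated slot is a '2' slot (A2, B2, C2, D2)."""
    return all(slot.endswith("2") for slot in result["slots"])

def beeps_match_second_slots(results):
    """True when 'beep' occurs exactly for the configurations using only '2' slots."""
    return all((r["outcome"] == "beep") == only_second_slots(r) for r in results)

if __name__ == "__main__":
    for outcome, configs in group_by_outcome(RESULTS).items():
        print(outcome, configs)
    print("beeps line up with '2'-only configs:", beeps_match_second_slots(RESULTS))
```

Every beeping configuration leaves the "1" slots empty, so the beeps look like a separate slot-population question rather than part of the 4-stick Windows crash.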
 

bntran02

Member
Jun 7, 2011
87
1
66
Is the memory you are running on the supported list?
www.asrock.com/mb/Intel/X99%20Extreme4/?cat=Memory
Are you running the memory sticks in the suggested slots?
What BIOS version are you running (three versions are listed as improving memory compatibility)?

Well, it looks like my RAM is not on the list. I honestly have never looked at RAM compatibility lists, as it has always worked no matter what. I am running the latest BIOS unless there was an update in the last week or so.

That still doesn't explain why it was bulletproof for a couple of years and slowly got worse.
 

Ketchup

Elite Member
Sep 1, 2002
14,545
236
106
Well, it looks like my RAM is not on the list. I honestly have never looked at RAM compatibility lists, as it has always worked no matter what. I am running the latest BIOS unless there was an update in the last week or so.

That still doesn't explain why it was bulletproof for a couple of years and slowly got worse.

Two things I can think of:
1. Hardware degradation
2. A Windows update. What version of Windows are you running? I looked through the thread and didn't see it mentioned. Remember that Windows 10 will automatically install its own driver updates, separate from Windows Update, by default. This option can, thankfully, be turned off.
 

kirbyrj

Member
Aug 5, 2017
122
27
61
Could it be that the memory controller on your CPU is getting spotty after years of OCing?

How long did you run memtest? 1 hr? 8 hrs? 24 hrs?
 

bntran02

Member
Jun 7, 2011
87
1
66
Could it be that the memory controller on your CPU is getting spotty after years of OCing?

How long did you run memtest? 1 hr? 8 hrs? 24 hrs?

Basically every night for a week.
8 hours overnight per ram stick and stopped only because it completed.
16 hours overnight with all 4 sticks installed. I ran this test twice.

I've been leaning towards the motherboard since the beginning and have just been trying to prove otherwise. I guess the CPU is a possibility. But I've also run Intel burn-in tests for hours with all 4 sticks. I ran this every time I had to increase the RAM voltage, to make sure it was stable.

Something in a Windows update is also a possibility. Windows always boots. It only crashes after logging in. But the fact that increasing the voltage restored stability conflicts with the Windows theory...
 

Ketchup

Elite Member
Sep 1, 2002
14,545
236
106
Traditionally, bumping up RAM voltage doesn't cause the kind of degradation it would with a CPU. And with that many RAM slots full, I would plan on running above your base voltage at all times. Now, I haven't heard the voltage discussion since the DDR2 and DDR3 days, so I can't tell you with certainty that DDR4 is as resilient.
 

bntran02

Member
Jun 7, 2011
87
1
66
Traditionally, bumping up RAM voltage doesn't cause the kind of degradation it would with a CPU. And with that many RAM slots full, I would plan on running above your base voltage at all times. Now, I haven't heard the voltage discussion since the DDR2 and DDR3 days, so I can't tell you with certainty that DDR4 is as resilient.

To clarify, the RAM voltage was gradually bumped up from 1.2V to 1.38V over the course of about a year. But 1.2V was used during all testing.