Overclocking TWO different 5820k. Similar problems.

Alpha0mega

Member
Aug 26, 2010
73
1
71
Late last year, I built a new system with the following specs:

i7 5820k
MSI X99S SLI Plus
Noctua NH-D15 (dual fans)
G.Skill DDR4 2400MHz 16GB
GTX 970 x 2 in SLI
Seasonic S-12D 850W (carried over from previous build)

I tried overclocking. I managed to hit 4.0GHz with almost no additional voltage. I then tried going to 4.4GHz at 1.2v (no boot). Increased to 1.25v (boot, but hang at desktop). At 1.27v, the desktop seemed stable, but Prime95's Small FFT test either produced near-instant errors in one or more of the test threads, or there was a hang. This happened with VDroop correction at any setting (auto to 100%). I thought that the processor just wasn't a great overclocker and settled for 4.0GHz at 1.1v.

Well, a few weeks ago my motherboard died, killing the CPU in the process. RMA'ed both, and got an MSI X99A SLI Plus (basically the same, but with two USB 3.1 ports) and a new i7 5820k (different batch no. and newer date of manufacture too). I tried oc'ing once again, thinking maybe the silicon lottery gods would be kinder this time.

Now I am finding the same limits. Prime95, Small FFT, 4.4GHz at 1.25v results in a very quick system crash. Went up to 1.3v, and within seconds cores started to cross the thermal limit (RealTemp/AIDA64), so I aborted the test. Going down to 1.27v resulted in quick errors in some of the test threads.

Interestingly, Tom's recently did an overclocking test of 5 retail 5960X processors, on an MSI X99S XPower AC with an NZXT Kraken X41 cooler. These are their results*:

[Image: Hapnqvm.png, table of Tom's overclocking results for the five 5960X samples]


* Table has been created by me, for easier comparison. The original page is http://www.tomshardware.com/reviews/overclocking-retail-intel-core-i7-5960x-cpu,4237.html

This is on RealBench, for one hour, not Prime95. Still, I am surprised that neither of my 5820k CPUs could get to even 4.4 on 1.27v. I am aware that this is a very small sample size, but 40% of processors are getting to 4.5 at 1.2v (perhaps even 60%, if Sample 4 was able to get 4.5 at 1.2v).

Again, very small sample size, so probability can't be reliably calculated, yet my two CPUs can't even get to 4.4GHz! Tom's has a better mobo, but I have a better cooler, and the 5820k should run cooler as well as having two fewer cores that could be the weak links.
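
For a rough sense of how unlucky two weak chips in a row would be, here is a minimal back-of-the-envelope sketch. It assumes (and this is only an assumption) that the 40% or 60% hit rate from the five 5960X samples carries over to the 5820k and that the two chips are independent draws. The function name `chance_all_miss` is made up for illustration. The test conditions also differ (RealBench vs Prime95), so treat the numbers as a sanity check, not a real estimate.

```python
def chance_all_miss(hit_rate: float, chips: int) -> float:
    """Probability that none of `chips` independent CPUs reaches the target,
    given an ASSUMED per-chip hit rate (not measured for the 5820k)."""
    return (1.0 - hit_rate) ** chips


# 40% and 60% are the two hit rates read off the five-sample table.
for rate in (0.4, 0.6):
    miss = chance_all_miss(rate, chips=2)
    print(f"assumed hit rate {rate:.0%}: chance both chips miss = {miss:.0%}")
```

Under those assumptions, two duds in a row would not be a rare event, which is consistent with plain bad luck but does not rule out a setup problem.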

Am I just unlucky or doing something wrong?
 

MagickMan

Diamond Member
Aug 11, 2008
7,460
3
76
Don't test a Haswell-E w/ Prime95, ever. Many people have damaged their mobos and/or CPUs that way, even after only 15-20 minutes. Nothing in the real world is going to hit your CPU that hard. Stick with Realbench or Aida64 on long tests, >6 hours, to verify stability under load. You may very well have damaged your first mobo and CPU w/ Prime95 the first time; the effects simply took some time to materialize.

I don't have experience w/ the 5820k but my 5930k takes 1.28v to reach 100% stability @4.4GHz and 1.3v for 4.6 (and that's where I draw the line on voltage, to minimize electromigration before I'm ready to upgrade again in a few years). It won't do 4.7GHz even at 1.33v.
 

Alpha0mega

Member
Aug 26, 2010
73
1
71
Damn! Never heard of a Prime95/Haswell-E problem. Is that a confirmed thing? Any links? I have already run Prime95's Small FFT test on the new board/CPU for around 2 hours, and what you said is making me nervous. Now if something even a little off happens with the system, it will have me wondering if this is the day it dies, in perpetuity.

For AIDA64, in the System Stability Test, should I just test the CPU or the FPU and/or cache as well?
 

MagickMan

Diamond Member
Aug 11, 2008
7,460
3
76
http://www.overclockers.com/forums/...-CPU-s-and-you-is-it-bad-for-the-CPU-s-health

Essentially it hammers your VRMs, and while okay to use at stock voltage, in overvoltage situations it will eventually damage your VRMs, motherboard socket, and in turn, your CPU, due to the massive amount of heat generated. Now, if you're watercooling everything, including a plate for your VRMs, then you may be fine, but most people don't. Either way, I don't think it's worth the risk and tell people to just stick with Realbench, Aida64, OCCT, and Cinebench.

A great test for extreme loads is OCCT's Linpack (w/o AVX), 30 minutes of that will shake loose nearly anything related to CPU stress failures.
 

zir_blazer

Golden Member
Jun 6, 2013
1,239
537
136
AVX handling on Haswell-E is different from consumer Haswell. Haswell overvolts and maintains frequency; Haswell-E instead has a separate table with lower base and turbo frequencies during AVX loads. Haswell-E is absolutely better in that regard. Sadly, I don't know if Broadwell and Skylake do it that way, too.


Don't test a Haswell-E w/ Prime95, ever. Many people have damaged their mobos and/or CPUs that way, even after only 15-20 minutes. Nothing in the real world is going to hit your CPU that hard. Stick with Realbench or Aida64 on long tests, >6 hours, to verify stability under load. You may very well have damaged your first mobo and CPU w/ Prime95 the first time; the effects simply took some time to materialize.
That is ridiculous. The fact that nothing a typical consumer would use generates such stress doesn't mean it's tolerable that letting an application run can actually damage your system. Prime95 is not the only power virus. What if, instead of the Bitcoin GPU mining viruses we already had, the next wave includes a CPU miner that uses AVX? You get the idea: it's not typical, but it's possible.
If it not only fails but gets damaged under stress in possible usage scenarios at default values, there is something seriously wrong. LGA 2011 is supposed to be serious business; that would be intolerable on a Xeon E5.
 

Alpha0mega

Member
Aug 26, 2010
73
1
71
Thanks for the link. I hope I got lucky and didn't harm my system with the Prime95 test. At 4.4GHz at 1.235v, I successfully completed a 2 hour RealBench stress test. Failed it at 1.22v, so the sweet spot should be somewhere between the two. I wasn't able to get 4.5 at 1.25v. Will do some more testing.
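
Narrowing down a voltage between a known fail (1.22v) and a known pass (1.235v) is basically a bisection search. A minimal sketch follows, assuming stability only improves as voltage rises. The names `lowest_stable_vcore` and `is_stable` are made up for illustration: in practice `is_stable` would be you running a long RealBench pass at that voltage and reporting the result, not something a script can decide.

```python
from typing import Callable


def lowest_stable_vcore(fail_v: float, pass_v: float,
                        is_stable: Callable[[float], bool],
                        step: float = 0.005) -> float:
    """Bisect between a voltage known to fail and one known to pass.

    Stops once the window is no wider than `step` volts and returns the
    lowest voltage that was actually confirmed stable.
    """
    lo, hi = fail_v, pass_v
    while hi - lo > step + 1e-9:
        mid = round((lo + hi) / 2, 3)  # round to 1 mV for the BIOS entry
        if is_stable(mid):
            hi = mid   # passed: try lower next
        else:
            lo = mid   # failed: need more voltage
    return hi


# Stand-in check with a MADE-UP threshold, only to show the search running.
pretend_threshold = 1.23
result = lowest_stable_vcore(1.22, 1.235, lambda v: v >= pretend_threshold)
print(f"lowest confirmed-stable setting in this demo: {result:.3f} V")
```

With a 15 mV window and a 5 mV step this takes only two test runs, which matters when each run is a multi-hour stress test.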
 

YBS1

Golden Member
May 14, 2000
1,945
129
106
That is ridiculous. The fact that nothing a typical consumer would use generates such stress doesn't mean it's tolerable that letting an application run can actually damage your system. Prime95 is not the only power virus. What if, instead of the Bitcoin GPU mining viruses we already had, the next wave includes a CPU miner that uses AVX? You get the idea: it's not typical, but it's possible.
If it not only fails but gets damaged under stress in possible usage scenarios at default values, there is something seriously wrong. LGA 2011 is supposed to be serious business; that would be intolerable on a Xeon E5.

A Xeon E5 can't be overclocked by more than a gigahertz, you're comparing oranges to baseballs. If you are going to be using something that is absolutely intent on hammering the AVX extensions then you would be best off running at Intel validated speeds, then it's on them.

He goes into greater detail over on the ROG forums but from the link above:
"The newer versions of Prime load in a way that they are only safe to run at near stock settings. The server processors actually downclock when AVX2 is detected to retain their TDP rating. On the desktop we're free to play and the thing most people don't know is how much current these routines can generate. It can be lethal for a CPU to see that level of current for prolonged periods." -Raja
 

zir_blazer

Golden Member
Jun 6, 2013
1,239
537
136
A Xeon E5 can't be overclocked by more than a gigahertz, you're comparing oranges to baseballs. If you are going to be using something that is absolutely intent on hammering the AVX extensions then you would be best off running at Intel validated speeds, then it's on them.
Xeon E5 has a very slim margin for overclocking. You don't have an Unlocked Multiplier to play with, maybe only forcing max Turbo via Multi Core Enhancement. And since everything depends on the Base Clock, you don't have a lot of freedom to raise it without getting issues somewhere else (AHCI Controller, PCIe Bus, etc). I think the only guy who managed to get anywhere with a Xeon E5 was XtremeSystems FUGGER with the E5 2698V3.

He goes into greater detail over on the ROG forums but from the link above:
"The newer versions of Prime load in a way that they are only safe to run at near stock settings. The server processors actually downclock when AVX2 is detected to retain their TDP rating. On the desktop we're free to play and the thing most people don't know is how much current these routines can generate. It can be lethal for a CPU to see that level of current for prolonged periods." -Raja
This is what I said earlier about Haswell and Haswell-E. I suppose that the downclock should apply to LGA 2011 Core i7s since they're, after all, the same thing, so I take it that "on the desktop" means LGA 1150 Haswells.
 

YBS1

Golden Member
May 14, 2000
1,945
129
106
This is what I said earlier about Haswell and Haswell-E. I suppose that the downclock should apply to LGA 2011 Core i7s since they're, after all, the same thing, so I take it that "on the desktop" means LGA 1150 Haswells.

I don't know how they would behave if a person ran them under "setup defaults"; it's possible they would do the same? HEDT enthusiast boards pretty much allow us to run amok, though, and defeat many of the safety margins Intel built in.
 

Alpha0mega

Member
Aug 26, 2010
73
1
71
Well, my replacement board is dead! Same problem as before. Don't know if it was because of Prime95 or not, though I had run it far less than I had on the first board, and the original board lasted for 9 months after the initial setup and testing, while this one was on for less than 10 hours (only 2 of those were spent testing with Prime95).

Uncertainty notwithstanding, I am never touching Prime95 again.
 

MagickMan

Diamond Member
Aug 11, 2008
7,460
3
76
Ouch. Yeah, that's very possible. Avoid any stress test that includes AVX extensions; you don't particularly want >300W pulled through your CPU socket.
 

jj109

Senior member
Dec 17, 2013
391
59
91
At this point I'm thinking it's your power supply or some setting you touched in the BIOS that's killing your equipment, and Prime95 is just a big red herring.

I can see high AVX2 load damaging the FIVR if the input voltage is held too high, but killing a 12-phase VRM or burning a >2000 pin socket that's mostly power/ground? If that's the case, our GPUs should be melting every day.

But do get another replacement... I'm morbidly curious.
 

Alpha0mega

Member
Aug 26, 2010
73
1
71
I am getting a new power supply, even though I doubt my current one is faulty. As for the BIOS settings, the only changes I made were setting the multiplier, setting the voltage to adaptive and changing the core voltage. I had played around with LLC/Vdroop on my first mobo (the one that lasted 8 months).

For the current mobo that died, I set the multiplier to 40 and voltage at 1.1v, then ran Prime for 1hr 45m without problem. Pushed up to 44 at 1.25v, Prime95 threw errors after a few mins, then tried 1.299v (just out of curiosity, anything over 1.25v isn't worth it for me). The last led to the cores quickly starting to hit 100c, so I aborted the test.

Then posted this thread, and based on the replies, started using RealBench/AIDA64 to test my system. Got it stable (2 hr RealBench stress test) at 4.4GHz at 1.235v. Ran some 3DMark Firestrike tests too. Shut down for the day, and found the system dead the next day when I powered on.

Also, the X99 SLI Plus is an 8-phase board.
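
To put the ">300W through the socket" figure next to the phase count, here is a rough sketch of the current involved, using I = P / V split evenly across phases. The 300 W and 8 phases come from this thread. The 1.9 V FIVR input voltage is an assumed ballpark, not something measured or stated here, and the function names are made up for illustration. It ignores VRM losses and uneven phase loading.

```python
def rail_current_amps(power_w: float, rail_v: float) -> float:
    """Current drawn at a given rail voltage: I = P / V."""
    return power_w / rail_v


def per_phase_amps(power_w: float, rail_v: float, phases: int) -> float:
    """Current per VRM phase, ASSUMING perfectly even sharing."""
    return rail_current_amps(power_w, rail_v) / phases


POWER_W = 300.0  # the ">300W" figure mentioned in this thread
PHASES = 8       # X99 SLI Plus phase count as stated above

# 1.3 V is a core voltage discussed in the thread; 1.9 V is an ASSUMED
# FIVR input voltage, which is what the board's VRM would actually supply.
for label, volts in (("core voltage", 1.3), ("assumed FIVR input", 1.9)):
    total = rail_current_amps(POWER_W, volts)
    each = per_phase_amps(POWER_W, volts, PHASES)
    print(f"{label} at {volts} V: {total:.0f} A total, {each:.1f} A per phase")
```

Whether that per-phase load is a problem depends on the board's power stages and their cooling, which nothing in this thread specifies.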
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,732
155
106
Lately I've been wondering about the impact of CPU package trace cross section and the decoupling caps on modern CPUs. Did the 5820K not get the same rework in this area as Devil's Canyon?

Here is an example of what I've been thinking about:
3630Fig03.gif


Just going by this alone, it would seem highly likely that a package is tuned for specific frequencies. In addition to capacitor values, add trace cross section size, length, current, and temp, and things get very complex.
Also, I learned long ago not to expect to reach the clocks/speeds seen in reviews. More often than not they just boot and take a screenshot; it's not like they are using the systems daily/weekly at the settings they test.
 
Nov 26, 2005
15,189
401
126
http://www.overclockers.com/forums/...-CPU-s-and-you-is-it-bad-for-the-CPU-s-health

Essentially it hammers your VRMs, and while okay to use at stock voltage, in overvoltage situations it will eventually damage your VRMs, motherboard socket, and in turn, your CPU, due to the massive amount of heat generated. Now, if you're watercooling everything, including a plate for your VRMs, then you may be fine, but most people don't. Either way, I don't think it's worth the risk and tell people to just stick with Realbench, Aida64, OCCT, and Cinebench.

A great test for extreme loads is OCCT's Linpack (w/o AVX), 30 minutes of that will shake loose nearly anything related to CPU stress failures.


Sorry to hijack this thread, but what about IBT & Prime95 on a Xeon Westmere W3690 on socket 1366?
 

MagickMan

Diamond Member
Aug 11, 2008
7,460
3
76
I doubt that they can be OCed to levels that cause this issue, but I could be wrong.