How does everyone evaluate stability now?

pantsaregood

Senior member
Feb 13, 2011
993
37
91
The "stability testing" thread comes from a simpler time when AVX didn't cause CPUs to reach ridiculous temperatures. In 2011, I'd throw Linpack at my 2500K for a few hours and be done with it.

Doesn't seem quite that simple any more.

I have a 6700K and a 6800K that need to be verified for stability. AVX-capable Linpack runs them about 15°C warmer than Linpack without AVX. Prime95 28.10 actually pushes them right up to 94°C, whereas Prime95 26.6 only drives them to about 70°C.

I understand that some newer programs (Intel XTU, Asus RealBench) don't produce absurd amounts of heat like the heavy AVX loads in Prime95 do. Are these programs useful for determining overall stability, or should I stick to running Prime95?

I'm under the impression that Prime95 just throws AVX/AVX2 instructions at these CPUs nonstop. I want AVX to be stable, but constant AVX instructions nearly push these CPUs to their thermal limit at stock speeds, let alone with heavy overclocks.
 

superstition

Platinum Member
Feb 2, 2008
2,219
221
101
This is one of the drawbacks of Intel's use of polymer TIM.

If you will not encounter a real-life workload that uses AVX2 heavily (long duration, full loading of the cores) then you can skip it. Otherwise, you're not truly testing for stability by skipping it.

I've heard that RealBench has shortened its tests, making them less useful as a stress test.

I undervolted my 6700K and was stable in the latest Prime95 without 90°C heat. But the trade-off is that I used a static-voltage undervolt rather than messing with offsets.
 

Hi-Fi Man

Senior member
Oct 19, 2013
601
120
106
I ended up using a combination of Prime95 26.6 and Asus RealBench to test my 4790K at 4.6GHz. For many years I used IBT, but running IBT on the chip pushed temperatures into the 90s almost immediately if I was over 4.4GHz. It was at this point that I cursed Intel for using TIM instead of solder on a K-series chip and prayed for Zen's success...

Interesting that your 6800K heats up like that too, because that chip is soldered. It must be an architectural thing.
 

Flapdrol1337

Golden Member
May 21, 2014
1,677
93
91
When I overclocked this machine, I set the power limiter in the BIOS to a value that it doesn't reach with the old Prime95 versions (before AVX2 was added). I also test with newer Prime95 running fewer threads (the thread bounces between cores, which keeps temps down and stays under the power limit) and with LinX (which starts and stops over and over).

I figured if it doesn't crash then it shouldn't crash in "real" software either. And the power limiter prevents a meltdown if you were to run some avx2 enabled transcoding software or something.
 

RichUK

Lifer
Feb 14, 2005
10,320
672
126
I think it just comes down to what your intended use of the processor is to best determine which version of Prime95 to test with.

28.10 just hammers the processor with FMA3. I believe 27.9 uses AVX but not AVX2, so that is the version I was comfortable with. The newer versions required more voltage to stabilise a given frequency, with the byproduct being more heat output to manage.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Linpack is a poor stability test, as it spends ~20% of the total problem-solving time in a low-stress state. It also checks for errors just once per loop, meaning you are potentially (in case of instability) wasting some serious time if you are using sufficiently large problem sizes. You also obviously need access to ample amounts of DRAM in order to use the larger problem sizes in the first place.

IMO Prime95 is completely unrivaled as a stress test. The stress levels it produces obviously do not represent a typical fully multithreaded modern workload by any means (it results in ~30% higher power draw than any other conventional workload), but it essentially guarantees that your system can handle anything else you can throw at it if it can handle Prime95.
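One way to act on this in scripts: Prime95 (and mprime on Linux) records torture-test results in a results.txt log, so a quick scan for error lines can flag a failed run. A minimal sketch -- the sample log lines below are illustrative, since exact wording varies between versions:

```python
# Hedged sketch: scan a Prime95/mprime results.txt-style log for
# hardware errors. The sample lines are illustrative of the log
# format, not copied from a real run.
def check_prime95_log(text: str) -> dict:
    """Count error and completion lines in a torture-test log."""
    errors = 0
    completed = 0
    for line in text.splitlines():
        upper = line.upper()
        if "FATAL ERROR" in upper or "HARDWARE FAILURE" in upper:
            errors += 1
        elif "TORTURE TEST COMPLETED" in upper:
            completed += 1
    return {"errors": errors, "completed": completed,
            "stable": errors == 0 and completed > 0}

sample = """\
[Worker #1] Self-test 1344K passed!
[Worker #2] FATAL ERROR: Rounding was 0.5, expected less than 0.4
[Worker #2] Hardware failure detected, consult stress.txt file.
Torture Test completed 25 tests in 1 hour - 1 errors, 0 warnings.
"""
print(check_prime95_log(sample))
```

A run is only called "stable" here if the torture test actually completed and no error lines appeared; a log with zero completions stays inconclusive.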
 

BonzaiDuck

Lifer
Jun 30, 2004
15,699
1,448
126
I spent a lot of time looking into these experiences with LinX and Prime95 before I was ready to OC my 6700K.

I also did something that reduced my temperatures regardless of cooling with air or water: I paid to have the processor de-lidded and re-lidded with Liquid Ultra. This gave me a verifiable cooling improvement of 12C.

But barring this, or the use of extra, superb water cooling, there is a short Prime95 test that will pass or fail your "stability":

Prime95 "1344" short test

[Screenshot: Prime95 "1344" short test settings]


If this passes for at least 15 minutes, the system is at least "close" to being totally stable. I've run this test for 30 minutes, an hour, and an hour and a half. It can throw an error after less than an hour; with some minor voltage tweaks, I was able to assure error-free running beyond an hour.

I also run LinX "affinitized" so that the program operates with only four threads -- one per core. This has the advantage that you can evaluate the GFLOPs across the test iterations to see if there are wild variations. The closer the GFLOPs cluster around their average, the less likely the processor is doing error correction because of insufficient voltage.

I had tuned three OC settings for the i7-6700K: 4.5 GHz, 4.6 GHz, and 4.7 GHz. The 4.7 setting was bootable and "apparently" stable, but untested and crude. I finally gave up my voltage apprehensions and took a slightly different approach to quickly finding the optimum setting for 4.7: I followed an engineering study of the 6700K which graphed the minimum load voltage necessary for each setting, and I was able to dial in a "perfect" 4.7 setting after entering the BIOS only twice.

The LinX results show a GFLOPs range of 2 GFLOPs, or 1 GFLOP on either side of the average. The actual variance or standard error would be somewhat less than that:

[Screenshot: LinX results at the 4.7 GHz / 1.408 V target]


The first iteration is always an "outlier" because that iteration is already running when you set the program's affinity, so you can throw out iteration #1. Anything else could be the result of some unforeseen but scheduled process, like an AV download and update, but if you watch it happen, you can identify the cause and mark another iteration as an "outlier." In this run I have none of those.
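The GFLOPs consistency check described above -- drop iteration #1, then see how tightly the rest cluster around their average -- can be sketched in a few lines. The numbers here are made up for illustration, not real LinX output:

```python
# Sketch of the GFLOPs-consistency check. Iteration values are
# hypothetical, not real measurements.
from statistics import mean, stdev

# GFLOPs reported by LinX per iteration (illustrative numbers).
gflops = [148.2, 182.1, 181.4, 182.6, 181.9, 182.3, 181.7]

# Iteration #1 is already running when affinity is set, so discard it.
samples = gflops[1:]

avg = mean(samples)
spread = max(samples) - min(samples)   # total range, e.g. "2 GFLOPs"
sigma = stdev(samples)                 # standard deviation

print(f"average: {avg:.1f} GFLOPs")
print(f"range:   {spread:.1f} GFLOPs (±{spread / 2:.1f} around the mean)")
print(f"stdev:   {sigma:.2f} GFLOPs")

# Wild variation between iterations hints at throttling or internal
# error recovery from insufficient voltage. The threshold is a
# judgment call, not a published spec.
suspicious = spread > 4.0
print("inconsistent" if suspicious else "consistent")
```

The 4-GFLOPs threshold is arbitrary; the point is simply to compare the spread against what your own chip shows at a known-stable setting.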

Note that the peak package temperature reached 77°C for the entire run of 25 iterations. This is fine for me -- but remember, I relidded my 6700K with CLU! So the OC'er -- if choosing to use these tests -- should probably pick the multiplier that keeps the temperature close to my own, and call it a day.

Otherwise, you're never going to "know" for sure.
 

Mr Evil

Senior member
Jul 24, 2015
464
187
116
mrevil.asvachin.com
This makes me wonder what the chip manufacturers do when binning chips. Do they test every possible combination of instructions? Just once, a few times, or continuously for an extended period? Does a chip pass if the results are correct, or do they measure electrical characteristics? I assume they also test things that don't apply to overclocking, like making sure it's stable with the voltage at the lower tolerance bound, at the maximum rated temperature.
 

StefanR5R

Elite Member
Dec 10, 2016
5,459
7,717
136
With Haswell-E and Broadwell-E, the question of whether or not to include AVX workloads in stability tests is moot again, because you can configure different multipliers for AVX and non-AVX workloads. Which makes a lot of sense given their strikingly different power consumptions.

Likewise, Haswell-EP and Broadwell-EP are wired up for different turbo frequencies, considering whether or not AVX is in play. (Actually, the trigger may in fact be AVX2, not just any AVX, but I am not sure about that.)

Broadwell-E and -EP improved upon that by applying this distinction per core, rather than globally for the whole processor.

I wonder when this useful feature will trickle down to the mainstream socket.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
StefanR5R said:
With Haswell-E and Broadwell-E, the question of whether or not to include AVX workloads in stability tests is moot again, because you can configure different multipliers for AVX and non-AVX workloads. Which makes a lot of sense given their strikingly different power consumptions.

Likewise, Haswell-EP and Broadwell-EP are wired up for different turbo frequencies, considering whether or not AVX is in play. (Actually, the trigger may in fact be AVX2, not just any AVX, but I am not sure about that.)

Broadwell-E and -EP improved upon that by applying this distinction per core, rather than globally for the whole processor.

I wonder when this useful feature will trickle down to the mainstream socket.

Broadwell-E/P is the first one to have AVX/2 frequency "bins". Haswell-E/P "bins" depend on the number of utilized cores.
 

EXCellR8

Diamond Member
Sep 1, 2010
3,979
839
136
I don't stress test my gear anymore, I just game. If it works fine for a few hours, I'm happy. If not, I try something else or dumb it down some.

You can run every bench under the sun and still encounter issues just by using the computer normally.
 

WhoBeDaPlaya

Diamond Member
Sep 15, 2000
7,414
401
126
memtest86+ to make sure the memory's in order (usually is since I run kits at stock)
IBT for quick probes
IBT for follow-up (100+ passes)
x264 Stability Test V2 for signoff

Worked well for me for everything up to Haswell. Don't own anything newer than that.
I should mention that I do "dumb" overclocking by using fixed Vcore and multi. None of that SpeedStep / Cool 'n Quiet and offset voltage nonsense.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,226
9,990
126
WhoBeDaPlaya said:
I should mention that I do "dumb" overclocking by using fixed Vcore and multi. None of that SpeedStep / Cool 'n Quiet and offset voltage nonsense.

I do too, but with a BCLK OC that's basically the only way to do it, since SpeedStep doesn't really function.
 

escrow4

Diamond Member
Feb 4, 2013
3,339
122
106
Overclocking is never stable. Never was, never is and never will be. If the chip you buy needs more puff buy a faster model. You can scream till you collapse that all those tests are 100% stable but 100% stable is stock out of the box.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,226
9,990
126
escrow4 said:
Overclocking is never stable. Never was, never is and never will be. If the chip you buy needs more puff buy a faster model. You can scream till you collapse that all those tests are 100% stable but 100% stable is stock out of the box.

There's a certain amount of truth to that. When I BCLK-OCed my i5-6400 to 4.455 GHz (165.0 BCLK), it passed one hour of the OCCT:CPU stress test, but after using Waterfox for a few days, sometimes a tab crashes -- which normally does not happen to me. So I'm left wondering if it's the OC. No BSODs or anything severe, not even an appcrash. Just some squirrelly behavior.

Edit: Then again, I'm not doing finance or engineering work on this PC, so I can "accept" a minor amount of instability. (As long as there are no BSODs or appcrashes, and my distributed computing WUs complete successfully.)
 

pantsaregood

Senior member
Feb 13, 2011
993
37
91
escrow4 said:
Overclocking is never stable. Never was, never is and never will be. If the chip you buy needs more puff buy a faster model. You can scream till you collapse that all those tests are 100% stable but 100% stable is stock out of the box.

This simply isn't true. It would imply that even if a CPU you owned passed Intel's or AMD's stress standards for a higher clock speed, it still wouldn't be considered stable.

Not every Core i5-6500 is a failed Core i5-6600. At least some of the CPUs that ultimately went to market as the i5-6500 could have been marked as the i5-6600. By your reasoning, "stable" is whatever Intel or AMD say it is. Granted, taking their word for it didn't really pan out for the 1.13 GHz Pentium III Coppermine, did it?

Also, running an FFT length of 1344K destroyed my AVX/AVX2 performance. The multiplier had to be dropped to 37 on the 6800K for AVX loads; standard loads pass at 4.4 GHz easily.

EDIT: I will accept that no overclock is truly stable if and only if we accept that no CPU is truly stable at any given speed.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,226
9,990
126
pantsaregood said:
EDIT: I will accept that no overclock is truly stable if and only if we accept that no CPU is truly stable at any given speed.

"Unstable at any speed"... Cyrix. :)

More realistically, maybe escrow4 really means that we just need to accept that Intel's binning testing is more thorough and exact than any amount of testing that we can do with software out in the field. Which is not outside the realm of possibility, I think.
 

superstition

Platinum Member
Feb 2, 2008
2,219
221
101
VirtualLarry said:
we just need to accept that Intel's binning testing is more thorough and exact than any amount of testing that we can do with software out in the field. Which is not outside the realm of possibility, I think.
The Stilt said AM4's droop spec is tighter than Intel's, as I recall. That implies that, with tighter LLC on some boards, better results can be had.

Plus, with delidding, there is the removal of the polymer TIM thermal bottleneck.

And, really... Piledriver is the poster child for how the "overclocking is never stable" mantra isn't always accurate. Unlocked multipliers, soldered chips, improved process, very loose AM3+ spec... Add those together and you have a recipe for overclocking room.

Overclocking simply demands more than what you get with a cheap board, cheap cooling, and a cheap PSU. The real key is that you have to hold yourself to a higher set of standards to stabilize an overclock.
 

BonzaiDuck

Lifer
Jun 30, 2004
15,699
1,448
126
I've given my share of thoughts about this either way.

There is no "universal stress test" -- it's better to use more than one. The trick is finding the right clock and voltage settings with the fewest BSODs along the way, the shortest time under stress, and the greatest certainty that errors will be uncovered if there are any to find.
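The "more than one test" idea can be automated: a small harness cycles through several stress tools and flags any that crash or exit with an error. This is only a sketch -- the two commands below are stand-ins using the Python interpreter itself, and in practice you would substitute your actual stress tools and much longer time limits:

```python
# Minimal sketch of a multi-test stress harness. The "suite" entries
# are placeholders; real stress commands (mprime, LinX, etc.) would
# go here with realistic time limits.
import subprocess
import sys

def run_stress(cmd, time_limit):
    """Run one stress command. Surviving the whole time window counts
    as a pass; a nonzero exit status counts as a fail."""
    try:
        proc = subprocess.run(cmd, timeout=time_limit)
        return proc.returncode == 0
    except subprocess.TimeoutExpired:
        return True  # still running at the deadline: no crash, call it a pass

# Stand-in "tests" using the Python interpreter, purely for illustration.
suite = [
    ("short-burn", [sys.executable, "-c", "import time; time.sleep(0.1)"], 5),
    ("crasher",    [sys.executable, "-c", "raise SystemExit(1)"],          5),
]

results = {name: run_stress(cmd, limit) for name, cmd, limit in suite}
for name, ok in results.items():
    print(f"{name}: {'PASS' if ok else 'FAIL'}")
```

Treating an expired timeout as a pass matches how these tools are used interactively: you let them run for a fixed window and call it good if nothing crashed or errored out.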

I think I started building computers in their entirety around 1994. Before that, it was various hardware kludges with what was already in the OEM box. But for a long time, even building machines from scratch, I wouldn't fool with BIOS settings, nor would I run the processor and other components "out of spec." I only began to do that around 2004.

If I weren't planning to overclock from the outset, I wouldn't spend my money on "K" processors or "-E" CPUs. [In fact, I had X79 and X99 system plans I never followed through on.]

But it's as simple as binary numbers. You either have problems with an overclocked system -- from insufficient voltage, too much heat, or anything else that contributes to instability -- or you don't. If you don't, the proof is in the computing result: again, a binary outcome. The processor either produces any number of results flawlessly, or it doesn't.

"Instability" can have any number of causes. Oftentimes, a BSOD can indicate a hardware or hardware-configuration problem -- which is just a polite way to express the outcome of an overclock exercise. And just as often, those same BSODs are the result of a problem with a driver, a conflict between two or more hardware items, or corrupted files or OS.

It isn't necessarily true that an overclocked computer is a "bad computer." It is more true that overclocking complicates the troubleshooting of problems that might exist or occur without overclocking.

But if one limits his ambitions to stay within most of the processor specs, the only departure from recommended processor settings is the speed in GHz. And if they didn't intend for consumers to change the CPU multiplier, they wouldn't have produced "K" processors with unlocked multipliers, and they wouldn't offer "overclocking insurance" for a price. And if the overclocked speed otherwise has nothing to do with any "unreliable" results, then it cannot be said that it's a bad practice.
 

rvborgh

Member
Apr 16, 2014
195
94
101
For my quad Opteron running extra spicy, it was memtest for a few hours and then IBT for over 500 passes. I should add that at the time, the HVAC at the office (where I was putting the thing together) was down, and it was winter -- the machine helped raise temps in the office by 5°F :) Rock-solid stability, though :)
 

WhoBeDaPlaya

Diamond Member
Sep 15, 2000
7,414
401
126
rvborgh said:
For my quad Opteron running extra spicy, it was memtest for a few hours and then IBT for over 500 passes. I should add that at the time, the HVAC at the office (where I was putting the thing together) was down, and it was winter -- the machine helped raise temps in the office by 5°F :) Rock-solid stability, though :)
Ugh, that reminds me of the days when I would wake up freezing in Midwest winters.
It's how I knew that my Bitcoin mining farm was on the fritz :p
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
WhoBeDaPlaya said:
I should mention that I do "dumb" overclocking by using fixed Vcore and multi. None of that SpeedStep / Cool 'n Quiet and offset voltage nonsense.

It's really the only way to go if you're serious about the performance and stability.

escrow4 said:
Overclocking is never stable. Never was, never is and never will be. If the chip you buy needs more puff buy a faster model. You can scream till you collapse that all those tests are 100% stable but 100% stable is stock out of the box.

Sometimes there isn't a faster model.
 

pantsaregood

Senior member
Feb 13, 2011
993
37
91
Testing the 6800K with FFT length 1344K showed me just how poorly Broadwell-E scales with voltage.

Stable at 1.24V/4.2 GHz. 4.3 GHz isn't stable at 1.46V. AVX is ridiculously unstable above 3.6 GHz.
 

lefenzy

Senior member
Nov 30, 2004
231
4
81
escrow4 said:
Overclocking is never stable. Never was, never is and never will be. If the chip you buy needs more puff buy a faster model. You can scream till you collapse that all those tests are 100% stable but 100% stable is stock out of the box.

I agree. Why bother with the heat and noise.

I'd rather undervolt (which also has potential instability issues).