Something "Hinky" About Win 10, stress programs and OC settings

BonzaiDuck

Lifer
Jun 30, 2004
16,476
1,949
126
So.

I have scattered the blather about my marvelous i7-6700K "system-on-air" across several forums, beginning with my temperature results for a system with a processor-CLU-re-lidding.

I got my initial clock settings done for a notch lower than initially-proven expectations using Windows 7. The processor was binned at 4.8; my best -- limited by my personal voltage concerns -- was 4.7 with maximum loaded VCORE at 1.408V.

There were other things that took my attention. A couple of driver-instability problems (even in Win 7) caused me to reset at stock and then pick 4.6 for 24/7. But I knew I had 4.7 in my back pocket.

I had also added Win 10 in a dual-boot configuration -- also much discussed in my scatter of thread posts here and there. [Much discussed because of runaway fingers and thoughtless self-absorption, perhaps . . . or just enthusiasm.]

Attention was then turned to the storage system. I began to work more and more with Win 10. The worst unexpected hurdle I faced came in the form of Build 1703, both for the dual-boot BCD issues and an undesired notice from Windows at boot-time about a drive volume. And I'm on top of the world again for that. Beautiful! Superb! Su-u-u-perior! Super!

I told myself this morning that it was time to "get back" to 4.7 Ghz, but it needed validation with some stress runs.

I got through 25 iterations of affinitized LinX without error, but the GFLOPS were not "tightly grouped," and I was suspicious of the "residuals" column, whose values should all be the same. I ran the Prime95 test for voltage and speed stability -- an "in place" Prime95 custom setting with Min and Max FFT size of 1344. 15 minutes was supposed to prove it; it ran error-free for 90 minutes.

Then, I decided to run Prime95 small FFT. System would throw an error in a single thread between 1 minute and 6 minutes. Decided to double-check with OCCT's own CPU test. Error in 6 minutes.

I started tweaking the VCCIO and VCCSA in small increments -- no cigar. Reduced the memory speed from 3,200 MHz DDR to 3,000 -- no cigar.

Started bumping up the VCORE. Moving up about 10 mV, OCCT would run an hour and a half before erroring out. Tried Prime: about 10 minutes.

So last round of attempts, I had pushed the voltage to a point where VID was showing 1.44V and minimum loaded VCORE wouldn't drop below 1.424V.

No cigar.

Back to 4.6 Ghz, VCORE at 1.37V with a 20mV margin for comfort -- No Problemo.

Prime95 small FFT just keeps on ticking, the processor stands up to the licking.

Has anyone noticed this? That your stress tests would leave you with different limits in different OS versions? That somehow, certain stress tests would fail quickly at the same settings in one OS version compared to the other?

Folks had been telling me recently "Why do you keep Win 7 bootable?! It's ten years old!"

Whatever -- but there's a discrepancy here. I also have a couple theories about what you get with a binned processor. I don't blame Silly Lots, but it just stands to reason.

I am sure that my processor has not degraded. And I will return to Win 7 and re-run the tests of several months ago at the 4.7 setting. But this peculiar difference I had noticed very early in the game -- I just didn't give it my attention for a long time. I've only started working with Win 10 a lot more in the last two months . . . .
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
4.6 at 1.37V is where you are. And it's fine.
Win 10 is not messing things up, but stress-testing Intel is just hard work. The hinks and lockups sometimes come after a long run, seemingly at random and without sense. That's my experience, though I haven't noticed OS differences as you have.

As H points out, Zen is actually easier here. It's more like it either works or it doesn't. Probably because it's more like a wall, so you seldom work on the edge.

Perhaps Intel's process is more scalable for frequency and voltage -- spans a greater area/frequency range -- and therefore the limits or edges are bigger? Just speculation, of course.
 

BonzaiDuck

You're probably right. I just think it raises a point I may have tried to make before about what people report, the circumstances of their claims, and fuzzy standards.

Here's what I mean. I can easily boot into 4.7 Ghz at the voltage I have set now. There are several stress tests that will pass. You might be able to game and multi-task all week long with adding only 20mV; it might sustain some more demanding rendering tasks. And some folks might say "the real stress test is real-world application." But if it can't pass Prime95 sFFT at 4.7 and the voltage shown by engineering studies to run at that speed (1.408V), then it's not completely reliable. And there's no sense in volting it that high if you can't be sure.

I'm definitely going to go into BIOS tomorrow, select the 4.7 profile, and see what happens again in Win 7 so I can be sure.

But like I said, I noticed it back in November, after adding the Win 10 OS. I think that's why I called it a day, because I was doing more work in Win 7 -- hadn't grown all that comfortable with Win 10. I'm absolutely sure that a 25-iteration of Linx with affinity set to 0,2,4,6 gave me GFLOPS in Win 7 that traversed a statistical range of only 2 GFLOPS at most, and the residuals were all identical as they should have been. And the load voltage showed to be exactly 1.408V.

Probably the best way to look at this, if I can't find some miraculous tweak, is that Prime95 would be pushing the temperatures to 84C if it weren't for my cooling strategy, which gives a peak package temperature under LinX or Prime95 of about 72C at 4.6 GHz.
===========================
And an afterthought a day later -- I've changed my sig per my 24/7 setting -- determined to be modest and objective after this discovery. Looking again at reviews I read for the ASUS Z170 boards and my own assessment of the specs, the failure to achieve total stability at 4.7 for this processor is just as likely due to the difference in motherboard power-phase design. My board gets 8+4 and only 8 are relevant to the CPU exclusive of iGPU. The Maximus and Pro/Deluxe boards have 12+4.

I think I lose 100 MHz for cooling (but so would several dual-fan H20 systems), and I lose 100 MHz for my board. "Stability" begins at around 1.385V for 4.7, but it still won't pass Prime95 in Win 10 at ~1.424V. I guess this is what happens when you put budget before end-project performance. But -- it's good enough for me . . . Still, I'm going to re-run tests in Windows 7. Maybe today . . .
 

BonzaiDuck

Hmm. The plot thickens.

I'm currently running affinitized LinX under WINDOWS 7. Doesn't miss a lick. I can post the screenies, but all the residuals are identical, no errors occurring, and the statistical range (lowest to highest) for GFLOPS is < 2 GFLOPS -- a nice, tight grouping. I am going to run the full 25-iteration run on this, and then turn my attention to Win 7 with Prime95 sFFT. Remember -- the Win 10 LinX results were all over the map -- in GFLOPS and in residuals that would indeed indicate error (although the program didn't stop -- as it's supposed to.)

I'll merely update this post. However, at this point I will note a difference in the configuration of the Win 7 and Win 10 OSes for running these stress-tests: Win 7 has no PrimoCache caching tasks; Win 10 caches the boot-system volume to RAM. A separate SSD caching task is dedicated to a non-OS disk that would have no reads or writes for these tests.

It's either Windows 10, the combination of Win 10 and Primo, or it's Primo. Some of the folks on the Romex forum, with some symptoms following the installation of Build 1703, vowed to suspend their use of Primo until their problems had been addressed. But except for stress testing, my Build 1703 was error free and remains so.

To recap -- with Win 10 -- Prime95 sFFT throws an error within 5 minutes and OCCT:CPU throws one in 10 minutes with the VCORE twisted all the way up to 1.42V and 4.7 Ghz. For this Win 7 suite of tests, VCORE loads to 1.408V, and no problem so far.

Temperatures are what I would expect during a warmer season: Maximum Package value under LinX-testing for 4.7 and 1.40V registers 80C, but mostly falls in the 75 to 79C range -- as I would expect. This is at least consistent between Win 7 and Win 10 testing.

We'll see what Prime95 and OCCT prove in the next hour or two, but I'd been this route before, I assure . . .

Without destroying the contents of my Win 10 SSD cache, I can terminate the OS-system-disk's RAM cache in Win 10. If it doesn't make a difference, I'll have to delete that cache temporarily to confirm that Primo figures into these anomalies.

I'll have to get my screenies together to post them, and I'll attempt to do that if it's necessary. But it sure would be interesting to find that the caching software has nothing to do with it. Then -- the finger will point to Win 10, whether Build 1607 or Build 1703. I mentioned that before, as I remember.
 

BonzaiDuck

OK -- Prime95 sFFT chugging along in Win 7, 4.7Ghz, 1.408V -- will go through the hour but confirmed likely to be as stable as before with Win 7 -- by this test and others. We've passed 40 minutes, but the original tests at those settings ran for 4 or 5 hours.

We'll get to the bottom of this. Next to run some controlled tests under Win 10 without the RAM caching.

It still raises the point. If stress-testers long trusted under Win 7 show the comfort of stability in that OS, what do the Win 10 results mean? Why do those results prove consistent with those in Win 7 when the tests are applied to settings for 4.6 GHz? [That's an observation that would make you question the relevance of the Primo software in this mystery.] And how reliable is it to validate the clock settings in 7 and then run those clock settings in 10?

This system has never BSOD'd in either OS for the 4.7 Ghz settings. I had dropped back to stock or 4.5/4.6 a few months ago for installation of my 960 Pro and other things. What does it mean? What's the risk, for not validating my 4.7 settings in Win 10?

I insist on finding out. The folks who develop these stress apps have had plenty of time to tweak them for fully Win 10-compatible and reliable operation.

But it doesn't seem to be "the processor." Doesn't seem to be the appropriate voltage and clock settings. And -- nobody else here that I know of has discovered something like this, but they should have done so by now. I'm suspecting the possible software conflict I'd suggested earlier, since there may only be a handful of PrimoCache users at Anandtech.
 

BonzaiDuck

DELETED: RAM-cache task for boot-system-volume only and other tasks still active.

Long-since past the time it should have crashed under OCCT:CPU, running just fine.

Romex may need to update their software a bit more. But at least it can still be used. I'll have to post it there.

THERE, they will say "Don't over-clock, A[So-and-So-ul!]" Over here, the Primo advocates probably number 1 out of 3 among respondents I'd had who mentioned them. Old Hands of the forum prefer RAM-caching for servers but are hands-off for their personal workstations.

So the suspected Apps together: Windows 10 (most certainly), Primo-Cache, OCCT, LinX and Prime95.

Begging the question: "If I run Primo for 4.7 Ghz tested in Win 7, does its own operation pose a risk? Or do I simply avoid running the stress-test programs when Primo is caching the system disk?"

An obscure and obtuse problem. From a prolix and almost lone poster to the thread he started.

Hey! TL;DR! No problem with that! Everybody here knows I'm a bit . . . strange . . . .
 

krumme

Lol. It is damn complicated what causes it, but it is an instability caused by the OC. I don't know if it would bother me.
But I never went for the last 5% anyway.
It's so edgy that perhaps even changing the power supply could alter the results and, yes, the MB phases as you mention.
 

BonzaiDuck

All I need to do is eliminate caching from the boot-volume where the stress-test program is stored -- to which it also writes, also freeing up some 4GB of RAM. The problem disappears.

It is a nexus of conflicts or problems that include Windows 10, Romex PrimoCache and the testing programs. What does Primo do? It grabs as much RAM for its usage as you want to give it. What do the stress-test programs do? You tell them to grab as much as they can if you want maximum stress and heat. It should be the case that Windows manages RAM so that the two other programs don't interfere with each other.
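To put rough numbers on that theory, here is a minimal sketch of the arithmetic. It is only an illustration: the 16 GB of installed RAM, the 12 GB stress-test request and the 2 GB OS reserve are assumed figures, not values from this system, and real Windows memory management (paging, standby lists) is far more involved than a simple subtraction. Only the ~4 GB cache size comes from the post above.

```python
def ram_headroom_mb(total_mb, cache_mb, stress_mb, os_reserve_mb=2048):
    """Return the RAM left over (in MB) once the cache, the stress test
    and the OS have each taken their share.

    A negative result means the requests add up to more than is installed.
    """
    return total_mb - (cache_mb + stress_mb + os_reserve_mb)


# Assumed figures for illustration only.
TOTAL_MB = 16 * 1024    # hypothetical installed RAM
STRESS_MB = 12 * 1024   # hypothetical "grab as much as you can" request
CACHE_MB = 4 * 1024     # roughly the boot-volume RAM cache mentioned above

with_cache = ram_headroom_mb(TOTAL_MB, CACHE_MB, STRESS_MB)
without_cache = ram_headroom_mb(TOTAL_MB, 0, STRESS_MB)

print("headroom with boot-volume RAM cache:", with_cache, "MB")
print("headroom with that cache deleted:   ", without_cache, "MB")
```

Under those assumptions the two programs together ask for more than the machine has, and deleting the one cache task brings the total back under the limit. Whether that is really what trips the stress tests is still a guess.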

It's gone 2 hrs 40 minutes (+) on OCCT: CPU with Win 10 -- NO RAM-caching of the Win 10 boot-volume. All the other caching tasks are active, but I deleted that one to see what happens. Doesn't miss a lick. Next will be a LinX affinitized test, to see if the numbers look proper -- all the residuals and normalized residuals for each and every iteration should be identical. Then we'll see about Prime95 sFFT.

You are right in your intuition that the OC has something to do with it, since everything seems OK in the 4.6 Ghz profile. But now I'm going back to test each and every thing I can. I'm just very confident now. Anyway, it sure wasn't evident that the OC was causing trouble during a Win 7 boot session, was it? Clean bill of health for 4.6 or 4.7. But Win 7 currently has no caching tasks.
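The combinations tried so far can be laid out as data to see which single change makes the failure go away. This is just a bookkeeping sketch: the four rows summarize results reported in this thread (Prime95 sFFT / OCCT:CPU), while the field names and the helper function are made up for illustration.

```python
FACTORS = ("os", "boot_ram_cache", "ghz")

# Outcomes as reported in the thread so far.
runs = [
    {"os": "Win 7",  "boot_ram_cache": False, "ghz": 4.7, "passed": True},
    {"os": "Win 10", "boot_ram_cache": True,  "ghz": 4.7, "passed": False},
    {"os": "Win 10", "boot_ram_cache": False, "ghz": 4.7, "passed": True},
    {"os": "Win 10", "boot_ram_cache": True,  "ghz": 4.6, "passed": True},
]


def single_factor_fixes(runs):
    """Return the factors whose change, on its own, turned a failing
    configuration into a passing one."""
    failing = [r for r in runs if not r["passed"]]
    passing = [r for r in runs if r["passed"]]
    fixes = set()
    for bad in failing:
        for good in passing:
            changed = [f for f in FACTORS if bad[f] != good[f]]
            if len(changed) == 1:
                fixes.add(changed[0])
    return sorted(fixes)


print(single_factor_fixes(runs))  # ['boot_ram_cache', 'ghz']
```

The Win 7 run differs from the failing one in two factors at once (OS and caching), so by itself it cannot say which of the two matters. A Win 7 run with the boot volume RAM-cached would be the missing row.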

This is what you get from adding extra steps to your mountaineering on K2. But I think I'm almost at the top.
 

BonzaiDuck

Interesting. Run the stress programs from uncached drives with Primo. No telling what might happen with the Intel solutions including the Optane application.

Prime95 [most recent] sFFT ran 3 hours before I terminated it. "0 errs, 0 warns". It would bottom out under severe load at 1.392V, so I bumped up a couple mV so it would show a solid 1.408V.

OCCT:CPU [downloaded today] 3 hrs 30 min to pass and manual termination.

IntelBurnTest [near]-"Maximum" allowing 500 MB of slack in the RAM selection, all residuals identical to the last digit and exponent.

LinX. Heats up the system to the max (80C); throws oddball residuals and keeps on ticking.

So PASS PASS PASS FAIL

I don't think any Skylake honchos are testing with LinX. Are any Skylake users testing with LinX?

All the other tests are limp: Intel's XTU; AIDA64; ROG RealBench as I remember. OK -- I'll try RealBench next.

Something screwy about LinX in Win 10 as opposed to Win 7.

Gotta look into these last few issues. Then I'll have to decide whether this is a winter-time-only 4.7 setting or a 24/7/365 setting. These things are sensitive to temperature. I was certainly told so with printed proclamations from "on high."

At least I've found a setting worth keeping until the LinX problem is resolved. Maybe I shouldn't bother resolving it. All the other tests max out in Package temperature between 75C and 77C. I don't even know if there are any newer versions of LinX. I think I even had trouble finding it on the Web for download last I looked.

Maybe I should shut up. But I trust that 3.5 hours of OCCT:CPU, 25 passes of IBT, and Prime95 for more than 3 hours bode well for 4 or 5 under a certain statistical assumption. This is at least worth saving to Profile and tweaking some more later.

MORE: I apparently failed to configure LinX properly for 4 threads, 4 cores. With that corrected, suddenly a tight grouping of GFLOPS. All residuals identical (but one in the last short set of iterations). More investigation merited. Well -- I should've figured all this stuff out without posting and wasting the time of people who find my stuff TL;DR.
 

BonzaiDuck

This has all seemed to be about a relatively new OS, long-established stress testers, new hardware, and a nexus of program interactions regarding memory management.

But it is now certain that 4.7 GHz is workable with an adaptive turbo VCORE of 1.408V and a BIOS setting between 1.392 and 1.396V. All the other tests make this feasible for 24/7 operation. Only LinX heats things up past 80C with room temperature around 79F. The rest of the programs only push to the mid-70s, as I may have said. BurnTest runs even cooler than that. Everywhere I look, there are people suggesting that LinX is too hot, and unnecessary. But looking for a tight GFLOPS variance is a great shortcut. This will all get tweaked to perfection probably over the next few days or within the week.

I should warn myself when engrossed in this type of activity and looking for answers, not to let my fingers get away from me and post too hastily here. I've become a prolix nuisance.
 

krumme

Your brain works better when talking or writing ;)

It's a matter of definition. An overclock works if it works with the stuff you ask of it, IMO. Whether it's an F1 car, an industrial production machine or a CPU, it doesn't have to work outside that context. The F1 car is 100% stable and it doesn't have to pull a trailer to prove so.
 

BonzaiDuck

I appreciate that, and it points toward my observation that some overclockers are a bit loose and too accepting of their stability achievements. I've read other forums -- Toms or OverClockers for instance -- where people are proclaiming "stability" of a Skylake running 4.7 or 4.8 and Adaptive Turbo VCORE of "only" 1.35V.

But an engineering study -- I can post the graphic I lifted from it -- used a sample of several CPUs and charted the mean stability threshold for them. I used it as a guide. For instance, the sample suggests that a stable voltage for 4.5GHz is 1.282V. I would have questions as to what was meant: Was it the BIOS setting itself? Was it the VID? What was the standard of stability? I proved for myself that the author of the study did some thorough stress-testing. My processor may have fallen short of the several mean values, or it may have exceeded them. By falling short I mean that it might have required more than 1.282, and exceeded would mean a slightly lower voltage. But we conceive of those mean values as part of a bell-shaped statistical distribution.

So I would simply set my VCORE so that the most severely loaded and drooped VCORE shown in stress matched the mean values. Even if I could drop the voltage a bit, I might want to stick with the reported mean of the study's sample just as a margin of comfort. And at every step -- from 4.5 to 4.6 to 4.7 GHz -- all I needed to do was start with the study's mean value for that speed, and I could tweak as much as I wanted and never have a BSOD, with the stress software managing the handling of errors and reporting them.

Even for that, I'd want some test to show that the processor wasn't passing tests while doing its own ECC error correction in the background -- that was the value of affinitized LinX. That's why I had to troubleshoot this problem I had with the stress-testing software in Windows 10 that worked properly in Windows 7. Even with a sample of 10 iterations, if the GFLOPS minimum value shows at 205.3 and the maximum at 207.1, you have some partial indication that the error correction is at a minimum. Sometimes, there will be outliers: you have to set affinity for the Xeon-64 part of the LinX program during the first iteration, so it's likely going to fall outside the range you want to measure. Sometimes background processes slow down the test at this or that iteration. But if you have enough ~206 GFLOPS in succession -- in series -- it raises your confidence in the voltage setting.
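The "tight grouping" check can be written down as a small script. This is a minimal sketch, not anything LinX provides: the function name, the 2 GFLOPS threshold (taken from the "< 2 GFLOPS" rule of thumb earlier in the thread) and the sample numbers are illustrative assumptions, with the values typed in by hand from the LinX results table.

```python
def summarize_linx_run(gflops, residuals, max_spread=2.0, skip_first=True):
    """Summarize one LinX run.

    gflops     -- per-iteration GFLOPS figures, in order
    residuals  -- per-iteration residual strings, exactly as displayed
    skip_first -- drop the first iteration from the spread, since affinity
                  is typically set while it is already running
    """
    samples = gflops[1:] if skip_first and len(gflops) > 1 else gflops
    spread = max(samples) - min(samples)
    return {
        "spread": round(spread, 3),
        "tight": spread <= max_spread,
        # Compare residuals as text so they must match to the last digit.
        "residuals_match": len(set(residuals)) == 1,
    }


# Made-up sample: a slow first iteration, then values between 205.3 and 207.1.
gflops = [198.4, 205.3, 206.2, 207.1, 206.0]
residuals = ["3.1e-10"] * len(gflops)  # placeholder strings, not real output

print(summarize_linx_run(gflops, residuals))
```

With the first iteration skipped, the sample spread is 1.8 GFLOPS, which counts as tight under the assumed threshold. A single mismatched residual string flips `residuals_match` to False, which is the condition worth investigating even when the program keeps running.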
 

BonzaiDuck

Verbose? Yeah. Nuisance, nah. I read your posts whenever I can, but I'll be honest, I tend to skim some of the longer ones, if I don't have the time to read them right away.
You're so kind!

This hinky behavior of Windows 10 really threw me for a loop. Windows isn't managing RAM properly for both my caching program and stress-testers. As soon as I pause a cache or simply delete the caching task, the problem goes away. The problem with LinX arose because I didn't think to set it up properly after I had installed Win 10 as the second OS: "4 cores" and "4 threads." [You may remember that IDontCare had worked all this out.] You then have to Set Affinity in "Details" of Task Manager.
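For anyone who would rather compute the affinity than tick boxes in Task Manager, the 0,2,4,6 selection used earlier in the thread corresponds to a simple bit mask. The sketch below only does the arithmetic. It assumes Windows numbers the two hyper-threads of each core as adjacent logical CPUs (0/1, 2/3, and so on), which is the usual layout on a 4-core/8-thread chip but worth verifying on your own system.

```python
def affinity_mask(logical_cpus):
    """Build an affinity bit mask with one bit set per logical CPU index."""
    mask = 0
    for cpu in logical_cpus:
        mask |= 1 << cpu
    return mask


# One logical CPU per physical core, assuming adjacent hyper-thread numbering.
one_thread_per_core = [0, 2, 4, 6]

mask = affinity_mask(one_thread_per_core)
print(hex(mask))  # 0x55
```

The resulting hex value (55) is the form that mask-based tools expect, for example the `/affinity` switch of the `start` command in cmd, if I remember its syntax correctly.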

All I can say is that everything checks out now. Prime95 sFFT, lFFT and Blend all run happily. The voltage is set to a point where there may be even a margin I can trim back, but the loaded voltage reported is rock-solid on the desired value of 1.408V.

All of this arose with my coincident desire to reset the 4.7 GHz setting and the trouble we had with the Creators Update. Got through all of that. The folks at Romex will need to patch their caching program. Who knows what MS is going to do?

A little uncomfortable with the temperatures @ 4.7 peaking at ~ 81C on the LinX test. The others are fairly limp for heat by comparison -- between 72C and 75C. The OCCT CPU test gives lower than that. IBT is also limp in this regard. All in all -- no disappointments after all with my intentions for this system. I've considered it to be "in progress" for (what?) -- more than six months now.