Scaling down Haswell from 4.8GHz to 3.4GHz.

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
It was brought to my attention that you could limit Haswell's power consumption via the "Processor current Limit" feature on many (most? some?) motherboards.


I also have a Kill-A-Watt, but singling out CPU power consumption with it isn't straightforward. First, you have to account for everything else drawing power. Second, you have to consider PSU efficiency; no PSU has a flat efficiency curve as far as I'm aware.

I spent a lot of time running these, I hope it wasn't a pointless endeavor :hmm:

I will run them again tomorrow and get peak kill-a-watt readings without my 7950s running two monitors and having ULPS disabled.


Method: Nowhere near IDontCare's level of precision. I have neither the knowledge nor the tools he possesses. I did what I could in a way I thought made sense. I could not precisely control temps and thus could not remove their effect on power consumption. In the interest of consistency, only four things changed between runs: the vcore and uncore voltage, and the core and uncore speed. Each run uses an exact 1:1 core-to-uncore ratio, so a 45x core used a 45x uncore. The RAM was kept at 1.5v, 1866 MHz.

For the test, because I wanted to go beyond what would be possible with AVX, I used Prime95 26.6 (In-place Large FFTs). This was the last Prime95 release before they started using AVX. Since we often ignore the performance benefits of AVX, and no legit reviewer uses these power viruses to gather power consumption figures, I also thought it would give a more reasonable comparison to pre-Sandy Bridge systems, namely the first-gen Core i series and Core 2.

To get the amperage I simply used MSI | Intel Extreme Tuning Utility to adjust how much amperage was allowed for the processor while Prime95 ran. I let it run for several loops at each speed to ensure the amperage was high enough to allow full power draw. If I gave it too little, the processor would throttle the clock speed to stay within its allotted power consumption. Too high and it would ruin the point of the test, so I let Prime95 run for about 15 minutes at each speed (hence the time and effort).

Anyways, here is what I have right now.

Haswell Power Consumption

4.8GHz

1.325v X 147.25A = 195.1w


4.7GHz 

1.265v X 128.250A = 162.23w


4.6GHz

1.2v X 109.875A = 131.85w


4.5GHz

1.145v X 96.5A = 110.49w


4.4GHz

1.1v X 85.875A = 94.46w


4.3GHz

1.055v X 76.25A = 80.44w


4.2GHz

1.04v X 71.875A = 74.75w


4.1GHz 

1.02v X 67.75A = 69.1w


4GHz

1.0v X 63.5A = 63.5w


3.9GHz

0.97v X 58.25A = 56.5w


3.8GHz

0.95v X 54.75A = 52.01w


3.7GHz

0.925v X 50.625A = 46.82w


3.6GHz

0.895v X 46.375A = 41.5w


3.5GHz

0.87v X 42.625A = 37.08w


3.4GHz

0.86v X 39.0A = 33.54w
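The figures above are simply P = V x I at each step. A minimal Python sketch that recomputes them from the listed voltage and current-limit values (the names `readings` and `cpu_power` are mine for illustration, not from XTU or any other tool):

```python
# (clock in GHz, vcore in volts, current limit in amps) as listed above
readings = [
    (4.8, 1.325, 147.25),
    (4.7, 1.265, 128.25),
    (4.6, 1.2, 109.875),
    (4.5, 1.145, 96.5),
    (4.4, 1.1, 85.875),
    (4.3, 1.055, 76.25),
    (4.2, 1.04, 71.875),
    (4.1, 1.02, 67.75),
    (4.0, 1.0, 63.5),
    (3.9, 0.97, 58.25),
    (3.8, 0.95, 54.75),
    (3.7, 0.925, 50.625),
    (3.6, 0.895, 46.375),
    (3.5, 0.87, 42.625),
    (3.4, 0.86, 39.0),
]


def cpu_power(volts, amps):
    """Electrical power in watts from voltage and current."""
    return volts * amps


for ghz, volts, amps in readings:
    print(f"{ghz:.1f}GHz  {volts:.3f}v x {amps:.3f}A = {cpu_power(volts, amps):.2f}w")
```

Keep in mind the current here is the limit set in software, and the voltage is the value I set, so this is an estimate rather than a measurement at the socket.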


Please let me know if I wasted my time, because I'm planning to waste more... Save me!
 

BD231

Lifer
Feb 26, 2001
10,568
138
106
Almost there. Just need a hazmat suit some osculating fans and table decorum with high concentrations of lead for the tater tots.
 

Durp

Member
Jan 29, 2013
132
0
0
I knew Haswell sipped power but that result at 3.4GHz is just insanely low.
 

NTMBK

Lifer
Nov 14, 2011
10,525
6,050
136
Nice work! Needs moar graphs, though.

Not sure I agree with your arguments about AVX(2), though. You're using a quarter of the theoretical peak FP performance (128-bit vs 256-bit with FMA), and only half the vector width. I think the same experiment using AVX2 would give some very different results, which would be very interesting to see for comparison!
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Looks awesome Balla!

If you have the temperature data as well then please report that too.

The CPU power consumption will appear to be way too exponentially dependent on clockspeed or voltage if you don't account for temperature.
 

JimmiG

Platinum Member
Feb 24, 2005
2,024
112
106
I knew Haswell sipped power but that result at 3.4GHz is just insanely low.

Very few Haswell chips can do 3.4 GHz at such a low voltage, though. The stock voltage is just under 1.1V, which is why the TDP is specified at 84W. Normally they need 1.2V for 4.2 - 4.4 GHz. Balla has a very special 4670K :p

The Max Turbo Current/Wattage is actually pretty neat. In theory, you could use it as a "soft limit" for your OC in order to fine-tune it for your cooler. For example, you could limit your OC to 150W and then set the Turbo multi to something like 48x. This way, you might get 48x in App 1, which doesn't use AVX, but with App 2 that uses AVX2, the overclock would automatically be limited to e.g. 43X and in App 3 it might be something like 45x. Temp and wattage would be the same in either case.
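To illustrate the "soft limit" idea with numbers already in this thread, here is a rough Python sketch that takes Balla's non-AVX Prime95 26.6 figures and looks up the fastest clock that fits under a given wattage cap. The function name `highest_clock_under` is hypothetical, and this is only a table lookup: the real limit is enforced by the CPU itself, and an AVX2 workload would have a different (higher) power table that nobody has measured here.

```python
# Non-AVX Prime95 26.6 figures from the first post: clock (GHz) -> watts
measured_watts = {
    4.8: 195.1, 4.7: 162.23, 4.6: 131.85, 4.5: 110.49, 4.4: 94.46,
    4.3: 80.44, 4.2: 74.75, 4.1: 69.1, 4.0: 63.5, 3.9: 56.5,
    3.8: 52.01, 3.7: 46.82, 3.6: 41.5, 3.5: 37.08, 3.4: 33.54,
}


def highest_clock_under(cap_watts, table):
    """Return the fastest clock whose measured power fits under the cap, or None."""
    fitting = [ghz for ghz, watts in table.items() if watts <= cap_watts]
    return max(fitting) if fitting else None


print(highest_clock_under(150, measured_watts))  # 4.6
print(highest_clock_under(84, measured_watts))   # 4.3
```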
 

zir_blazer

Golden Member
Jun 6, 2013
1,266
586
136
Assuming the results are reproducible and somewhat accurate, what you're showing is that Haswell scales EXTREMELY WELL under undervolting conditions. I think that at 3.4 GHz its power consumption is around half of what Idontcare's Sandy Bridge drew. 3.4 GHz looks to be a sweet spot considering how it scales up from there, and should be enough for nearly all everyday tasks. I suppose fanless operation should also be possible.
To think that when I did this Thread most people talked about how efficient Haswell already was without manual tweaking - NOT.

However, regarding this...

To get the amperage I simply used MSI | Intel Extreme Tuning Utility to adjust how much amperage was allowed for the processor while Prime95 ran. I let it run for several loops at each speed to ensure the amperage was high enough to allow full power draw. If I gave it too little, the processor would throttle the clock speed to stay within its allotted power consumption. Too high and it would ruin the point of the test, so I let Prime95 run for about 15 minutes at each speed (hence the time and effort).

Can't you leave the amperage unlimited, at max processor draw, or something? What happens if you set a limit that is too high? Supposedly, it should draw just the power it is going to use, not more. Though I'm not sure what tinkering you can do with Haswell.
 

Yuriman

Diamond Member
Jun 25, 2004
5,530
141
106
Here are some of my Ivy Bridge results which I never posted. This is from before I fully delidded, and my range of values isn't nearly as large as yours. Additionally, I was using AVX instructions when loading my chip, and was using IETU's "wattage" slider to find where it throttles. I suspect it calculates the wattage from the requested voltage, and not the actual voltage.

4.3Ghz - 1.168v (+0mv) - 56c Linpack loaded

4.4Ghz - 1.192v (+31mv) - 67c Linpack loaded - ~92w

4.5Ghz - 1.240v (+82mv) - 76c Linpack loaded - ~104w

4.6Ghz - 1.296v (+140mv) - 80c Linpack loaded - ~120w
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,250
136
Balla....I also like to play around with baby overclocks. I'm currently doing something similar but using Linpack 11 to get more realistic results using the advanced feature set of Haswell. Might take me another day or so to complete my findings.


I want to see this also. My 4670k sips voltage at the lower clocks and can still run Linpack 11.

Why the Redline stability testing?....Ebay joke :)
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
Thanks guys, glad it wasn't a wasted effort!


Looks awesome Balla!

If you have the temperature data as well then please report that too.

The CPU power consumption will appear to be way too exponentially dependent on clockspeed or voltage if you don't account for temperature.


What would you report, peak on one core, or avg across all or avg on the package, or peak on the package? ^_^

I wish I had more direct control over the fan to lock in exactly the temp I wanted and the fans would ramp up or slow down based on where they were at the target, but alas I do not :(


Temp variance from 3.4 to 4.8GHz is about 40C, which I know is huge when comparing power consumption, I just don't know what to do about it since the power consumption at 3.4GHz is so much easier for my $50 cooler to handle than what is presented to it at 4.8GHz.

_____________________


I thought there might be some legitimacy concerns about my voltage/clock ratios, so I let it prime at 3.4GHz (which I thought would draw the most skepticism due to its very low voltage and near-stock speed).

[Image: stress test screenshot (stress_zpsf9f81a38.png)]


8 hours 40 minutes 0 errors 0 warnings, longest I've stress tested Haswell since I got it >.<
___________________________

On the topic of AVX vs standard: I'm torn, personally. On the one hand, I know I'm not using the full fury of the processor, basically fusing int/fp together to create an octo 128bit chip with simply amazing performance. On the other hand, I know even non-AVX is crushing everything else I do with my PC as far as strain and temps go.

For instance I run 4.8GHz in BF3 with vsync off with my two 7950s overclocked and the bottleneck is always on the processor and it's always running between 91 and 97 percent usage, max temps after hours of gaming end up in the mid to low 60s, however in prime even without AVX I'm reaching the 80C range. Even programs like Handbrake which have started to use AVX(2) and FMA3 don't get nearly as warm as this non AVX run in cache Prime does.

Nobody cares that Haswell gets 200 GFLOPS in a linpack whereas Nehalem gets 50 GFLOPs or Westmere gets 85.

Most reviews use apps like x264 or Cinebench to get "real world" multicore power usage; using Prime95 Large FFTs already exceeds what you see in reviews.
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,250
136
I agree somewhat on the avx extension testing. Last I googled for avx supported apps it didn't seem to find much other than stress testing apps. Will change in the future I guess.

The nice thing about Linpack 11 or similar is it gives a worst case scenario as far as stability.
 

tential

Diamond Member
May 13, 2008
7,348
642
121
So I save a Lot of power by scaling down my chip to 3.4 ghz.

The question is power to performance ratio. I'd have liked to see 4.8 ghz handbrake vs all the way down to 3.4 ghz handbrake (or some other program).

Also some numbers below 3.4 ghz. I mean, for everyday tasks, you could easily get away with 2-2.4 ghz. I'm running a 2.2 ghz Penryn and don't notice much of a difference compared to using my 4770k. My 4770k pretty much sits idle all day though. I haven't used it in weeks.
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
So I save a Lot of power by scaling down my chip to 3.4 ghz.

The question is power to performance ratio. I'd have liked to see 4.8 ghz handbrake vs all the way down to 3.4 ghz handbrake (or some other program).

Also some numbers below 3.4 ghz. I mean, for everyday tasks, you could easily get away with 2-2.4 ghz. I'm running a 2.2 ghz Penryn and don't notice much of a difference compared to using my 4770k. My 4770k pretty much sits idle all day though. I haven't used it in weeks.

Yes, with any uarch the performance per watt is going to progressively decrease the higher you raise the clock speed.

I'm not going to find the wattage value for Handbrake, because I know it will fluctuate during the run due to its use of AVX(2) and FMA3. I can however confirm that both 3.4GHz and 4.8GHz use either the same, or less power than the Prime95 runs.


I tested 3.4GHz and 4.8GHz with the same setup we used in the Handbrake thread, however unlike there my RAM isn't overclocked so my results are lower than what I can achieve with the ram tweaked.


3.4GHz

[17:47:33] work: average encoding speed for job is 221.014725 fps

Estimated perf/w 6.589 FPS per Watt.


4.8GHz

[17:55:25] work: average encoding speed for job is 306.442535 fps

Estimated perf/w 1.57 FPS per Watt.


Pretty huge swing, roughly 38.7% more performance was quite costly on the power side :D
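For anyone who wants to redo the perf/W arithmetic, here is a small Python sketch. It divides the Handbrake fps by the Prime95 V x I wattage from the first post, which is an assumption on the power side: Handbrake drew the same or less than Prime95, so the real FPS per watt is at least this good. The names `runs` and `fps_per_watt` are just for illustration.

```python
# Handbrake fps from the log lines above, paired with the Prime95 V x I
# wattage from the first post (an upper estimate for Handbrake's draw)
runs = {
    "3.4GHz": {"fps": 221.014725, "watts": 33.54},
    "4.8GHz": {"fps": 306.442535, "watts": 195.1},
}


def fps_per_watt(run):
    """Encoding speed divided by estimated CPU power."""
    return run["fps"] / run["watts"]


speedup = runs["4.8GHz"]["fps"] / runs["3.4GHz"]["fps"] - 1
power_ratio = runs["4.8GHz"]["watts"] / runs["3.4GHz"]["watts"]

for name, run in runs.items():
    print(f"{name}: {fps_per_watt(run):.3f} FPS per watt")
print(f"Speedup: {speedup:.1%} for {power_ratio:.2f}x the power")
```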
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Yes, with any uarch the performance per watt is going to progressively decrease the higher you raise the clock speed.

For the CPU alone, that is true. But the performance/watt for the entire system (which must be powered on and operating in order for your CPU to pump out those frames) results in lower FPS/W at the low end as well.

[Image: GFlops per Watt versus clockspeed (GFlopsperWattversusClockspeed.png)]


If you want to specifically discuss the nuances of a microarchitecture itself then it is reasonable to ignore the power consumed by the platform as far as your power company is concerned.

If you are comparing Haswell to other processors, or comparing Haswell to other Haswells at different clockspeeds for performance/W metrics, then you really ought to be including all the juice that is going into enabling all that performance to exist in the first place (which is the system power).

What would you report, peak on one core, or avg across all or avg on the package, or peak on the package? ^_^

I wish I had more direct control over the fan to lock in exactly the temp I wanted and the fans would ramp up or slow down based on where they were at the target, but alas I do not :(

Temp variance from 3.4 to 4.8GHz is about 40C, which I know is huge when comparing power consumption, I just don't know what to do about it since the power consumption at 3.4GHz is so much easier for my $50 cooler to handle than what is presented to it at 4.8GHz.

Just use the peak temperature of the hottest core during the run. Since static power depends exponentially on temperature, this gives an upper limit that turns out to be remarkably close to the average result anyway.

If you average the temperatures, you'll underestimate the total leakage losses, and by a larger margin than you would overestimate them by using the peak temperature of the hottest core.
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
That's how I was looking at it (cpu only).

I think when I discussed doing this with you many months ago you suggested I use fixed clock speed and fixed voltage to get the idle and load power consumption. Is that still your thought process?

Do I disable things like C7 CStates as well?
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
Kill-A-Watt version added to initial Amperage test, some conflicting numbers present though especially towards the top.

All power saving features were disabled. The core idled at full speed and with full voltage, no downclocking or undervolting at idle.

Single 7950 used.

4.8GHz

1.325v X 147.25A = 195.1w
Idle: 89.3w Load: 196w
Max Core Temp: 82C


4.7GHz

1.265v X 128.250A = 162.23w
Idle: 85.3w Load: 176w
Max Core Temp: 74C


4.6GHz

1.2v X 109.875A = 131.85w
Idle: 81.8w Load: 158w
Max Core Temp: 66C


4.5GHz

1.145v X 96.5A = 110.49w
Idle: 76.7w Load: 145w
Max Core Temp: 61C


4.4GHz

1.1v X 85.875A = 94.46w
Idle: 74.8w Load: 135w
Max Core Temp: 58C


4.3GHz

1.055v X 76.25A = 80.44w
Idle: 73w Load: 127w
Max Core Temp: 55C


4.2GHz

1.04v X 71.875A = 74.75w
Idle: 72w Load: 123w
Max Core Temp: 54C


4.1GHz

1.02v X 67.75A = 69.1w
Idle: 71.4w Load: 119w
Max Core Temp: 52C


4GHz

1.0v X 63.5A = 63.5w
Idle: 70.6w Load: 115w
Max Core Temp: 51C


3.9GHz

0.97v X 58.25A = 56.5w
Idle: 69.5w Load: 110w
Max Core Temp: 49C


3.8GHz

0.95v X 54.75A = 52.01w
Idle: 68.6w Load: 107w
Max Core Temp: 47C


3.7GHz

0.925v X 50.625A = 46.82w
Idle: 68w Load: 100w
Max Core Temp: 44C


3.6GHz

0.895v X 46.375A = 41.5w
Idle: 67.1w Load: 97.3w
Max Core Temp: 43C


3.5GHz

0.87v X 42.625A = 37.08w
Idle: 66.1w Load: 94.5w
Max Core Temp: 42C


3.4GHz

0.86v X 39.0A = 33.54w
Idle: 52.6w Load: 91.9w
Max Core Temp: 41C
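One way to look at the conflicting numbers is to put the wall-side load-minus-idle delta next to the V x I estimate. A rough Python sketch using only the figures listed above (the names `rows` and `wall_delta` are mine). Note the assumptions: the wall readings include PSU losses and everything else in the system, and with power saving disabled the idle reading already contains the CPU idling at full clock and voltage, so the delta is not the CPU's load power.

```python
# (GHz, V x I estimate in watts, wall idle in watts, wall load in watts)
rows = [
    (4.8, 195.1, 89.3, 196), (4.7, 162.23, 85.3, 176), (4.6, 131.85, 81.8, 158),
    (4.5, 110.49, 76.7, 145), (4.4, 94.46, 74.8, 135), (4.3, 80.44, 73, 127),
    (4.2, 74.75, 72, 123), (4.1, 69.1, 71.4, 119), (4.0, 63.5, 70.6, 115),
    (3.9, 56.5, 69.5, 110), (3.8, 52.01, 68.6, 107), (3.7, 46.82, 68, 100),
    (3.6, 41.5, 67.1, 97.3), (3.5, 37.08, 66.1, 94.5), (3.4, 33.54, 52.6, 91.9),
]


def wall_delta(idle_watts, load_watts):
    """Extra power drawn at the wall when going from idle to load."""
    return load_watts - idle_watts


for ghz, vi_watts, idle, load in rows:
    delta = wall_delta(idle, load)
    print(f"{ghz:.1f}GHz  V x I: {vi_watts:6.2f}w  wall delta: {delta:6.1f}w  "
          f"difference: {vi_watts - delta:+6.1f}w")
```

At the top end the V x I figure is far larger than the wall delta, and at the bottom it is smaller, which is the conflict mentioned at the start of this post.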
 

Durp

Member
Jan 29, 2013
132
0
0
It doesn't look like there's a "sweet spot" for power consumption to performance ratio. The lower the clock speed the better it seems.

After 4.3GHz the power consumption starts to skyrocket (not really, just compared to the lower results), so that might be considered a good place to stop and still have a significant increase in performance.

Thanks for sharing your results.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
That's how I was looking at it (cpu only).

I think when I discussed doing this with you many months ago you suggested I use fixed clock speed and fixed voltage to get the idle and load power consumption. Is that still your thought process?

Do I disable things like C7 CStates as well?

Kill-A-Watt version added to initial Amperage test, some conflicting numbers present though especially towards the top.

All power saving features were disabled. The core idled at full speed and with full voltage, no downclocking or undervolting at idle.

For the purposes of these tests you should leave the power saving features all enabled.

It will allow you to better isolate and ascertain the non-CPU power consumption for the system components which will basically be like a fixed background offset.

It doesn't look like there's a "sweet spot" for power consumption to performance ratio. The lower the clock speed the better it seems.

That only appears to be the case because Balla isn't testing clockspeeds any lower than 3.4GHz.

You can see in my post above that even with Sandy Bridge things looked great down to 3.4GHz.

It is only as you go lower and lower in clockspeed that the performance/watt rolls off again because no matter how little power the CPU consumes, the background power being used by the GPU/RAM/Mobo/PSU/etc will start pulling down your overall system efficiency.

If you pay for the electricity, and you want to maximize your ROI, then you want to find that peak efficiency and it won't be as simple as "lower clocks are better".

And of course operating temperature matters too, so a cheaper less efficient air cooler (stock HSF) will give you a different curve with a different optimal clockspeed versus say an NH-D14 or water cooler.
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
Can you figure out the ROI on these clocks for me IDC?

System power consumption with all power savings features enabled and a 7950 installed is 52.6w, it doesn't matter what I do after that for clocks it's always 52.6w, which also means my 7950 uses about 10w more than the HD 4600 on the desktop at idle. This is with C7s enabled, it doesn't matter if it's reading 1.35v idle or .5v idle, power consumption doesn't change. I dunno what it's doing, but according to RealTemp it's constantly in C7 state, so my guess is it's aggressively power gating and perhaps even parking/turning off cores at idle.


We should then assume that 52.6w is platform fixed rate, correct?


Anyways, here are three results using PoV Ray.

2.8GHz (Power Savings Enabled)

Idle 52.6w Load: 75w
Max Core Temp: 38C
PoV 768 .3 Benchmark: 10m59s


4.3GHz (Power Savings Enabled)

Idle: 52.6w Load: 116w
Max Core Temp: 54C
PoV 768 .3 Benchmark: 7m09s


4.8GHz (Power Savings Enabled)

Idle: 52.6w Load: 178w
Max Core Temp: 80C
PoV 768 .3 Benchmark: 6m21s
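As a rough take on the ROI question, the three results above can be turned into energy per benchmark run (watts times seconds). This Python sketch is only that arithmetic, under the assumption that the Kill-A-Watt load reading held steady for the whole run; the names `results`, `wall_wh` and `above_idle_wh` are hypothetical. It reports both the total wall energy and the energy above the 52.6w idle floor.

```python
IDLE_WATTS = 52.6  # system idle with power savings enabled, from this post

# (GHz, wall load in watts, benchmark time in seconds)
results = [
    (2.8, 75, 10 * 60 + 59),
    (4.3, 116, 7 * 60 + 9),
    (4.8, 178, 6 * 60 + 21),
]


def wall_wh(load_watts, seconds):
    """Total energy at the wall for one run, in watt-hours."""
    return load_watts * seconds / 3600


def above_idle_wh(load_watts, seconds, idle_watts=IDLE_WATTS):
    """Energy beyond what the idle system would have used anyway, in watt-hours."""
    return (load_watts - idle_watts) * seconds / 3600


for ghz, load, seconds in results:
    print(f"{ghz:.1f}GHz: {wall_wh(load, seconds):.2f} Wh total, "
          f"{above_idle_wh(load, seconds):.2f} Wh above idle")
```

Counting the whole system, 2.8GHz and 4.3GHz come out nearly identical per run, while 4.8GHz costs noticeably more. Counting only the energy above idle, the lowest clock wins clearly.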




JonnyGuru review of the HX850 Gold

[Image: chart from the HX850 review (hx850_zps34f28439.png)]

[Image: chart from the HX850 review (hx850g_zpsac1015b8.png)]
 

Sheep221

Golden Member
Oct 28, 2012
1,843
27
81
For the purposes of these tests you should leave the power saving features all enabled.

It will allow you to better isolate and ascertain the non-CPU power consumption for the system components which will basically be like a fixed background offset.



That only appears to be the case because Balla isn't testing clockspeeds any lower than 3.4GHz.

You can see in my post above that even with Sandy Bridge things looked great down to 3.4GHz.

It is only as you go lower and lower in clockspeed that the performance/watt rolls off again because no matter how little power the CPU consumes, the background power being used by the GPU/RAM/Mobo/PSU/etc will start pulling down your overall system efficiency.

If you pay for the electricity, and you want to maximize your ROI, then you want to find that peak efficiency and it won't be as simple as "lower clocks are better".

And of course operating temperature matters too, so a cheaper less efficient air cooler (stock HSF) will give you a different curve with a different optimal clockspeed versus say an NH-D14 or water cooler.
Technically, "lower clocks are better" holds true. Running at a higher frequency increases performance but also dissipates much more energy as heat, and resistance rises along with it.
Short of superconductivity, higher frequencies bring faster performance but also more wasted energy, which is the opposite of efficiency. This is especially visible in overclocking, where OC'd CPUs offer a 10-15% performance increase while their TDP is doubled or tripled compared to stock settings.
The OP is probably trying to find a good P-state level where his CPU offers the best performance for a lower power draw.
I would advise the OP to try going below stock settings, where raw performance changes only slightly but power draw and temps drop very quickly. The peak efficiency also depends on whether he delidded or not.

52.6W is not fixed in the strict sense. It's because the CPU runs in a 1.6 GHz idle P-state; if you OC and are not doing anything, you still get 1.6 GHz and the same power draw because the OC frequency has not been triggered yet.
 

jacktesterson

Diamond Member
Sep 28, 2001
5,493
3
81
So I save a Lot of power by scaling down my chip to 3.4 ghz.

The question is power to performance ratio. I'd have liked to see 4.8 ghz handbrake vs all the way down to 3.4 ghz handbrake (or some other program).

Also some numbers below 3.4 ghz. I mean, for everyday tasks, you could easily get away with 2-2.4 ghz. I'm running a 2.2 ghz Penryn and don't notice much of a difference compared to using my 4770k. My 4770k pretty much sits idle all day though. I haven't used it in weeks.


For everyday usage, I don't notice any difference using my Dell laptop (i5-580m at 2.63 GHz) vs my desktop (FX-8320 @ 4.3 GHz). Both are running Intel 320 SSDs too.