Performance/Watt comparison

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
Phenom II X4 965 article

Dividing the x264 Transcode FPS by the watts used by each chip in the same test yields some interesting performance/watt results.

i7 965 (31.6/217) 0.146
i7 920 (26.7/203) 0.132
Q9650 (19.9/196.5) 0.101
Q9400 (17.9/179.8) 0.100
X4 965 (20.3/223) 0.091
Q8200 (15.4/170.9*) 0.090
X4 955 (19.1/220) 0.087
Q6600 (15.2/193.3) 0.079
X3 720 (11.8/183.5) 0.064
E7200 (8.4/146.1) 0.057

So the i7 965 is 60% more efficient than the X4 965. The AMD chips barely match the cache-restricted Q8x00 series (and note: I had to use the Q8400 watt number with Q8200 performance because AT didn't standardize the chips they put on these two charts, so there are lots of holes in the lineup if you compare them).
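For anyone who wants to re-run the division, here is a minimal sketch that recomputes the list above. The FPS and watt figures are copied from the post; the dictionary layout and the function name `fps_per_watt` are just illustrative choices.

```python
# (x264 transcode FPS, watts in the same test), copied from the list above.
# The Q8200 row pairs Q8200 FPS with the Q8400 wattage, as noted in the post.
chips = {
    "i7 965": (31.6, 217.0),
    "i7 920": (26.7, 203.0),
    "Q9650": (19.9, 196.5),
    "Q9400": (17.9, 179.8),
    "X4 965": (20.3, 223.0),
    "Q8200": (15.4, 170.9),
    "X4 955": (19.1, 220.0),
    "Q6600": (15.2, 193.3),
    "X3 720": (11.8, 183.5),
    "E7200": (8.4, 146.1),
}


def fps_per_watt(fps, watts):
    """Frames per second delivered per watt drawn."""
    return fps / watts


efficiency = {name: fps_per_watt(fps, watts) for name, (fps, watts) in chips.items()}

# Sort from most to least efficient.
ranked = sorted(efficiency.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(f"{name:8s} {value:.3f} fps/W")

# Relative advantage of the i7 965 over the X4 965.
advantage = efficiency["i7 965"] / efficiency["X4 965"] - 1
print(f"i7 965 vs X4 965: {advantage:.0%} more fps per watt")
```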
 

KingstonU

Golden Member
Dec 26, 2006
1,405
16
81
I don't think anyone will disagree with this. However, wouldn't idle power consumption be more important than load? I mean, what percentage of the day are most people's computers at 100% load? For the most part, the only time that mine is being loaded (and even then not at 100%) is when I'm playing games. There are other odd programs that I use here and there that do so as well, though not so much now that I've finished university.

So I play games maybe 2 hours/day, so let's say 25% if the computer is on 8 hours in a day. Most of the time I don't have my computer on till after I get home from work, so let's say it is on 6 hours in a day and I play for 2 hours. That's 33% at maybe ~80% load, and 66% at idle or near idle.

So to me looking at idle power consumption is more relevant. The only thing I will look at for load is for cooling requirements.

At idle, the X4 965 consumes 2.6 watts more than the i7 920, and 3.2 watts more than the i7 975. AT Link
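The usage split described above (6 hours on, 2 of them gaming) can be turned into a rough daily energy figure. This is only a sketch: the two wattages below are made-up placeholders, not numbers from the AT article, and the helper names are hypothetical.

```python
def daily_energy_wh(idle_w, load_w, hours_on, hours_loaded):
    """Energy per day in watt-hours for a simple two-state usage profile."""
    if not 0 <= hours_loaded <= hours_on:
        raise ValueError("hours_loaded must be between 0 and hours_on")
    hours_idle = hours_on - hours_loaded
    return load_w * hours_loaded + idle_w * hours_idle


def idle_share(idle_w, load_w, hours_on, hours_loaded):
    """Fraction of the day's energy that is spent idling."""
    total = daily_energy_wh(idle_w, load_w, hours_on, hours_loaded)
    return idle_w * (hours_on - hours_loaded) / total


# Hypothetical system wattages, for illustration only.
IDLE_W = 100.0
GAMING_W = 180.0

total = daily_energy_wh(IDLE_W, GAMING_W, hours_on=6, hours_loaded=2)
share = idle_share(IDLE_W, GAMING_W, hours_on=6, hours_loaded=2)
print(f"{total:.0f} Wh/day, {share:.0%} of it at idle")
```

With these placeholder figures a bit over half of the daily energy goes to idling, which is the reason idle draw matters for this kind of usage.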
 

imported_Scoop

Senior member
Dec 10, 2007
773
0
0
I think that's the peak load power measured so it's not completely valid. You need to measure the average consumption and compare it to the performance.
 

aigomorla

CPU, Cases&Cooling Mod PC Gaming Mod Elite Member
Super Moderator
Sep 28, 2005
21,126
3,653
126
Not fair because the i7 has 8 virtual threads, which means more work gets done at the same TDP.

So yeah, the i7 is of course gonna have the highest value..

turn off HT and see what u get.
 

Forumpanda

Member
Apr 8, 2009
181
0
0
I assume the power usage listed is total system power usage.

Total system power used will heavily favor the faster chip.
If a chip is twice as fast but uses twice the power (for the chip), performance will increase by 100% but total system power consumption by less (obviously).
So the performance / total power metric for that system comes out more favorably.
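A small sketch of that effect, with entirely hypothetical numbers: chip B is twice as fast as chip A and draws twice the chip power, and both sit in a system whose other components are assumed to draw a fixed 100 W.

```python
def perf_per_watt(perf, chip_w, rest_of_system_w=0.0):
    """Performance divided by chip power plus any fixed system overhead."""
    return perf / (chip_w + rest_of_system_w)


# Hypothetical figures, chosen only to illustrate the argument.
REST_W = 100.0                        # assumed board, RAM, GPU, drives, PSU losses
chip_a = dict(perf=10.0, chip_w=50.0)
chip_b = dict(perf=20.0, chip_w=100.0)  # twice as fast, twice the chip power

# Measured at the chip, the two are equally efficient.
chip_only_ratio = perf_per_watt(**chip_b) / perf_per_watt(**chip_a)

# Measured at the wall, the fixed overhead is spread over more work for chip B.
system_ratio = (
    perf_per_watt(**chip_b, rest_of_system_w=REST_W)
    / perf_per_watt(**chip_a, rest_of_system_w=REST_W)
)

print(f"chip-only efficiency ratio B/A: {chip_only_ratio:.2f}")
print(f"whole-system efficiency ratio B/A: {system_ratio:.2f}")
```

In this made-up case the chips tie on chip-only efficiency, yet the faster one looks 1.5x better once the fixed overhead is included.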

This of course does apply to real life: a faster chip means you need fewer systems, which means less total power consumption. But taking total system power usage and using that metric to calculate chip efficiency seems off.

I don't see HT as unfair though. That is what the chip does; that would be like saying it's unfair for the i7 that the AMD chips run a higher clockspeed (imo)
 

bradley

Diamond Member
Jan 9, 2000
3,671
2
81
Even servers in data centers aren't designed to run at peak load 24/7. I'd also be interested to see underclocking performance included. In Windows XP, I was able to take my X2 5000+ as low as 700MHz at 0.7V, which made a big difference in power consumption. AMD's Cool 'n' Quiet otherwise would have the CPU stepped up to 1GHz at 1.1V at idle.

Maybe I'm too realistic to ever put much credence in today's reviews and reviewers. They never reflect my needs or the way I use my computer. Just as for some HT or quad-core is an unnecessary feature, even though they might think or be told otherwise. And getting either one for free is a lot different than paying a premium. I just don't believe cost of ownership is that much of a deciding factor between PII vs. C2D vs. i7 at the moment.
 

richierich1212

Platinum Member
Jul 5, 2002
2,741
360
126
Even increasing uncore speed (NB speed) on the Phenom II cpus yields better results (and you don't even have to bump up the voltage really), especially in encoding programs. Why don't reviewers mention that?
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
Originally posted by: KingstonU
However, wouldn't idle power consumption be more important than load? I mean, what percentage of the day are most people's computers at 100% load? For the most part, the only time that mine is being loaded (and even then not at 100%) is when I'm playing games. There are other odd programs that I use here and there that do so as well, not so much since I finished university now.

Originally posted by: Scoop
I think that's the peak load power measured so it's not completely valid. You need to measure the average consumption and compare it to the performance.

Originally posted by: bradley
Even servers in data centers aren't designed to run at peak load 24/7.


^ Tell that to my machines that run F@H 24/7/365. And to all the other folders (and various DC supporters) on the forums here.


Regarding HT - as Forumpanda said, I think it's fair game to leave HT included. HT basically is just a queueing scheme that lets Intel chips crunch through data in a more efficient manner (lines up work ahead of time for each core so there is very little downtime when work is available). And if I recall correctly having HT enabled has a definite impact on power consumption - the cores stay at a higher load level than without it so the power consumption goes up appreciably.
 

Forumpanda

Member
Apr 8, 2009
181
0
0
I also want to add that you can fix the comparison between two chips by charging the faster system with idle power consumption for the time difference between the two systems.

Thus, if you add idle power consumption for however long the faster chip finished the work ahead of the slower one, you will have a better measurement of efficiency.

And as others pointed out, this is why reviews, and especially efficiency reviews such as power usage and encoder comparisons, really aren't done well.
They usually measure different systems doing a different amount of work and call it a comparison. (in this case the different work is that the fast systems don't have to be 'on' for the entire duration of the test).

Where encoder comparisons usually fail is that they take different encoders producing different output and compare them only for speed.
In other words, it is hard making a statistically sound comparison of anything.
 

daw123

Platinum Member
Aug 30, 2008
2,593
0
0
So what would be the best comparative test:

The system would have to be the same or as close as possible between architectures; the only differences between each system (LGA775, LGA1366 and AM3) would be the CPU, RAM and MB. Everything else could be the same.

So, would you use 4GB (2x 2GB modules) of DDR3 (low voltage and normal voltage DIMMs), so the same amount of RAM with the same timings can be used?

The MBs, using Gigabyte as an example to keep them as equivalent as possible, could be:
LGA775: GA-X48T-DQ6 (DDR3 normal voltage RAM).
LGA1366: GA-EX58-UD5 (DDR3 low voltage RAM).
AM3: GA-MA790FXT-UD5P (DDR3 RAM).

Then would you measure each CPU over a specified time to perform a task (such as encoding), so you get the total power output (in watts)? Presumably, this would favour the 'faster CPUs', since they would complete the task quicker and be idle for longer than the slower CPUs.

Or average the total power output of the system for a given task (such as encoding) over the time it takes to complete such a task, so you get a watts per second rating (or equivalent)?

What do you guys think?
 

Forumpanda

Member
Apr 8, 2009
181
0
0
Originally posted by: daw123
Then would you measure each CPU over a specified time to perform a task (such as encoding), so you get the total power output (in watts)? Presumably, this would favour the 'faster CPUs', since they would complete the task quicker and be idle for longer than the slower CPUs.

Or average the total power output of the system for a given task (such as encoding) over the time it takes to complete such a task, so you get a watts per second rating (or equivalent)?
This is backwards, the first comparison is the most fair in terms of overall power usage for a given task, no computers get turned on and off between tasks.
It is actually slightly disfavoring the fastest CPU *in a business environment* where a faster system can mean you have fewer systems (thus less idle time).

The second comparison is the one which favors the faster CPU much more, if you do not count idle power usage then essentially you are giving the fast system an idle power usage of 0 which is obviously not obtainable in a real world situation.
 

daw123

Platinum Member
Aug 30, 2008
2,593
0
0
Originally posted by: Forumpanda
Originally posted by: daw123
Then would you measure each CPU over a specified time to perform a task (such as encoding), so you get the total energy use (in joules)? Presumably, this would favour the 'faster CPUs', since they would complete the task quicker and be idle for longer than the slower CPUs. This is the first test in my example below.

Or average the total power output of the system for a given task (such as encoding) over the time it takes to complete such a task, so you get a joules per second rating (or power rating)? This is the second test in my example below.
This is backwards, the first comparison is the most fair in terms of overall power usage for a given task, no computers get turned on and off between tasks.
It is actually slightly disfavoring the fastest CPU *in a business environment* where a faster system can mean you have fewer systems (thus less idle time).

The second comparison is the one which favors the faster CPU much more, if you do not count idle power usage then essentially you are giving the fast system an idle power usage of 0 which is obviously not obtainable in a real world situation.

I apologise that I used the wrong SI units in my previous post, so I've shown the corrections above in bold.

The first test would include idle energy consumption, otherwise what would be the point.

Therefore, as a couple of examples:

If computer X and Y have a task to complete where:

Computer X has a power use of 140W (or J/s) at load, 70W (or J/s) at idle. (the 'faster CPU / system')
Computer Y has a power use of 120W (or J/s) at load, 50W (or J/s) at idle. (the 'slower CPU / system')
The above assumes that the 'faster CPU /system' uses more power at load and idle than the 'slower CPU / system'

And if computer X takes 5 seconds to complete 'a task' at full load (second test):
Total @ load = 140J/s * 5s = 700J
Average power used = 700J / 5s = 140W

Measured over a duration of time, say 20 seconds (first test):
Total @ load = 700J
Total at idle = 70J/s * (20s - 5s) = 1050J
Total energy used = 700J + 1050J = 1750J

Computer Y takes 10 seconds to complete 'a task' at full load (second test):
Total @ load = 120J/s * 10s = 1200J
Average power used = 1200J / 10s = 120W

Measured over a duration of time, say 20 seconds (first test):
Total @ load = 1200J
Total at idle = 50J/s * (20s - 10s) = 500J
Total energy used = 1200J + 500J = 1700J

Where 1W = 1J per 1s.

The above are arbitrary figures, which I made up. It's interesting that the slower CPU wins in both cases; I better check the maths for cock ups :D
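As a quick check of the arithmetic, here is a minimal sketch that reproduces the two tests with the same made-up figures. The function names are hypothetical; the wattages, task times and the 20-second window are the ones from the example.

```python
def load_energy_j(load_w, task_s):
    """Energy used while the task is running (the 'second test')."""
    return load_w * task_s


def window_energy_j(load_w, idle_w, task_s, window_s):
    """Energy over a fixed window: task at load, remainder at idle (the 'first test')."""
    if task_s > window_s:
        raise ValueError("window must be at least as long as the task")
    return load_w * task_s + idle_w * (window_s - task_s)


WINDOW_S = 20
systems = {
    "X": dict(load_w=140, idle_w=70, task_s=5),   # the 'faster' system
    "Y": dict(load_w=120, idle_w=50, task_s=10),  # the 'slower' system
}

results = {}
for name, s in systems.items():
    at_load = load_energy_j(s["load_w"], s["task_s"])
    over_window = window_energy_j(s["load_w"], s["idle_w"], s["task_s"], WINDOW_S)
    results[name] = (at_load, over_window)
    print(f"Computer {name}: {at_load} J at load, {over_window} J over {WINDOW_S} s")

# Same comparison if X idled at the same 50 W as Y (a further assumption).
x_equal_idle = window_energy_j(140, 50, 5, WINDOW_S)
print(f"Computer X with 50 W idle: {x_equal_idle} J over {WINDOW_S} s")
```

The totals match the figures above (1750 J against 1700 J). Counting only the energy at load, X needs 700 J per task against 1200 J for Y, and with equal 50 W idle power X would total 1450 J, so the outcome depends heavily on the idle figures chosen.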
 

cusideabelincoln

Diamond Member
Aug 3, 2008
3,275
46
91
The best solution is to determine the amount of energy used to perform the specified task. And for this, you need to have the averaged power consumption of the chip/system from beginning until the end of the test, and you'll need to know the exact time it takes to complete the test.
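One way to get that number from a logged power trace is to integrate the samples over time. The sketch below uses the trapezoidal rule; the sample trace is invented for illustration and the function names are hypothetical.

```python
def task_energy_j(samples):
    """Integrate (time_s, power_w) samples with the trapezoidal rule, in joules."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must be strictly increasing")
        energy += 0.5 * (p0 + p1) * (t1 - t0)
    return energy


def average_power_w(samples):
    """Average power over the trace: total energy divided by elapsed time."""
    duration = samples[-1][0] - samples[0][0]
    return task_energy_j(samples) / duration


# Made-up trace: (seconds since start, watts at the wall).
trace = [(0, 100.0), (1, 180.0), (2, 200.0), (3, 200.0), (4, 120.0)]

energy = task_energy_j(trace)
avg = average_power_w(trace)
print(f"task energy: {energy:.0f} J, average power: {avg:.1f} W")
```

Average power times the exact run time gives the same energy figure, so either form can be reported as long as the duration is stated.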

And of course it's also best to test more than one application, as applications can favor one processor or may not fully stress another one. Using multiple applications will give a better picture.

For most reviews, like AT's, I only look at the power consumption numbers as a starting point and as a way to estimate the best fit power supply.
 

drizek

Golden Member
Jul 7, 2005
1,410
0
71
My CPU is idle most of the time. If I feel like folding(it is summer and hot as hell over here, so no) then I use the GPU, which is far more efficient.
 

Leyawiin

Diamond Member
Nov 11, 2008
3,204
52
91
An i7 965 is at least $600 more than an X4 965. Yeah, it should be better. You can cherry pick an app to show a wide discrepancy in perceived value, but gaming is my concern and there is no way I would ever make up the price difference in Intel CPUs and mobo vs. AMD by power savings. The difference in gaming is small and oftentimes in favor of AMD (in comparable classes).
 

Forumpanda

Member
Apr 8, 2009
181
0
0
Well, you gave the slower CPU fairly advantageous idle power consumption. The point is more that all systems generally have near the same idle consumption, so you can stick to measuring the idle time within the testing time, as additional idle time isn't going to affect the results much (not to the point that they become any more or less meaningful).

The take-home point is that when it comes to making a system-to-system comparison of the actual amount of power used, the best we can do is approximate a result, and right now the way it is done in most reviews (measuring max load), or the way people interpret those results (max load * duration), doesn't really give all that meaningful data.

For a single user the example you gave is a good measure. For a business the waters get a bit more muddy because there is additional value in a faster system, as it may mean fewer systems are required; thus we cannot just assume that doing a task faster means more idle time.
So we would have to assign some value to speed, but not accounting for extra idle time if a task is completed faster just leads to meaningless results.

Anyway I'll go back to lurk mode, just wanted to rant on how poorly I feel most power consumption and en- / trans-coder reviews are written, when it comes to these things.
 

imported_Scoop

Senior member
Dec 10, 2007
773
0
0
Originally posted by: Leyawiin
An i7 965 is at least $600 more than an X4 965. Yeah, it should be better. You can cherry pick an app to show a wide discrepancy in perceived value, but gaming is my concern and there is no way I would ever make up the price difference in Intel CPUs and mobo vs. AMD by power savings. The difference in gaming is small and oftentimes in favor of AMD (in comparable classes).

Unless you like multi-GPU setups, which means SLI 'cause unfortunately Crossfire isn't on the same level. And multi-GPU is the reason to have an X58 system.

 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
Originally posted by: Leyawiin
An i7 965 is at least $600 more than an X4 965. Yeah, it should be better. You can cherry pick an app to show a wide discrepancy in perceived value, but gaming is my concern and there is no way I would ever make up the price difference in Intel CPUs and mobo vs. AMD by power savings. The difference in gaming is small and oftentimes in favor of AMD (in comparable classes).

That's not valid reasoning at all. Back in the days of the Prescott chips Intel was higher priced and by no means competitive on power consumption.

Plus - did you even bother to read all the numbers or just look at those two chips? Because the Q9400 also manages to beat the X4 965 by roughly 10% in the same test. Meaning that AMD's chips aren't even really competitive from a power efficiency standpoint with Intel's previous generation chips. Which is why I said they are lagging badly behind.

Anyway, what I did in the OP is simply look at how efficiently the various chips can crunch through a video encoding exercise. That's the only data point AT provided in the article so I have no other benchmarks to consider (and video work has always been one of Intel's strong points).

This does not look at total power consumption but rather just at the efficiency while the chip is doing its work. For total power consumption you guys are right - you gotta consider both the working time as well as the idle time (while the other chips are finishing the work unit). So if you have a work unit that takes your slowest chip 22 minutes to complete and your fastest chip will do it in 12 minutes you have to add 10 minutes of idle power consumption to the fastest chip to calculate total power consumed.

AT published another article in January that is closer to what you guys are looking at - total power consumption. But even in this article they didn't adjust for the idle time of the faster systems versus the slower so the numbers are a bit more spread out than they should be (ie the i7 systems rule by a wide margin because they finish the work much faster than the other machines and AT reports "total power consumed during the benchmark run").
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
While rummaging around for a review I remember reading that addressed your comments I came across this snippet on AutoCad performance and wanted to link to it since that is a semi-routine question here. I was pleasantly surprised to see that monolithic PhII chip really delivering an advantage in AutoCad:

However, when we get to AutoCAD application, Phenom II X4 965 does more than good: it is not just 30% faster than Core 2 Quad of the similar price but even outperforms a more expensive and more technologically advanced Core i7 CPU.

http://www.xbitlabs.com/articl...i-x4-965_10.html#sect0

Now then, about the meat of the thread:

Originally posted by: Denithor
This does not look at total power consumption but rather just at the efficiency while the chip is doing its work. For total power consumption you guys are right - you gotta consider both the working time as well as the idle time (while the other chips are finishing the work unit). So if you have a work unit that takes your slowest chip 22 minutes to complete and your fastest chip will do it in 12 minutes you have to add 10 minutes of idle power consumption to the fastest chip to calculate total power consumed.

AT published another article in January that is closer to what you guys are looking at - total power consumption. But even in this article they didn't adjust for the idle time of the faster systems versus the slower so the numbers are a bit more spread out than they should be (ie the i7 systems rule by a wide margin because they finish the work much faster than the other machines and AT reports "total power consumed during the benchmark run").

If I interpret your post correctly, what you want to see are power-consumption traces like the ones they do at the TechReport, with subsequent energy consumption per work unit analysis.

Check out this page of The Tech Report's article on AMD's Phenom II X4 965 Black Edition processor...they "boil it down" to task energy, a useful metric for then throwing into cost analysis efforts.

Now Techreport only characterizes/reports the task energy profiles with CineBench...but what we'd like to see is the task energy for computing a whole host of applications.

Much as Anand did in this article for Nehalem, which I subsequently reduced to basically the same thing as task energy in this post.

Originally posted by: Idontcare
Thanks, I appreciate the sanity check, so I'm not entirely off-base on this I guess.

Still though I like the new power numbers that Anand published. It's actually quite a nice showing for Nehalem.

Not sure why nobody actually crunches the data into performance/watt metrics anymore, guess it's not sexy enough anymore. It's so 2007.

I went ahead and crunched Anand's data to convert it to performance/watt:

Benchmark                  QX9770 (3.2GHz)      Core i7-965 (3.2GHz)   Improvement
POV-Ray                    11.4 PPS/Watt        17.5 PPS/Watt          53%
Cinebench (1 thread)       20.3 CBMarks/Watt    26.6 CBMarks/Watt      31%
Cinebench (max threads)    61.8 CBMarks/Watt    81.5 CBMarks/Watt      32%
3dsmax 9 SPECapc CPU       0.060 /Watt          0.084 /Watt            41%
x264 HD Encode Test        0.32 fps/Watt        0.44 fps/Watt          38%
DivX 6.8.3                 2.61 Watts           1.84 Watts             29%
Windows Media Encoder      2.01 Watts           1.34 Watts             33%
Age of Conan               0.35 fps/Watt        0.46 fps/Watt          31%
Race Driver GRID           0.30 fps/Watt        0.34 fps/Watt          15%
Crysis                     0.14 fps/Watt        0.16 fps/Watt          15%
FarCry 2                   0.32 fps/Watt        0.42 fps/Watt          34%
Fallout 3                  0.25 fps/Watt        0.37 fps/Watt          45%

Unless I made a mistake in the math the i7 beat the QX9770 in every test. The average percent power consumption reduction per unit of work being done is 33% for the i7 over yorkfield.

Now I am finally seeing the 30-40% power consumption reduction numbers I was expecting once performance is normalized :D Me much happier now!
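A note on reading the 33% figure: a gain in performance per watt and a reduction in energy per unit of work are related but not the same number. A minimal sketch of the conversion, assuming the Improvement column is mostly a perf/watt gain (the DivX and Windows Media Encoder rows look like lower-is-better figures, so their percentages may already be reductions). The function name is hypothetical.

```python
def energy_reduction_from_gain(perf_per_watt_gain):
    """Fractional drop in energy per unit of work for a given perf/watt gain.

    Energy per unit of work is the inverse of work per unit of energy, so a
    gain g in perf/watt scales energy per unit of work by 1 / (1 + g).
    """
    return 1.0 - 1.0 / (1.0 + perf_per_watt_gain)


# Improvement column from the table above, as fractions, in row order.
improvements = [0.53, 0.31, 0.32, 0.41, 0.38, 0.29, 0.33, 0.31, 0.15, 0.15, 0.34, 0.45]

mean_gain = sum(improvements) / len(improvements)
reduction = energy_reduction_from_gain(mean_gain)

print(f"mean of the Improvement column: {mean_gain:.0%}")
print(f"equivalent energy reduction per unit of work: {reduction:.0%}")
```

Under that assumption, a 33% perf/watt gain works out to roughly 25% less energy for the same work.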
 

drizek

Golden Member
Jul 7, 2005
1,410
0
71
That is nice for benchmarking purposes, but the question is, can you actually get an i7/5/3 that is the same speed as a yorkfield but with 33% lower power consumption(at both idle and load)?
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: drizek
That is nice for benchmarking purposes, but the question is, can you actually get an i7/5/3 that is the same speed as a yorkfield but with 33% lower power consumption(at both idle and load)?

By "same speed" do you mean clockspeed or performance?

If you want a Nehalem at the same clockspeed as yorkfield then you are going to get higher performance for about the same power-consumption.

If you want Nehalem at the same performance as yorkfield then you will get lower power consumption for about the same performance.

Speed and idle power-consumption have weird implications to me...personally if my computer is idle, ever, then I bought more computer speed than I needed as I clearly am unable to saturate its computing capability. I buy computers to run 24/7, or I shut them off when I am not using them.

So being concerned with how fast my computer is while it is doing nothing just makes me feel weird to contemplate, but I in no way think everyone feels this way about idle computers. No doubt there are valid reasons to leave your desktop computer running a screensaver instead of slipping into standby or hibernate.
 

drizek

Golden Member
Jul 7, 2005
1,410
0
71
I meant performance. Does it directly translate to 33% lower power consumption at the same performance level, or is nehalem only able to get better performance/watt at the high end?

I am asking because I would rather have a Penryn with 33% lower power consumption in a laptop than a nehalem with the same power use and 33% better performance.
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
The Nehalem chip is probably going to be much more efficient. Remember that with this line Intel more or less perfected the ability to turn off individual cores when not in use (so at idle these machines can be extremely power efficient).

If you look at the source of IDC's data above you will see an interesting number when they compared a QX9770 (3.2GHz) system to an equivalent i7 965 system. The idle power for the Penryn system was 138.7W while the Nehalem machine idled at only 105.5W. IDC did the division for you above concerning how the two stacked up under load (Nehalem winning by 15% in the worst cases, 40% or higher in the best cases).

Now, one other thing that occurred to me while writing this: in a home environment you might encode a movie and then the machine sits idle for a while before you start the next task. In a business environment, however, as soon as the first task is complete it will immediately launch into a second and third and so on. The machine really doesn't have idle time - rather, the true benefit of these chips is that you can get more work done in the same amount of time and therefore you don't need as many machines to get everything done. That is where you save some serious power & therefore cash - when one computer can handle the load you used to have to share among two or three, even if it uses more power than any single former machine you still save power overall.

And if you seriously look at those charts you will see that these chips do both: they get more work done while simultaneously consuming less power (in 11/13 benchmarks this held true).
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: drizek
I meant performance. Does it directly translate to 33% lower power consumption at the same performance level, or is nehalem only able to get better performance/watt at the high end?

I am asking because I would rather have a Penryn with 33% lower power consumption in a laptop than a nehalem with the same power use and 33% better performance.

Well you really can have it either way, it just depends what your reference point is. Lower clocked yorkfields and lower-clocked Nehalems exist (or will soon), as well as their LV and ULV variants...really at that point in the decision tree you are at the mercy of your budget (LV and ULV equals more expense for even better performance/watt) as well as the system integrators actually going to the effort of coupling your ULV chip (penryn or nehalem) with a decent enough battery.

My DELL X200 with a ULV P3 800MHz fetched me ~2.5hrs battery life...years later my DELL Inspiron 1525 with dual-core core2duo chip fetches me the same uninspiring 2.5hr battery life. What I would have liked to have is my existing battery pack paired with something like an Atom (for my purposes) which should fetch me some 8-10hrs battery life.

To get back to your question though, the point is there will be a gamut of SKUs for both penryn and nehalem...so depending on which penryn you compare to which nehalem you will either be looking at equivalent performance but better battery life (provided they don't cut down the battery pack) with the nehalem or you will be looking at equivalent battery life but higher performance on the nehalem.

The difference somewhere in there will be the pricetag.
 

dmens

Platinum Member
Mar 18, 2005
2,275
965
136
Originally posted by: Idontcare
Check out this page of The Tech Report's article on AMD's Phenom II X4 965 Black Edition processor...they "boil it down" to task energy, a useful metric for then throwing into cost analysis efforts.

the method used by techreport is very similar to the principal metric nehalem used to assess architectural changes. task energy directly correlates with power per instruction, which is the most accurate measure of dynamic efficiency.