Discussion PES | Assessing Power and Performance Efficiency of x86 CPU architectures


BorisTheBlade82

Senior member
May 1, 2020
680
1,069
136
Dear Community,

so this is my first thread here as a long-time lurker - but I felt the desire to share a small hobby-project of mine from the last couple of months with you...

Performance Efficiency Suite - What is it about?
Most reviewers focus solely on what they consider the most important aspect of modern CPUs: absolute performance. But this is only one side of the equation. Today power efficiency is at least as important, or to be more precise: the amount of energy (watt-seconds or joules) a CPU needs to complete a given workload. Sadly, most reviewers shy away from going the extra mile needed to assess this aspect. This suite measures the Total Package Power of a CPU while running the Cinebench R23 benchmark, first in single-threaded mode (1 run), then in multi-threaded mode (for 10 minutes plus whatever it takes to finish the last run). The results are rendered in the provided Results.xlsx Excel file. To combine efficiency and performance, there is also a score called the Performance Efficiency Score (how amazingly inspired I am ;)).

In the meantime I was able to aggregate more than 80 samples from members of the 3DC & CB communities (see below).

How-To
  1. Unzip the latest release to wherever you want EXCEPT on your local OneDrive folder.
  2. Open Settings.txt and insert your local Cinebench23 Directory.
  3. Run PES Start. It will ask for Administrator rights, as these are needed for measuring Package Power.
  4. Wait until the PowerShell window finishes.
  5. Open the Excel file...
  6. Allow external connections (to the generated CSV-files with the data)
  7. Go to Data -> Refresh all
  8. Enjoy and share your results - just take a screenshot of what the Excel renders.
  9. If you want to do multiple measurements with different settings just copy the Excel file (inside the root-folder) before running and refreshing the data.

Some explanations about the Suite
  • This Suite has been made possible by Michael Möller and his amazing free and open-source Open Hardware Monitor and his .NET Library OpenHardwareMonitorLib.dll - Thanks a lot!!!
    Homepage: https://openhardwaremonitor.org/
    GitHub: https://github.com/openhardwaremonitor
  • The results for the Package Power look pretty accurate compared to the sparse data available on the internet. It seems that the vendors are much more honest with those sensors than they are with temperature etc.
  • The suite basically consists of some PowerShell scripts and an Excel file for presentation purposes
    • RunAsAdminWrapper.ps1
      This is needed to have a convenient relative path shortcut in the root folder and request admin-rights at the same time
    • Main.ps1
      • After some setup, it starts the Cinebench R23 runs one at a time and then checks whether the "Cinebench.exe" process is active.
      • While it is, the script queries the Package Power sensor with at least 10 ms between queries (in order to keep the CPU load of the script at bay).
      • After each run the acquired data gets pushed to CSV files located in the LogCsv subfolder.
    • Results.xlsx
      • The Excel file basically just does some import, calculations and a hopefully nice presentation of the data.
      • Histogram
        The bold line shows a running average of the last 100 data-points which should be sufficiently accurate. The pale line shows each single data-point.
      • Calculation of Total Package Consumption
        To get that number we need the integral of power over time. That is why we first calculate the time between two consecutive data-points, multiply it by the measured power, and then sum up the results.
      • Everything else in that Excel file is hopefully more or less self-explanatory
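
For anyone who wants to check the Excel numbers by hand, the two calculations described above (the running average for the bold line and the integral for the total consumption) can be sketched in a few lines of Python. This is only an illustration and not the code of the suite: the function names are made up, timestamps are assumed to be in seconds and power in watts, and taking the power value at the end of each interval is an assumption, since the Excel file may pick the other end.

```python
from collections import deque


def total_energy_joules(timestamps_s, power_w):
    """Approximate the integral of power over time with a rectangle sum.

    Each interval between two consecutive samples is multiplied by the
    power measured at the end of that interval (an assumption).
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    energy = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]  # seconds between samples
        energy += dt * power_w[i]                   # watts * seconds = joules
    return energy


def running_average(values, window=100):
    """Average of the last `window` samples (fewer at the very start)."""
    averages = []
    recent = deque(maxlen=window)
    total = 0.0
    for value in values:
        if len(recent) == window:
            total -= recent[0]  # this sample is about to drop out of the window
        recent.append(value)
        total += value
        averages.append(total / len(recent))
    return averages


if __name__ == "__main__":
    # Made-up samples, only to show the call pattern.
    t = [0.0, 0.5, 1.0, 1.5]
    p = [20.0, 30.0, 30.0, 40.0]
    print(total_energy_joules(t, p))
    print(running_average(p, window=2))
```

With samples roughly 10 ms apart, the choice between the left and right end of each interval should hardly matter for a run that lasts several minutes.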

Online Resources

Disclaimer
I am by no means a PowerShell professional or a professional reviewer. I was just sick of the lack of information and wanted to propose a low-effort solution. Any input for further improvement is very welcome. Please feel free to use/extend/rip-off this solution as you wish. But please share your findings with the world.
 

Timur Born

Senior member
Feb 14, 2016
300
154
116
CPU: 13900K !undervolted! + 1x 6.0 GHz single-core OC + 253W power limit (+400 A current limit) | Motherboard: Gigabyte Z790 Aero G | GPU: RTX 4090 | RAM: DDR5 Patriot 6000-36 at 5600 CR1T | Hard Drive: Adata SX8200 Pro

These are my current undervolted settings after re-enabling an AVX offset of -1. Cinebench does *not* trigger this offset (almost no real-world load does), so single-core uses 6.0 GHz and all-core uses 5.5/4.3 GHz (like stock). Low core loads between single-core and all-core use various ratios that are not caught by this kind of test (Turbo ratios: 60/59/57/57/56/56/55/55). An all-core load that triggers the AVX offset now uses only 5.4 GHz on my cores 4+5, while the rest stay at 5.5 GHz. The score is a bit lower than the last time I posted these, so my voltages/droop likely changed a little bit.

1703463210102.png
1703463644225.png
 

mmaenpaa

Member
Aug 4, 2009
91
158
106
AMD Ryzen 5 8600G 5.05GHz AM5 6C/12 65W BOX
KF560C36BWEAK2-32 (EXPO 6000, 2*16GB, 1RX8 36-38-38 1.35V)
Asrock B650M Pro RS (bios 2.08.AS01)
(it seems that BIOS 2.08 has just arrived, which updates AGESA to ComboAM5 1.1.0.2b; I'll update and rerun when I have time)
Windows 11 Pro

1709828996572.png

1709829043315.png
EDIT:
Bios 2.08, ST CB seems better?


1709841028444.png


1709841059141.png
 

BorisTheBlade82

Senior member
May 1, 2020
680
1,069
136
Hey @mmaenpaa, thanks a lot for the contribution as always. The ST score especially comes as a negative surprise. Did you use CB23, or were you using CB24 by accident?
 

BorisTheBlade82

Senior member
May 1, 2020
680
1,069
136
Wow, that is surprising. So going by the numbers, Phoenix uses almost twice as much energy for the ST run as a 5600G Cezanne (18000 Joules vs. 10000 Joules). Even taking into account higher frequency and performance that is a far cry considering the node jump between the two.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
@BorisTheBlade82 I've run the v0.8 edition of the script that's available on GitHub, and my results indicate that performance per watt in isolation is completely meaningless without specifying the power level at which you are measuring the performance. Your efficiency score calculation method captures this fact.
 

moinmoin

Diamond Member
Jun 1, 2017
5,064
8,032
136
Wow, that is surprising. So going by the numbers, Phoenix uses almost twice as much energy for the ST run as a 5600G Cezanne (18000 Joules vs. 10000 Joules). Even taking into account higher frequency and performance that is a far cry considering the node jump between the two.
Much of it may well be the (hopefully only initial, with optimizations down the road) platform power usage of AM5, using DDR5 etc. In that regard it would be nice to have mobile results for Phoenix for comparison.
 
Jul 27, 2020
20,039
13,737
146
Wow, that is surprising. So going by the numbers, Phoenix uses almost twice as much energy for the ST run as a 5600G Cezanne (18000 Joules vs. 10000 Joules). Even taking into account higher frequency and performance that is a far cry considering the node jump between the two.
EXPO cranks up voltages. Hope @mmaenpaa can run the test again at stock DDR5-5200 with curve optimizer.
 

BorisTheBlade82

Senior member
May 1, 2020
680
1,069
136
@BorisTheBlade82 I've run the v0.8 edition of the script that's available on GitHub, and my results indicate that performance per watt in isolation is completely meaningless without specifying the power level at which you are measuring the performance. Your efficiency score calculation method captures this fact.
I am not sure I understand what you are trying to say. If you want to say that one and the same CPU can have highly different scores depending on the applied wattage, then yes, you are right.
So is this a statement about some kind of fairness? Do you suggest rankings for certain wattage areas?
In the end, that whole thing was started by me out of curiosity and not with some agenda in mind.
 

BorisTheBlade82

Senior member
May 1, 2020
680
1,069
136
Much of it may well be the (hopefully only initial, with optimizations down the road) platform power usage of AM5, using DDR5 etc. In that regard it would be nice to have mobile results for Phoenix for comparison.
This is only guesswork, but I cannot believe the DDR5 controller to have such an impact. Take the R5 7600X for example. I guess we can all agree that the IOD will have a much bigger uncore consumption than Phoenix should have. And even that one only consumes 11 kJ. This looks very fishy and I agree with you that a Mobile Phoenix sample might be of help.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
I am not sure I understand what you are trying to say. If you want to say that one and the same CPU can have highly different scores depending on the applied wattage, then yes, you are right.
So is this a statement about some kind of fairness? Do you suggest rankings for certain wattage areas?
In the end, that whole thing was started by me out of curiosity and not with some agenda in mind.
What I'm saying is that performance per watt - where you measure the performance in dimensions of 1/time and divide it by the average power consumption during the measurement interval - does not necessarily tell you whether processor A is more efficient than processor B, unless you also specify the power level at which you are doing the measurement.

This is because the performance-power curves of A and B, when overlaid on the same graph, will have at least one intersection somewhere in the range of power under consideration.

To illustrate, instead of this:
1710066707834.png
it is much more likely that you will have this:
1710066743085.png
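
The crossover argument can also be put into numbers with a small Python sketch. Both curves below are entirely made up (the constants and exponents are assumptions chosen only so that the curves intersect inside the plotted range) and do not describe any real CPU; the point is merely that the points-per-watt ranking of A and B flips depending on the power level at which you compare them.

```python
def score_a(power_w):
    """Hypothetical CPU A: strong at low power, flattens out early."""
    return 2000.0 * power_w ** 0.5


def score_b(power_w):
    """Hypothetical CPU B: weaker at low power, scales better upwards."""
    return 900.0 * power_w ** 0.7


def points_per_watt(score_fn, power_w):
    """Performance per watt at one fixed power level."""
    return score_fn(power_w) / power_w


def more_efficient(power_w):
    """Name of the hypothetical CPU with more points per watt at power_w."""
    a = points_per_watt(score_a, power_w)
    b = points_per_watt(score_b, power_w)
    return "A" if a > b else "B"


if __name__ == "__main__":
    for watts in (25, 50, 100):
        print(
            f"{watts:>3} W: "
            f"A = {points_per_watt(score_a, watts):.1f} pts/W, "
            f"B = {points_per_watt(score_b, watts):.1f} pts/W, "
            f"winner: {more_efficient(watts)}"
        )
```

With these invented curves A wins at 25 W and B wins at 100 W, so a single points-per-watt figure without the power level attached would not settle which one is "more efficient".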
 

BorisTheBlade82

Senior member
May 1, 2020
680
1,069
136
Quite an interesting result. I was looking at the duration times, and it seems that the 8500G in this case is faster / better in CB23 than the 8600G. On the other hand, in Geekbench it is slower.

What kind of cooling did you have on this run?

My 8600G uses AMD Boxed which comes with it.
I strongly suspect that something is not right with your system, looking at the ST figures alone. 40 W avg. vs. 24 W for @Shivansps can't be explained, even if boost clocks might differ a little bit. Maybe some quite heavy background processes?
The MT comparison looks consistent IMHO: a 700 MHz difference in base clock.
 

mmaenpaa

Member
Aug 4, 2009
91
158
106
LENOVO L14 G5 R5-7535U/14WUXGA/16GB/512SSD/W11P/3OS
21L5001LMX
Single channel DDR5-4800
Windows 11 Pro installed, all drivers via Windows Update only (nowadays I prefer to use only WU and check with the manufacturer tools from time to time)
One oddity when using Lenovo System Update: it offered only the AMD S0I3 Filter Driver for Windows 11, version 1.0.0.19. I did install it, and for some reason efficiency tanked in CB23, so I removed it. This result is in the last image (AC, CB23 only)

AC:
L14G5_AC_CB23.png
L14G5_AC_GB5.png
DC:
L14G5_DC_CB23.png
L14G5_DC_GB5.png

AC, AMD S0I3 Filter driver 1.0.0.19 installed:
AMD S0I3 Filter.png
 
Jul 27, 2020
20,039
13,737
146
@mmaenpaa, what's with the single-channel RAM? You do realize that the MT scores will be lower as a result, since the cores are bandwidth-starved: they spend more time waiting and idle, which also results in lower power usage.
 

mmaenpaa

Member
Aug 4, 2009
91
158
106
@mmaenpaa, what's with the single-channel RAM? You do realize that the MT scores will be lower as a result, since the cores are bandwidth-starved: they spend more time waiting and idle, which also results in lower power usage.
It is not always my decision. Customer needs come first (budget and time were also critical). This model comes with one 32GB SODIMM. To get dual channel, one needs to add a second 32GB "original" Lenovo SODIMM (> 300€). A generic/compatible Kingston would be cheap (< 100€), but we need time to test those. I have been bitten too many times by believing that standard RAM works with brand laptops.

Also, in this case the laptop has a discrete GPU (RTX A1000), which does not depend on system memory bandwidth.

We can always upgrade later :)
 

mmaenpaa

Member
Aug 4, 2009
91
158
106
Weird. Most of them Lenovo laptops I suppose? Kingston generally works great. Never had an issue (or never had the bad luck of getting a defective stick).
Actually it was the HP ProBook 6465b way back (2013). I sold quite a few of those and installed a second 4GB SODIMM (Kingston). The notebook worked well, but sometimes it did not wake from sleep. All memory tests etc. passed with flying colors. As soon as I put in an "original" HP module instead, the problems stopped. Of course those notebooks were all over the country, as my customer had several locations. Fun times 🤣
 