Question: Zen 2, clock speed vs power consumption vs PPD sweet spot?

Assimilator1

Elite Member
I finally got the adapter brackets to mount my ancient TR Ultima90 onto my 3600 the other day :).
With the reduced CPU PPT of 65w I had set for the stock cooler, temps dropped by ~15C with LHC and Rosetta! (vs the stock Stealth POS cooler, same ambient temps).

Anyway, now that I've got the thermal headroom I've put the PPT back to stock (auto, max 88w), and power consumption at the wall has gone up from ~129w to ~158w running Rosetta (same WUs; from approx 3.725-3.75 GHz at VID ~1.14v, to ~3.9 GHz at VID ~1.3v).
CPU temp is ~60C with 65w PPT, 75-76C with 88w PPT. So that's a 22.5% power increase for a 4% clock speed increase (3.75 to 3.9 GHz). Err, that's not worth it!

And LHC (avx WUs) went from ~130w to ~148w (~3.825 GHz at VID ~1.2v, to 3.925 GHz at VID ~1.32v); CPU temp ~63C with 65w PPT, ~76C with 88w PPT. Here that's a 13.8% power increase for a 2.6% clock speed increase. Not as bad, but still not worth it!

So I was wondering what you guys had found to be the sweet spot between power setting/clock rate & power consumption/PPD?
Btw, anyone know why Rosetta draws more power than LHC, yet runs at a lower vcore & clock speed, at the default auto PPT setting?? [edit] I guess I'm hitting some other limit with Rosetta?

Auto PPT (max 88w; 11w over previous setting, +14.3%)(23w, +35.4% over 65w PPT setting)
LHC wall draw 148w, 3.925 GHz, VID ~1.32v, CPU 76C, +13.8% power usage for a 2.6% clock speed increase vs 65w PPT.
Rosetta wall draw 158w, 3.9 GHz, VID ~1.3v, CPU 75-76C, +22.5%(wth!?) power usage for a 4% clock speed increase vs 65w PPT.
HWiNFO PRD CinebenchR20 ~99%, score 3499 (+5% over 65w PPT setting).

Trying in-between PPT limits...

77w PPT (5w over previous setting, +6.9%)(12w, +18.5% over 65w PPT setting)
LHC wall draw 144w, 3.9-3.925 GHz, VID 1.3v, CPU temp 73C.
Extra 10.8% power usage for a 2% clock speed increase vs 65w PPT.
Rosetta wall draw 145w, 3.85 GHz, VID 1.23v, CPU temp 68C.
Extra 12.4% power usage for a 2.7% clock speed increase vs 65w PPT.

72w PPT (7w over previous setting, +10.8%)
LHC wall draw 137w, 3.875 GHz, VID ~1.26v, CPU temp 69C.
Extra 5.4% power usage for a 1.3% clock speed increase :rolleyes: vs 65w PPT.
Rosetta wall draw 138w, 3.8-3.825 GHz, VID ~1.2v, CPU temp 65C.
Extra 7% power usage, also for a 1.3% clock speed increase (3.75 vs 3.825 GHz) vs 65w PPT.

65w PPT (12w over previous setting, +23%)
LHC wall draw 130w, 3.825 GHz, VID ~1.2v, CPU 63C.
Extra 13% power usage for a 4.1% clock speed increase vs 53w PPT.
Rosetta wall draw 129w, 3.725-3.75 GHz, VID ~1.14v, CPU 60C.
Extra 12.2% power usage for a 4.2%+ clock speed increase vs 53w PPT.
CinebenchR20 score 3327

53w PPT
LHC wall draw 115w, 3.675 GHz, VID ~1.1v, CPU 55C
Rosetta wall draw 115w, 3.575 GHz, VID ~1.03v, CPU 52C

Ambient 22C throughout. PSU is a 2009 Corsair TX650 with a basic 80+ efficiency rating (replacing soon). Mains 230v. All fan speeds fixed except the PSU's.
Wall draw figures are a visual average over about half a minute. Clock speed, VID & CPU temps are from HWiNFO64 v6.27.4185 beta. Win 10 64-bit... think that covers it! ;)

In the end I stuck with the 65w PPT setting, as it gave me much lower temps (13-16C!) and cut power consumption by 12-18%, with only a 2.6-3.8% cut in all-core loaded clock speed (all for LHC & R@H) vs auto/88w PPT. The only benchmark I ran was CinebenchR20 (default settings), which showed just a small performance loss of 4.9%.
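For anyone who wants to sanity-check the trade-offs, here's a minimal sketch of the percentage arithmetic used above (the pct_change helper is just illustrative; the inputs are the wall-draw and clock figures from the lists):

```python
# Minimal sketch of the trade-off arithmetic above; pct_change is a
# hypothetical helper, inputs are the measured wall draw (w) and clocks (GHz).
def pct_change(new, old):
    """Percent change going from old to new."""
    return (new / old - 1) * 100

# Rosetta@home, 65w PPT vs auto (88w) PPT:
print(f"{pct_change(158, 129):+.1f}% power for "
      f"{pct_change(3.9, 3.75):+.1f}% clock")    # +22.5% power for +4.0% clock

# LHC (avx WUs), same comparison:
print(f"{pct_change(148, 130):+.1f}% power for "
      f"{pct_change(3.925, 3.825):+.1f}% clock") # +13.8% power for +2.6% clock
```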
 

Fardringle

Diamond Member
I'm not sure about the 3600, but I can tell you what I did on my 3900X.

At stock/max boost it was sitting at 142W all the time, and getting quite hot.

I tried a bunch of power limits between the base 105W and the max 142W, and none of them gave me good results. The higher end of the range had virtually no effect on temperatures or actual power usage, and the lower end did reduce the temps a bit but introduced odd pauses while using the computer.

I set the package power limit to the base 105W, and CPU temperatures are about 10-15C lower than at 142W. Every benchmark test I tried gave exactly the same CPU scores on single core tests, and within 5% of the 142W scores on tests using all cores. The CPU clock doesn't boost as high when using all cores, but it stays boosted all the time instead of bouncing around, and performance is still very good.
 

Assimilator1

Elite Member
The 142w, is that a wall-draw figure or the CPU PPT? (I'll clarify my OP; mine's wall draw, btw, where it didn't say PPT.)

Trying 77w PPT now, will be adding the info to the OP - done; next, 53w PPT.
 

StefanR5R

Elite Member
Here are figures for EPYC 7452, running TN-Grid's gene@home application: Going from the default PPT of 155 W to the maximum possible 180 W (+16 %) gave +8 % performance at +19 % host power consumption, thus -9 % power efficiency.

One EPYC 7452 is like four Ryzen 3700X or 3800X welded together. Hence, 155 W / 4 ≈ 40 W should be a good PPT for an 8-core Ryzen which is not used for desktop duties but for 24/7 bulk computing.
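A rough sketch of that arithmetic in Python (the variable names are illustrative, and PPD is assumed to scale linearly with performance):

```python
# Efficiency change when raising the EPYC 7452 PPT from 155 W to 180 W.
perf_ratio = 1.08    # +8 % performance (measured)
power_ratio = 1.19   # +19 % host power consumption (measured)
print(f"power efficiency: {perf_ratio / power_ratio - 1:+.1%}")  # about -9 %

# Chiplet-wise, one EPYC 7452 is roughly four 8-core Ryzens, so scale the PPT:
print(f"suggested 8-core PPT: {155 / 4:.2f} W")  # 38.75, i.e. ~40 W
```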
 

Assimilator1

Elite Member
Useful info, thanks :)
So an equivalent PPT for a 6-core 3600 would be 30w! Lol, that is very low! I think that would impact gaming quite a bit though ;). Interesting to see the differences in the power/clock speed curve between a server CPU & a consumer CPU :thumbsup:.
 

Assimilator1

Elite Member
Yea, very useful :), I bought mine about 15yrs ago I think!

I think I'm going to run at the 65w PPT setting; it seems to be the best compromise, & I haven't had any issues playing Elite Dangerous at that speed.
Funny(?) to see that going from 65w PPT to 88w PPT, power usage goes up ~14-22%, yet clock speed only goes up about 2.6-4%! :eek:
 

Fardringle

Diamond Member
> Funny(?) to see that going from 65w PPT to 88w PPT, power usage goes up ~14-22%, yet clock speed only goes up about 2.6-4%! :eek:
Yep. I was pretty surprised to see only about a 5% difference in benchmarks between 105W and 142W on my 3900X.
 

StefanR5R

Elite Member
> One EPYC 7452 is like four Ryzen 3700X or 3800X welded together. Hence, 155 W / 4 ≈ 40 W should be a good PPT for an 8-core Ryzen which is not used for desktop duties but for 24/7 bulk computing.
Here I (1) assumed that the Ryzen processor firmware applies a similar voltage-frequency relationship as the Epyc firmware, though I don't know whether it actually does, and (2) neglected that Ryzen-based computers usually have more peripheral devices attached per CPU core than compute-oriented Epyc servers do. E.g. each Ryzen has a southbridge chip attached; Epyc computers don't have southbridges. This matters if the power-efficiency relationship vs. processor power limit is considered for the whole system, not just for the processor in isolation.

> So an equivalent PPT for a 6-core 3600 would be 30w! Lol, that is very low!
The 8-core and 6-core Ryzen have the same size of I/O die though, and both have two DRAM channels attached, etc.
 

Assimilator1

Elite Member
Any idea how much power the SBs take?

Btw, any thoughts as to why Rosetta runs at a lower clock speed with a lower VID, yet (mostly) consumes more power than LHC?? Especially on auto PPT!
I wish I'd noted it at the time, but on the auto setting I think I saw the PPT at 83w for Rosetta & 88w for LHC...
 

Fardringle

Diamond Member
Different projects use the CPU in different ways. I don't know the exact details, but Rosetta seems to make my computer quite a bit hotter than LHC, and I think it's because LHC doesn't actually use the CPU at 100%.
 

StefanR5R

Elite Member
I see the variation of power usage and core clocks between projects on the EPYCs too. That is, the processor firmware is not perfectly able to clock each workload as high as the package power limit would allow.

On my Xeons, which are driven by the BIOS to run at all-core turbo all the time (depending on instructions in the workload, sometimes with AVX clock offset applied), the differences in power use between DC projects are somewhat more pronounced than on the EPYCs. I have power figures for three projects (each time running the project exclusively on all threads, time-averaged system power consumption "at the wall", along with median core clocks of the EPYCs):

| System | Rosetta@home (Rosetta v4.12) | TN-Grid (gene@home PC-IM 1.10) | ОДЛК (odlk{3,min,max}@home v1.00) |
| --- | --- | --- | --- |
| dual E5-2690 v4 (2× 14c/28t, 3.2 GHz) | 380 W (100 %) | 350 W (92 %) | 325 W (86 %) |
| dual E5-2696 v4 (2× 22c/44t, 2.8 GHz) | 420 W (100 %) | 390 W (93 %) | 365 W (87 %) |
| dual EPYC 7452 (2× 32c/64t, 155 W PPT) | 320 W (100 %), ≈2.6 GHz | 315 W (98 %), ≈2.74 GHz | 290 W (91 %), ≈2.87 GHz |

(All applications are the x86_64-pc-linux-gnu builds.)

These systems are not constrained by cooling; processor temperatures are lower than what we are used to on desktops.
 

Assimilator1

Elite Member
Yea, your Xeon vs EPYC power differences reflect the differences between my Ivy Bridge-E i7 4930K & my 3600.
Thanks for the extra info :)

Fardringle
I was surprised that LHC doesn't use more power than Rosetta, as its WUs now use avx. Does Rosetta use it too though?
 

Fardringle

Diamond Member
> Yea, your Xeon vs EPYC power differences reflect the differences between my Ivy Bridge-E i7 4930K & my 3600.
> Thanks for the extra info :)
>
> Fardringle
> I was surprised that LHC doesn't use more power than Rosetta, as its WUs now use avx. Does Rosetta use it too though?
Good question. I haven't bothered to do any research. I just assumed that it was something to do with inefficient usage of resources in the VM that LHC runs in.
 

Assimilator1

Elite Member
Fair enough.

One further thought: does anyone know how much lowering the CPU PPT affects single- to quad-thread performance?
 

Fardringle

Diamond Member
Boost clock speeds and scores for single/quad-core benchmarks and games are exactly the same on my 3900X at 105W and 142W, since it never gets anywhere close to the power limit when only running a few cores.
 

StefanR5R

Elite Member
> I was surprised that LHC doesn't use more power than Rosetta, as its WUs now use avx.
Do you mean SixTrack, CMS Simulation, Theory Simulation, or/and ATLAS Simulation?

SixTrack has got application builds which have "sse2" and "avx" in their names. When DC projects release application versions like this, it generally means that the compiler was configured to allow instructions from these instruction sets in the build. It does not necessarily mean that the developers optimized the data and algorithms to make use of vector computing. And often enough, they actually did not take any such steps.

> Does Rosetta use it too though?
No, at least not to an extent which would let the AVX clock offset kick in on my E5 v4s.
 

Assimilator1

Elite Member
Re LHC, for each WU app it says 'SixTrack 502.05 (avx)'.
That reminds me, I actually want to find out how I can run the ATLAS app! [edit] done ;)
 

biodoc

Diamond Member
Here's some data on TN-Grid that I collected some time back.

CPU: 3950X
MB: Gigabyte Aorus Master X570
RAM: 64 GB DDR4-3200
PSU: Seasonic Prime TX850 titanium
VC: Radeon VII (idle power draw is 20 watts according to sensor output)
Power is measured in watts at the wall, minus 20 watts for the video card
All threads (32) are running TN-Grid tasks

TDP (105 watts) and PPT (105 watts) set in BIOS
Power draw at the wall: 161 watts
Boinc benchmarks...double precision 7032.64 million ops/sec....integer 71863 million ops/sec
3.893 GHz clock speed <----the effect of instruction sets such as avx must be minimal...on SGS LLR tasks the clock was 3.5 GHz
5.65 Gflops/thread....180.8 Gflops/host....1.12 Gflops/watt
Average CPU time (40 tasks): 9,654.9 s (STDEV=78 s)
Average points per task: 136.8
PPD: 39,188.4
PPD/watt: 243.4

TDP (65 watts) and PPT (65 watts)
Power draw at the wall: 116 watts
3.03 GHz clock speed
Boinc benchmarks...double precision 6339 million ops/sec....integer 63960 million ops/sec
5.65 Gflops/thread <-----I didn't do the calculations here since this number didn't change after I reran the benchmarks so it can't be accurate
Average CPU time (60 tasks): 12,795 s
Average points per task: 147 <----this # went up a bit so I wonder if this is because the Gflops/thread # didn't drop as expected.
PPD: 31,840.5
PPD/watt: 274.5

Summary on above data:

Numbers I think I can trust are task CPU time, clock speed, and power draw at the wall. The power draw at the wall number is contingent on the idle power draw number of the Radeon VII reported in the OS (20 watts).

So going from 65 watt PPT to 105 watt PPT:
Clock speed: 3.03 GHz --> 3.893 GHz is a 28.5% increase
Power draw: 116 watts ---> 161 watts is a 38.8% increase
TN-Grid task CPU time: 12,795 s ---> 9,655 s is a 25% decrease
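For reference, a minimal sketch of how the PPD and PPD/W figures fall out of those raw numbers (the ppd helper is hypothetical; the small deviations from the quoted values come from rounding of the averages):

```python
# PPD = tasks/day across all threads x average points per task.
SECONDS_PER_DAY = 86_400

def ppd(threads, cpu_seconds_per_task, points_per_task):
    """Points per day for a host crunching one task per thread back to back."""
    tasks_per_day = threads * SECONDS_PER_DAY / cpu_seconds_per_task
    return tasks_per_day * points_per_task

ppd_105w = ppd(32, 9_654.9, 136.8)   # ~39.2 kPPD at 105 W TDP/PPT
ppd_65w  = ppd(32, 12_795.0, 147.0)  # ~31.8 kPPD at 65 W TDP/PPT
print(f"{ppd_105w / 161:.0f} PPD/W at the wall (105 W)")  # ~243
print(f"{ppd_65w / 116:.0f} PPD/W at the wall (65 W)")    # ~274
```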
 

StefanR5R

Elite Member
Combining @biodoc's data with my own measurements:
TN-Grid gene@home PC-IM v1.10, system power consumption at the wall and points per day

| System | CPUs | PPT per socket | Core clock | Host performance | System power draw | System power efficiency | Note |
| --- | --- | --- | --- | --- | --- | --- | --- |
| headless server | dual 14c/28t Broadwell-EP | lifted for sustained turbo | 3.2 GHz | 49.3 kPPD | ≈350 W | ≈141 PPD/W | |
| headless server | dual 22c/44t Broadwell-EP | lifted for sustained turbo | 2.8 GHz | 67.8 kPPD | ≈390 W | ≈174 PPD/W | |
| desktop with idle Radeon VII | 16c/32t Matisse | 105 W | 3.89 GHz | 39.2 kPPD | 161 W | 243 PPD/W | |
| desktop with idle Radeon VII | 16c/32t Matisse | 65 W | 3.03 GHz | ≈31.8 kPPD | 116 W | ≈275 PPD/W | ¹) |
| headless server | dual 32c/64t Rome | 155 W | ≈2.74 GHz | ≈111.2 kPPD | ≈315 W | ≈353 PPD/W | ²) |

Notes:
¹) At the time of measurement, PPD were possibly not fully converged yet, after the PPT of the system was changed.
²) PPD were not fully converged at the time of measurement, since the host was still receiving a mixture of fma, avx, and sse2 jobs with their typical small performance differences; i.e. had not reached a steady state yet in which the project server would only schedule fma jobs as the PPD optimum.

Interestingly, system #4 had a budget of 65 W for 16 cores and one dual-channel memory controller (and some mostly idle I/O such as PCIe) and managed to drive the CPU cores at 3.0 GHz, whereas system #5 had a budget of 310 W for 64 cores (4×77.5 W for 4×16 cores) and 8 dual-channel memory controllers (and mostly idle I/O such as PCIe and socket-to-socket Infinity Fabric), and yet could sustain only 2.7 GHz core clock.

Obviously, the facts that system #5 has double the number of memory controllers per core compared to #4, and that the mostly unused socket-to-socket interconnect of system #5 still occupied a portion of the package power limit, contributed to #5 having a larger I/O fraction in its power budget than system #4. But if we look at actual power use in relation to computing throughput, system #5 is well ahead, as is to be expected. IOW, that I/O decreased the budget but did not contribute much to actual power use. Meaning, the power-saving mechanisms work well, but redistribution of the power budget from I/O to cores is limited or does not happen at all.
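A quick worked version of that per-core budget comparison (assuming the PPT divides evenly across cores and ignoring the I/O share, which is exactly the simplification being questioned here):

```python
# Naive per-core power budgets for system #4 (Matisse) and #5 (dual Rome).
ryzen_w_per_core = 65 / 16   # ~4.1 W/core, sustains ~3.0 GHz
rome_w_per_core  = 310 / 64  # ~4.8 W/core, sustains only ~2.7 GHz
print(f"{ryzen_w_per_core:.2f} vs {rome_w_per_core:.2f} W/core")
# More W/core yet lower clocks: consistent with a larger slice of the Rome
# package budget being reserved for I/O rather than the cores.
```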

As a side note, looking at power management when both CPU cores and I/O are mostly idle: the dual-Rome server pulls circa 90 W at the wall when not doing anything but showing an idle KDE desktop through its BMC. @biodoc, do you recall the idle consumption of the desktop computer?
 

biodoc

Diamond Member
> As a side note, looking at power management when both CPU cores and I/O are mostly idle: the dual-Rome server pulls circa 90 W at the wall when not doing anything but showing an idle KDE desktop through its BMC.
The 3950X is running headless with Universe@Home at full load (TDP/PPT = 105/105).

Clock freq: 3.856 GHz
Power draw at the wall: 183 watts - 18 watts (Radeon VII) = 165 watts

If I suspend Universe@home:

Clock freq: 2.195 GHz
Power draw at the wall: 63 watts - 18 watts (Radeon VII) = 45 watts

When I first bought the 3950X, I was using an old Seasonic silver PSU. I replaced that with a Seasonic Prime TX850 titanium and noticed that full-load power draw at the wall went down ~20-25 watts, if I remember correctly. I was pleasantly surprised.
 