Discussion Power gating GPU's vs ppd

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,478
14,434
136
So I turned down the power limit on my 2080TIs from 260 to 175 watts, on my 3070 from 220 to 175, on my 3070TI from 325 to 200, and on my 3080TI from 400 to 250. The old total ppd was 46-47 million. I am down to 40 million, and it appears that only SOME of the units reduce the ppd. And the 3080TI is holding 9 million ppd! So that works out to 1500 watts for 8 cards 24/7/365 and 40 million ppd. I am happy with the compromise! That's a 36% reduction in power for a 13% reduction in ppd, and 845 watts saved 24/7/365.
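For anyone who wants to redo this arithmetic for their own cards, here is a minimal sketch in Python. It assumes five 2080 Ti cards (the mix given later in the thread) and that every card draws exactly its power limit, which is an assumption rather than a measurement. With the limits as listed, the totals come to 2245 W stock and 1500 W reduced, a saving of 745 W or about 33%, so the 845 W / 36% quoted above may rest on slightly different stock limits or card counts.

```python
# Per-card power limits in watts as listed in the post: (count, stock, reduced).
# Assumption: five 2080 Ti cards, per the card mix given later in the thread.
CARDS = {
    "2080 Ti": (5, 260, 175),
    "3070": (1, 220, 175),
    "3070 Ti": (1, 325, 200),
    "3080 Ti": (1, 400, 250),
}


def total_watts(cards, column):
    """Sum count * limit over all cards; column 0 = stock, 1 = reduced."""
    return sum(count * limits[column] for count, *limits in cards.values())


stock = total_watts(CARDS, 0)
reduced = total_watts(CARDS, 1)
saved = stock - reduced
print(f"stock {stock} W, reduced {reduced} W, saved {saved} W ({saved / stock:.0%})")

# PPD drop from the old total (46-47 million) to the new one (40 million).
for old_mppd in (46, 47):
    print(f"from {old_mppd}M to 40M ppd: {1 - 40 / old_mppd:.1%} lower")
```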
 

cellarnoise

Senior member
Mar 22, 2017
709
394
136
I have been doing the same with my few GPUs and newer CPUs. The vendors pump the voltage / watts to the stars to try and push the frequency to the extreme detriment of work per watt.

There are decent tools to rein this in, though, if you don't need the last 5% or so of performance that costs 20%+ of total power. (% numbers given are pulled out of my main expelling gas / possible power source? of course :)
 

Endgame124

Senior member
Feb 11, 2008
954
669
136
Are you also running your 5950 CPUs in eco mode? Similar power savings can be had there, and from personal experience the drop in ppd is also fairly minimal.
 

Markfw

Endgame124 said:
Are you also running your 5950 CPUs in eco mode? Similar power savings can be had there, and from personal experience the drop in ppd is also fairly minimal.

I looked while redoing one and could not find where that is! The motherboard is an ASUS Pro WS X570-Ace (AM4). I have 2 of these.
 

Endgame124

Markfw said:
I looked while redoing one, I could not find where that is ! ASUS AMD AM4 Pro WS X570-Ace is the motherboard. I have 2 of these.

I don't have one of those to try, but on my Asus Hero VIII it’s under Advanced -> AMD Overclocking -> Eco Mode.

On early BIOS revisions I had to set Precision Boost Overdrive to Advanced to get “Eco Mode” to appear as an option.
 

Markfw

Endgame124 said:
I don't have one of those to try, but on my Asus Hero VIII, It’s under advanced -> AMD overclocking-> Eco Mode.

On early bios revisions I had to set precision boost overdrive to advanced to get “eco mode” to appear as an option.

I googled... Not available on that motherboard.

[SOLVED] ASUS Pro WS X570-ACE - Any reason why it ...
https://forum.level1techs.com › ... › Motherboards


Feb 4, 2020 — Sadly even with UEFI 1201 with AGESA 1004 B this motherboard doesn't have the “Eco Mode” option in the “AMD Overclocking” menu (I couldn't find ...
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Well, my card is old now and was never great at F@H (normally 400k ppd), but I also undervolted it and underclocked it (the latter only because the stock clock caused F@H WU errors at the lower voltage).
Anyway, when I configured it nearly 2 yrs ago, my rig was originally drawing ~285 W running MW@H (and LHC+R@H). Then I dropped the GPU vcore from 1150 mV to 1050 mV and system draw dropped to 259 W (same loads on CPU and GPU; GPU chip power draw dropped 21 W).
So an 8.7% GPU undervolt cut system power usage by 9.1%, and better still it gave a large drop in GPU temp, about 8C! :)

After testing with F@H at that voltage, although MW had shown no errors @1350 MHz (stock clock), I had to drop it to 1266 MHz (a 6.2% cut) to get clean runs with F@H! Which interestingly is the RX 480's stock clock rate!
I would give current power figures for stock vs undervolted with F@H, but F@H is being very erratic with GPU load and ppd atm; for the past minute it's mostly sat at 0% GPU load (although temps show it isn't totally idle)! That's at 1-2% progress, with 1 spare CPU core (2 threads) for F@H.
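The percentages in this post can be rechecked with a few lines of Python. This is only a sketch of the arithmetic using the figures quoted above; the helper name `percent_drop` is made up for illustration.

```python
def percent_drop(before, after):
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100


undervolt = percent_drop(1150, 1050)   # GPU vcore, mV
system_power = percent_drop(285, 259)  # whole-system draw, W
underclock = percent_drop(1350, 1266)  # GPU core clock, MHz

print(f"undervolt {undervolt:.1f}%, system power {system_power:.1f}%, "
      f"underclock {underclock:.1f}%")
```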

Just checked my Ebuyer account, and I bought the RX 580 in October 2018, way too long ago! :p
JFYI, in gaming performance it's similar to a 1060 (and was a similar price), although the RX 580's FP64 performance is quite a lot better (which is why I chose it in the end).
 

Endgame124

My step up from 3070ti to 3080ti was approved, so hopefully in a week or two, I’ll be up and running on the 3080ti. Have you found a really sweet spot for the 3080ti?
 

Markfw

Endgame124 said:
My step up from 3070ti to 3080ti was approved, so hopefully in a week or two, I’ll be up and running on the 3080ti. Have you found a really sweet spot for the 3080ti?

Most of the time it's holding 9 million ppd @ 250 watts. It depends on the units; right now it's 6.5m on an 18700.
 

Markfw

And now it's on a 17257 at 9.4m.

And talk about consistency... Since I set the wattage low, here are the results for the last week: 40mil for 8 cards at lowered wattage.

I should specify... 5 x 2080TI, a 3070, a 3070TI and a 3080TI

1644106630683.png
 

Endgame124

The 3080ti is up and running. It resumed a WU half completed by my 1080ti, and is already up to 7m ppd at 60% power. Tomorrow I’ll probably set it to 100% power and let it run while I observe during the day, but I want to be around when I do that. The power supply on this host is a Seasonic 650 titanium, so even with the card at 100% power I should have plenty of room left on the power supply, but it’s always best to be around when doing something like that.
 

Markfw

So, my electric bill came. I shut down about 6 computers, but all the big video cards are still running. The bill went from $800 to $525! Power gating does make a difference. Still 550k ppd on CPU, and 40 million ppd in F@H.
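To put those numbers side by side, here is a small Python sketch. It assumes a 30-day billing month and uses the 845 W saving quoted at the start of the thread. The bill drop cannot be attributed to the GPU power limits alone, since about 6 computers were shut down in the same period.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day billing month


def monthly_kwh(watts):
    """Energy used in one month by a constant load of the given wattage."""
    return watts * HOURS_PER_MONTH / 1000


bill_drop = (800 - 525) / 800
print(f"bill down {bill_drop:.0%}")
print(f"845 W running 24/7 is about {monthly_kwh(845):.0f} kWh per month")
```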
 

Markfw

Update... I upgraded this box from 16 to 32 gig of RAM. I added 3600 CL16 but it's running at 3200.

BUT, it's started to fail units on F@H at 250 watts. So I raised the limit to 300 watts. Seems OK for now.
 

StefanR5R

Elite Member
Dec 10, 2016
5,459
7,717
136
How about running a memory checker for a decent amount of time? I don't know of a really good checker though.
 

Markfw

StefanR5R said:
How about running a memory checker for a decent amount of time? I don't know of a really good checker though.

Linux..... Rosetta is running fine. 27 tasks instead of 31, waiting on memory still. Needs 64 gig for 16 cores.
 

Markfw

UPDATE. Now 3 of my 2080TIs have had odd problems with "interrupted", and I had to reboot the box to fix it. I suspect it's due to the undervolting. 3 of 6 cards ??
 

Markfw

And one more juicy bit. On one of them I checked the video drivers, and all the driver versions were greyed out. I updated the OS, and then I checked the drivers, now it was using 510, but I had to restart. It is still failing...
16:27:36:WU00:FS00:0x22: platform 0: Reference
16:27:36:WU00:FS00:0x22: platform 1: CPU
16:27:36:WU00:FS00:0x22: platform 2: OpenCL
16:27:36:WU00:FS00:0x22: opencl-device -1 specified
16:27:36:WU00:FS00:0x22: platform 3: CUDA
16:27:36:WU00:FS00:0x22: cuda-device 0 specified
16:27:44:WU00:FS00:0x22:Attempting to create CUDA context:
16:27:44:WU00:FS00:0x22: Configuring platform CUDA
16:27:57:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)

This box is a 1950x, my only series one threadripper left.
 

Markfw

Update: NO, it is NOT the wattage. See the Rosetta thread, that's the problem.
 

StefanR5R

Markfw said:
16:27:36:WU00:FS00:0x22: platform 0: Reference
16:27:36:WU00:FS00:0x22: platform 1: CPU
16:27:36:WU00:FS00:0x22: platform 2: OpenCL
16:27:36:WU00:FS00:0x22: opencl-device -1 specified
16:27:36:WU00:FS00:0x22: platform 3: CUDA
16:27:36:WU00:FS00:0x22: cuda-device 0 specified
16:27:44:WU00:FS00:0x22:Attempting to create CUDA context:
16:27:44:WU00:FS00:0x22: Configuring platform CUDA
16:27:57:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
To me, this looks like a crash of FahCore.

Reduction of the board power limit cannot cause a crash in FahCore.

Faulty RAM could cause a crash, but since this crash happened just 13 seconds after the start of FahCore_22, such a fault would cause a lot of other program crashes too. (Still, on any newly built machine, and whenever RAM timings have been changed from defaults, it is prudent to run extensive memory integrity tests. I don't know which tests are good; I do know that memtest86, for example, may require a day or so to find certain faults.)

Concurrent CPU workloads cannot cause a crash in FahCore, directly. However, I am not sure if FahCore keeps working properly when a computer becomes highly unresponsive due to swapping to disk. (Swapping may happen notably with many concurrent Rosetta v4 tasks).
 

gordonbb

Junior Member
May 25, 2022
3
10
41
Markfw said:
So I turned down the power limit on my 2080TIs from 260 to 175 watts, on my 3070 from 220 to 175, on my 3070TI from 325 to 200, and on my 3080TI from 400 to 250. The old total ppd was 46-47 million. I am down to 40 million, and it appears that only SOME of the units reduce the ppd. And the 3080TI is holding 9 million ppd! So that works out to 1500 watts for 8 cards 24/7/365 and 40 million ppd. I am happy with the compromise! That's a 36% reduction in power for a 13% reduction in ppd, and 845 watts saved 24/7/365.

Mark, long-time lurker here and yes, I am on another team, but I registered here today to reply specifically to this post.

First, congrats on your folks' performance in the BOINC Pentathlon. My team missed the registration date.

Being close to retirement with a few kids still in Post-Secondary education I only have limited funds for Distributed Computing so I must keep my purchases and Electricity in budget.

I recently picked up a 3070Ti as their pricing has finally reached a reasonable excess over MSRP.

I profiled the card after burning it in a bit to take a look at its efficiency:
3070ti_Efficiency.jpg

Unlike the Pascal and Turing cards, which I observed to run most efficiently at their minimum Power-Limits, it seems Ampere's efficiency (at least on my Asus TUF 3070ti) falls off a cliff at lower Power-Limits.

What I did was use HFM.NET to watch a new WU, set to estimate PPD based on the last 3 frames. I then used the F@H Advanced Control log to see when a frame had just completed, then adjusted the Power-Limit to a new set point using
Code:
nvidia-smi -i 0 -pl <X>
then waited for 6 frames to complete and recorded the PPD estimate at that point and moved to the next set point.
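The sweep described above could be scripted roughly as follows. This is a minimal sketch, not the exact procedure used: the helper names `power_limit_command` and `rank_by_efficiency` are hypothetical, and the PPD readings are still assumed to be recorded by hand from HFM.NET after 6 frames at each set point. The only real command involved is the `nvidia-smi -i <index> -pl <watts>` call shown above.

```python
def power_limit_command(gpu_index, watts):
    """Build the nvidia-smi argument list that sets one power-limit set point."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def rank_by_efficiency(samples):
    """Rank hand-recorded (watts, ppd) pairs by efficiency.

    Returns (watts, ppd, kPPD per watt) tuples, most efficient first.
    """
    rows = [(watts, ppd, ppd / watts / 1000) for watts, ppd in samples]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Print the command for one set point; 150 W is the value reported in this post.
print(" ".join(power_limit_command(0, 150)))
```

The argument list can be passed to `subprocess.run`; changing the power limit normally requires administrator or root rights.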

In my case I found the 3070ti seems to run well at 150W.

I also ran some comparisons for a week with a 1070ti, 2070 Super and the 3070ti all at the same 150W Power-Limit to see what the Generational improvement was:
Code:
GPU       Yield (PPD)   Eff.   Value    Gen.
1070ti      1,439,799    9.6     3.0    0.0%
2070s       2,667,237   17.8     4.7   86.5%
3070ti      3,887,798   25.9     6.1   45.7%
Yield is in PPD; Eff. is efficiency (kPPD/W); Value is kPPD per inflation-adjusted MSRP US$; Gen. is the generational improvement in PPD (%).
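The efficiency and generational columns can be recomputed from the yields with a short Python sketch, assuming all three cards ran at the 150 W Power-Limit stated above. Recomputed from the posted yields, the generational gains come out at roughly 85% and 46%, close to but not exactly the posted 86.5% and 45.7%, so the table may have been built from slightly different underlying numbers. The Value column is left out because the MSRP figures are not given.

```python
POWER_LIMIT_W = 150  # common Power-Limit stated in the post

# Yields in PPD, copied from the table, oldest generation first.
YIELD_PPD = {"1070ti": 1_439_799, "2070s": 2_667_237, "3070ti": 3_887_798}

# Efficiency in kPPD per watt.
efficiency = {gpu: ppd / POWER_LIMIT_W / 1000 for gpu, ppd in YIELD_PPD.items()}

# Generational improvement in percent, each card versus its predecessor.
gpus = list(YIELD_PPD)
gen_gain = {
    new: (YIELD_PPD[new] / YIELD_PPD[old] - 1) * 100
    for old, new in zip(gpus, gpus[1:])
}

for gpu in gpus:
    gain = f"{gen_gain[gpu]:.1f}%" if gpu in gen_gain else "baseline"
    print(f"{gpu:8} {efficiency[gpu]:5.1f} kPPD/W  {gain}")
```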
 