A CALL TO ARMS: The likely power-consumption driver bug for SLI 970/980

BonzaiDuck

Lifer
Jun 30, 2004
15,725
1,455
126
There is another fairly current thread here in which I crossed paths with someone rocking 2x GTX 980 in SLI. Comparing notes on my 2x GTX 970 SLI setup, we concluded that we had the same SLI quirk.

And we had both done a proper, clean installation of NVidia driver version 347.52.

These GTX 970/980 cards are supposed to settle down to a power-saving mode with 7 to 10% power consumption for each card, and core/memory of about 135/350 (more or less).

When we enable SLI in the NV Control Panel, the GPUs settle into this idle P-state. After a reboot, however, the GPUs run at 40% power consumption and about 912/3000. To get them to behave properly, you then have to disable and re-enable SLI.

So there are two folks I know of so far at Anandtech who've described this anomaly, and there was another enthusiast forum where it was being discussed. IF . . . that forum was the NVidia forum, I'm sure it has NV's attention, but I cannot remember.

If your SLI'd Maxwell cards exhibit the same problem, please create an account at NVidia's web site and submit either a bug report or a tech-support question that flags it as a possible bug, and ask whether NVidia has a fix that can be applied at the user end.

I'm not eager to download the latest Beta driver, and I'd like to get past this . . . little . . . imperfection. Maybe this has been more extensively discussed here, but my exchange with our forum colleague here elicited his mild surprise. I wouldn't know.

But the Maxwells run at least 10C cooler in normal 2D usage and consume as much as 80W less at the wall when they're behaving properly.
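To put that 80W figure in perspective, here is a rough back-of-the-envelope sketch. Only the 80W wall-power difference comes from the observation above; the idle hours per day and the electricity price are assumed placeholders, so substitute your own numbers.

```python
# Rough yearly cost of the elevated idle state.
# Only the 80 W wall-power difference comes from the post above;
# the hours and price passed in below are hypothetical placeholders.

def yearly_idle_cost(extra_watts, idle_hours_per_day, price_per_kwh):
    """Return (kWh per year, cost per year) for a constant extra draw."""
    kwh_per_year = extra_watts / 1000.0 * idle_hours_per_day * 365
    return kwh_per_year, kwh_per_year * price_per_kwh

kwh, cost = yearly_idle_cost(extra_watts=80, idle_hours_per_day=8, price_per_kwh=0.15)
print(f"{kwh:.1f} kWh/year, about {cost:.2f} per year at the assumed rate")
```

The function treats the extra draw as constant for every idle hour, which is a simplification; a wall meter logged over a week would give a better figure.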

Also -- if there's some tweak that could be made through NVidia Inspector, which exposes the P-states of the cards -- if someone knows of something like that which preserves the full speed capability when the cards are under load -- I'd like to hear it.
 

NTMBK

Lifer
Nov 14, 2011
10,237
5,020
136
You should report this to a few tech news websites- most of them have a "report a story" button. Get a bit more publicity, and you're more likely to get a response.

Hope it gets fixed quick. :thumbsup:
 

BonzaiDuck

NTMBK said:
You should report this to a few tech news websites- most of them have a "report a story" button. Get a bit more publicity, and you're more likely to get a response.

Hope it gets fixed quick. :thumbsup:

Even I am unsure that it's an "NVidia cause," because it could just be some configuration of startups or the OS at boot time. But like I said, another member with 2x GTX 980 SLI observed the same thing, and there have been other forums (beyond Anandtech) where it was discussed in recent months.

The NVidia folk asked me for a GPU-Z log file after I put the system through some gaming; I sent them two. One was created with SLI enabled after boot-time, to show the transition from a 135 MHz GPU clock to ~1,300 -- whatever the stock setting was -- and then back to 135. The other was made with GPU-Z configured as a "start-up" to show what happens at boot time, and how the clocks vary in that scenario between idle and a gaming session, then back again. It would also show the temperatures and differences in thermal profile.

They told me after they received the log files that they'd get back to me in 24 to 48 hours or so. I just don't think this is something that derives from my hardware or software configuration. What else could it be, but a GeForce driver problem?
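For anyone who wants to check their own log the same way, here is a minimal sketch that pulls the core clock out of a GPU-Z style sensor log and reports the lowest value seen. It assumes the log is comma-separated with a header row and a column whose name starts with "GPU Core Clock"; the exact column names can differ between GPU-Z versions, and the sample data below is made up for illustration, not taken from my files.

```python
import csv
import io

def core_clock_samples(log_text, column_prefix="GPU Core Clock"):
    """Yield (timestamp, MHz) pairs from a GPU-Z style comma-separated sensor log."""
    reader = csv.reader(io.StringIO(log_text))
    header = [name.strip() for name in next(reader)]
    # Pick the first column whose name starts with the assumed prefix.
    col = next(i for i, name in enumerate(header) if name.startswith(column_prefix))
    for row in reader:
        if len(row) <= col or not row[col].strip():
            continue  # skip blank or truncated lines
        yield row[0].strip(), float(row[col])

def idle_floor(samples):
    """Lowest core clock seen in the log, i.e. where the card settles at idle."""
    return min(mhz for _, mhz in samples)

# Made-up sample in the assumed layout, not real measurements.
SAMPLE = """Date , GPU Core Clock [MHz] , GPU Temperature [C] ,
2015-03-01 10:00:00 , 135.0 , 31.0 ,
2015-03-01 10:05:00 , 1300.0 , 64.0 ,
2015-03-01 10:20:00 , 135.0 , 33.0 ,
"""

floor = idle_floor(core_clock_samples(SAMPLE))
print("idle floor:", floor, "MHz")
```

A healthy log should bottom out near 135 MHz; a log from an affected boot would bottom out near 912 instead. To use it on a real file, read the file's text and pass it in place of SAMPLE.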
 

BonzaiDuck

What about asking in eg H forum?

What is the "eg H" forum?

I only posted this here because I spend a lot of time at Anandtech and found another member who observed the same problem.

I figure I'm "doing my bit," but I also limit my forum memberships -- too many account-ID's and passwords to remember.

But we'll see what they say when NV tech-support responds early in the forthcoming business week.

The cards are fine -- despite the discontent over the VRAM question. But this -- if it's a driver issue -- is annoying.
 

BonzaiDuck

Here's another update on my interaction with NV customer support.

I've been elevated to "Level 2" support; they're sending me reassurances almost daily.

I mentioned -- reminded them -- that I'd discovered someone in this forum with 2x GTX 980 SLI with the same symptoms. I'd also seen a discussion of it in forums other than Anandtech.

Sometimes, you get confirmation of something that is logically "conclusive." Other times, the evidence isn't conclusive, but it increases the probability that a particular diagnosis is correct.

My fellow forum member here and the posters at the other forum may all have installed Windows and drivers in some particular way, or we may all use common software like Afterburner.

OR -- it's a real driver bug. And you have to wonder. How many of us here think it's normal for your graphics cards to downclock to 912 MHz at idle, who may even be unaware that the cards should really downclock to 135? It seems to be a problem peculiar to SLI. Since I myself may not have paid full attention to the clocks when I was running a single card, that is also inconclusive, but my recollection seems to support the SLI assessment.

Anyone else noticed this?
 

96Firebird

Diamond Member
Nov 8, 2010
5,711
316
126
Is it possible for you to disable SLI (possibly even remove the second card altogether) and see if the issue persists? Also, you may have revealed this in your other threads, but what displays are you using, how are they connected, and are they always active?
 

Alpha0mega

Member
Aug 26, 2010
73
1
71
970 SLI here, not-OCed (apart from the minor factory OC). I just checked, and both my cards are downclocking to 135 MHz just fine, according to GPU-Z. Single 2560x1440 monitor, driver is 347.52, running Afterburner (but without RivaTuner currently).

Is there some specific step I should take to reproduce your problem?
 

kasakka

Senior member
Mar 16, 2013
334
1
81
970 SLI, both downclock to 135 MHz at 120 Hz refresh rate, main card is 899 MHz on 144 Hz refresh rate.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
Make sure to note whether it is occurring with single or multiple monitors. Multiple monitors has been the biggest cause of high idle clocks historically.
 

BonzaiDuck

96Firebird said:
Is it possible for you to disable SLI (possibly even remove the second card altogether) and see if the issue persists? Also, you may have revealed this in your other threads, but what displays are you using, how are they connected, and are they always active?

I'll start with your quote, but answer all the rest as well.

I'm going to re-check all this, but the issue arises at boot-time with SLI, and I'm pretty sure I only need to disable SLI to eliminate it at reboot. If I disable then re-enable SLI, the cards behave properly -- until the next boot-up.

Then there's the display issue. I had initially set up this machine like my other system (GTX 780, single-dGPU) with desktop monitor output and HDTV output. When I discovered this "power-consumption" anomaly, I first chose to simply disable the HDTV in NV control panel, and soon thereafter, I simply disconnected the cable at the graphics card.

The problem persisted. And it was there before my s***-chunk 60 Hz Hanns-G took its final dump and died, and it is there now with this wonder of a BenQ monitor -- either at 60 Hz or 120 Hz.

Alpha0mega said:
970 SLI here, not-OCed (apart from the minor factory OC). I just checked, and both my cards are downclocking to 135 MHz just fine, according to GPU-Z. Single 2560x1440 monitor, driver is 347.52, running Afterburner (but without RivaTuner currently).

Is there some specific step I should take to reproduce your problem?

I can't say. My CPU IS OC'd, so I suppose I should see what happens if I set it back to stock clocks. After boot and at idle, it's only running at 1,600 MHz EIST speed and power. The GTX-970's were only briefly OC'd through Afterburner and a 2-hour Kombustor test, and I reset everything when I discovered the "anomaly" -- with no improvement. I think I even uninstalled and reinstalled Afterburner and Kombustor at stock clocks -- as I'd done with the (clean) install of the 347.52 driver. My systray icons only include Kaspersky, Steam, APC's UPS monitoring software, CCleaner and more recently -- the BenQ's "Display Pilot." The problem was there before Display Pilot was installed.

Just a minute . . . let me do some things and I'll update this post . . .

OK . . . for tonight (this morning at 4:53AM) -- I've disabled SLI and rebooted to confirm what I must've seen before: the cards behave properly @ 135 MHz idle when booting to disabled SLI. I want to check my BIOS and save the current profile before I revert to "stock," just to be sure any of the final tweaks were saved to the 4.7 GHz profile (since I last saved it). Tomorrow -- ah, later today -- I'll do that.

There is some common factor between those of us who see this, and it's not the particular NVidia model -- we see it with both 970 and 980 in SLI. Since you (AlphaOmega) don't see it and you're running the same driver, I couldn't conclude one way or the other if it's (a) the driver, (b) the driver interacting with something else, or (c) . . . "something else."

All my event logs are "in the blue." My Windows Experience score is 7.8 -- limited by CPU and/or RAM (both 7.8).

The only way you could replicate this, besides using the same software and peripherals, is to reboot the system after you see the problem with SLI enabled. That is, if you notice the 912 MHz idle core speed, disable SLI and find it running at 135, then re-enable SLI -- it would still show 135 and behave well. At reboot, the idle clock reverts to 912.

Could it be a particular manufacture of card? MSI versus eVGA versus "reference" design? Or could it be an NVidia BIOS version? I can't say. I'll also say this: I've never flashed a gfx BIOS before, and I'd be especially careful doing it with two cards. I don't even know the procedure yet. Would you remove a card and flash one card at a time? It's something I'd spend days reading up on before I tried it.
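If you'd rather check the idle state after each reboot without opening a GUI tool, one option is NVidia's own nvidia-smi command-line utility, which ships with the driver. The sketch below is only that, a sketch: it assumes nvidia-smi is on your PATH, it treats anything above 200 MHz as "not idle" (my own arbitrary threshold), and on GeForce cards some fields may come back as unsupported. The sample text is invented to look like the symptom, not captured from my system.

```python
import subprocess

QUERY = "index,pstate,clocks.gr,clocks.mem,power.draw"

def parse_smi_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    gpus = []
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        gpus.append(dict(zip(QUERY.split(","), fields)))
    return gpus

def stuck_at_high_idle(gpu, idle_mhz_limit=200.0):
    """True if the core clock is above the assumed idle ceiling."""
    try:
        return float(gpu["clocks.gr"]) > idle_mhz_limit
    except ValueError:
        return False  # field not reported as a number on this card

def read_gpus():
    """Run nvidia-smi (must be on PATH) and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_smi_csv(out)

# Illustrative output only, shaped like the symptom described above.
SAMPLE = "0, P2, 912, 3005, 60.10\n1, P2, 912, 3005, 58.70\n"
for gpu in parse_smi_csv(SAMPLE):
    state = "HIGH IDLE" if stuck_at_high_idle(gpu) else "ok"
    print(gpu["index"], gpu["clocks.gr"], "MHz", state)
```

On a real system, call read_gpus() instead of parsing SAMPLE, once right after a cold boot and once after disabling and re-enabling SLI, and compare the two.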

Also, I'd only flirted with SLI configurations before -- this is the first in six years. As far as I know, the bridge connectors are standard. I'm pretty sure I'm using a connector from either of my two P8Z68 retail boxes. And it shouldn't matter whether they're installed "this way or that way." There's no "notch," and I also read that reassurance somewhere . . .

So how is it that SOME of us see this, and others don't? I don't know. I don't . . . freakin' . . . know.
 

BonzaiDuck

Haven't reset the CPU yet -- need to take my car to the mechanic this morning -- I will eventually see what happens when the CPU is running at stock.

In the meantime, I can't tell how significant this is. But I can put the computer into "hibernate" and when I start it up again, the SLI of the 970's is behaving properly.

As you all know, raising a computer from hibernate causes it to POST as if from a cold start -- the only difference: "Resuming Windows" instead of "Starting Windows."

This has something to do with the initialization at a normal (non-hibernate) boot. Could I conclude that this is therefore some sort of software or driver problem? And if it behaves this way, how could it be a problem in the BIOS of the graphics cards? It seems . . . unlikely.
 

Alpha0mega

BonzaiDuck said:
I can't say. My CPU IS OC'd, so I suppose I should see what happens if I set it back to stock clocks.

Oh, I guess I wasn't clear. I was referring to the GPUs not being overclocked. I have given my 5820k a mild OC, 4.0GHz at 1.1v. That's what it's running at.

My "tray programs" are: Afterburner, AIDA64, Rainmeter, Logitech Gaming Software and IMs (Trillian/Skype mostly). I am currently running Afterburner without RivaTuner, which I need to check FPS etc., because I have been having some system hang issues for the last few months, while gaming, and have been trying to isolate the problem. Might make a thread about it later, if I run out of things to try.

If you have a spare HDD laying around, or even a large enough thumb drive, perhaps you could do a clean install of Windows, installing nothing but the video drivers.

Another thing, I am using Windows 8.1. I know Win 8.1 has WEI, albeit hidden, but your mention of WEI scores makes me think that you are on Win 7. Is that correct? Perhaps that's a factor too.
 

BonzaiDuck

Alpha0mega said:
Oh, I guess I wasn't clear. I was referring to the GPUs not being overclocked. I have given my 5820k a mild OC, 4.0GHz at 1.1v. That's what it's running at.

My "tray programs" are: Afterburner, AIDA64, Rainmeter, Logitech Gaming Software and IMs (Trillian/Skype mostly). I am currently running Afterburner without RivaTuner, which I need to check FPS etc., because I have been having some system hang issues for the last few months, while gaming, and have been trying to isolate the problem. Might make a thread about it later, if I run out of things to try.

If you have a spare HDD laying around, or even a large enough thumb drive, perhaps you could do a clean install of Windows, installing nothing but the video drivers.

Another thing, I am using Windows 8.1. I know Win 8.1 has WEI, albeit hidden, but your mention of WEI scores makes me think that you are on Win 7. Is that correct? Perhaps that's a factor too.

That COULD be the case. When we put our heads together here at the forums, some of those details go unmentioned. The member with the two 980 cards was -- is -- Stringjam. I should look at the thread where the "commonality" became apparent, or we can ask him.

Last I looked, many months ago, about 60% of forum participants were still using Win 7. I've taken the attitude these days to make such upgrades common to the entire household, but the fam-damn-ily may be less eager to upgrade to Win 8. I made a special effort during 2014 to assure that all these systems (with Win 7) are running with event-logs all "blue," successful backup to WHS every night, etc. I know what happens with OS upgrades, and it requires more of the same efforts.

It could be -- if Stringjam reports using Win 7, or I do another search and find the discussions of the problem elsewhere a second time -- that something like "Win 7" could be the common thread. There is nothing conclusive in such observations, but it isn't much different than Rappoport's "Decision Analysis" for business -- choosing one or another path through a search tree based on new information about probabilities.

So I can add to my list of POSSIBILITIES that NVidia could produce a driver in which there is no operative "bug" running in Win 8, but something that emerges with Win 7.
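To make that "updating the probabilities" idea concrete, here is a toy sketch of a single Bayes-rule update over the three possibilities listed earlier in the thread: (a) the driver, (b) the driver interacting with something else such as the OS, and (c) something else. Every number in it is a hypothetical placeholder picked for illustration; nothing here is measured.

```python
def bayes_update(priors, likelihoods):
    """Return posterior P(h | evidence) from priors P(h) and likelihoods P(evidence | h)."""
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}

# Hypothetical starting point: no idea which explanation is right.
priors = {"driver bug": 1 / 3, "driver + OS interaction": 1 / 3, "something else": 1 / 3}

# Evidence: another 970 SLI system on the same driver idles correctly under Windows 8.1.
# Assumed chance of seeing that evidence under each hypothesis (made-up values).
likelihoods = {"driver bug": 0.2, "driver + OS interaction": 0.7, "something else": 0.5}

posterior = bayes_update(priors, likelihoods)
for hypothesis, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{hypothesis}: {p:.2f}")
```

With these made-up inputs the OS-interaction explanation moves ahead of the plain driver bug, which is the direction the Win 7 versus Win 8.1 observation points. It proves nothing, but it tells you which branch to poke at next.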

Don't you think? This is going to be solved! I just have to wait patiently, while I poke around on my end some more.
 

Alpha0mega

The Windows difference does seem promising, unless the problem is a random "ghost in the machine" type of craziness that affects systems sometimes. Perhaps the others in this thread who mentioned that their cards are downclocking properly can chime in with their OS info.

Many months ago, I wasn't planning on ever using Windows 8 because of the UI. I had decided to skip it fully, and go to Win 10 (or 9 as it was called then). But when the time came for me to upgrade to Haswell-E and Maxwell, I decided to get 8: Win 7 doesn't go higher than DX 11 (no .1 or .2), which Maxwell supports; the underpinnings of Win 8 are better; and Win 10 was too far away to justify running an old OS on new hardware in the meantime. After tweaking it some and installing Classic Shell, I am mostly happy with it.
 

BonzaiDuck

Alpha0mega said:
The Windows difference does seem promising, unless the problem is a random "ghost in the machine" type of craziness that affects systems sometimes. Perhaps the others in this thread who mentioned that their cards are downclocking properly can chime in with their OS info.

Many months ago, I wasn't planning on ever using Windows 8 because of the UI. I had decided to skip it fully, and go to Win 10 (or 9 as it was called then). But when the time came for me to upgrade to Haswell-E and Maxwell, I decided to get 8: Win 7 doesn't go higher than DX 11 (no .1 or .2), which Maxwell supports; the underpinnings of Win 8 are better; and Win 10 was too far away to justify running an old OS on new hardware in the meantime. After tweaking it some and installing Classic Shell, I am mostly happy with it.

Were you able to "upgrade in place," or did you do a fresh, basic install and then proceed to rebuild the system with your software installations?
 

BonzaiDuck

Full format and fresh install, as I have always done.

That had always been my own strategy. A bare-metal install of Win x could be a weekend's work, and more than that the longer the good Win (x-1) installation had taken to evolve.

I suppose you could do it two ways, with a spare disk, and see whether the necessary tweaks were more troublesome with the upgrade path.

Perhaps I should explore a dual-boot option, to cushion the transition.

Even so, if the NVidia driver needs a tweak for a new version that eliminates some . . . imperfection . . . under Win 7, they shouldn't hesitate to fix it.

As long as the system remains stable (and it is . . ), I suppose I can choose "hibernate" instead of "restart" or "shutdown" until it's rectified.

Well, it's a wake-up call for Win 7 die-hards. I'm surprised I've held off jumping on the Win 8 bandwagon as long as I did.
 

Alpha0mega

The fact that Hibernate doesn't have the same problem is strange. It seems that there is some service or software starting up (or perhaps not starting up) that's causing the problem. I have read that sometimes you have to add a delay to Afterburner to make sure its OC settings take effect. Maybe something similar to that has been happening in your case. Try adding a little delay to the startup of Afterburner.

Try completely disabling Afterburner too (no autostart on boot). Another thing, I have set my power target in Afterburner to 106% (the highest it goes for my Zotac 970s), unlinked from the temps. Doubt that has any impact in this case, but who knows?
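One way to get a delayed start is a Windows Task Scheduler task with a logon trigger and a delay, created with the built-in schtasks tool. The sketch below only builds and prints the command. The install path is the usual default and may not match your system, the task name is made up, the 30-second delay is arbitrary, and running the task elevated is my guess at what Afterburner needs.

```python
import subprocess

# Assumed default install path; adjust to your system.
AFTERBURNER_EXE = r"C:\Program Files (x86)\MSI Afterburner\MSIAfterburner.exe"

def delayed_logon_task(task_name, exe_path, delay_seconds):
    """Build a schtasks command that starts exe_path some seconds after logon."""
    minutes, seconds = divmod(delay_seconds, 60)
    return [
        "schtasks", "/Create",
        "/TN", task_name,                          # hypothetical task name
        "/TR", f'"{exe_path}"',                    # quoted because the path has spaces
        "/SC", "ONLOGON",                          # trigger at user logon
        "/DELAY", f"{minutes:04d}:{seconds:02d}",  # schtasks expects mmmm:ss
        "/RL", "HIGHEST",                          # run elevated (assumption)
        "/F",                                      # overwrite a task of the same name
    ]

cmd = delayed_logon_task("AfterburnerDelayed", AFTERBURNER_EXE, 30)
print(subprocess.list2cmdline(cmd))
# To actually create the task, run this from an elevated prompt:
# subprocess.run(cmd, check=True)
```

If you go this route, untick Afterburner's own "Start with Windows" so it isn't launched twice.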
 

BonzaiDuck

Alpha0mega said:
The fact that Hibernate doesn't have the same problem is strange. It seems that there is some service or software starting up (or perhaps not starting up) that's causing the problem. I have read that sometimes you have to add a delay to Afterburner to make sure its OC settings take effect. Maybe something similar to that has been happening in your case. Try adding a little delay to the startup of Afterburner.

Try completely disabling Afterburner too (no autostart on boot). Another thing, I have set my power target in Afterburner to 106% (the highest it goes for my Zotac 970s), unlinked from the temps. Doubt that has any impact in this case, but who knows?

I had concerns about this from the beginning.

First, I feel I need to use Afterburner just for the custom fan-control settings. I need to go back and review whether these settings require the "Apply overclocking at system startup," but my foggy memory gives me the suspicion that I had to re-enable the user-custom fan-control feature when I disabled "Apply overclocking . . . ."

I had set everything back to stock defaults except the fan control after I discovered this power-consumption anomaly (in which I am not alone).

I need to address these issues some more in forthcoming days, but at the moment, I need to get rid of some alligators at the top of the pile. ("up to my a** in alligators" -- you know the old saying.)

You may conclude that I'm confused about Afterburner's workings at startup, and you can provide as much detail as you care to for my illumination.

In "settings" I have "Start with Windows" unchecked, but currently the "Apply overclocking . . . " is active/selected. [Again, I think I did this because I noticed the fans at 0% and I want them to run at 35% at idle.]
 

BonzaiDuck

I just tried something.

To recap, I need Afterburner if there are no other options for thermal-fan-control of the NVidia 970's twin cooling fans.

If anyone knows of a way I could become independent of using Afterburner for that purpose -- please tell -- please share.

Alpha0mega mentioned tweaks to the way Afterburner loads.

In addition to "apply OC settings at [boot-time][startup]," there are options in the settings menu to "start with Windows" and "start minimized."

The first option is something completely different from the second and third. Or at least I so suspect, because any clock changes I made before "restart" are effective afterward.

But choosing to allow AB to start with windows and appear in the system tray has completely changed things.


After making that selection (SLI still Enabled), closing AB and selecting "Restart" from the Start button, the system does your usual restart and POST, followed by my password entry.

When I get to the Windows desktop and open Afterburner from its system tray icon, I get <= 10% power consumption (roughly for both gfx cards).

Suddenly, there's no "problem." But it seems that SLI and Afterburner are joined at the hip for some reason of which I'm not sure.

I won't say "FALSE ALARM, GTX 9x0 SLI users!" -- but for some reason, this must have been the problem all along -- with me, with other forum posters like Stringjam -- with the folks at off-Anand forums elsewhere posting over the last few months.

But, so?! We figured out how to recapture the power-saving potential with SLI'd cards. I don't have to make any changes to Afterburner configuration with a single card, SLI-disabled.

So . . . what is going on with this?!

And -- does ANYONE have an idea how to impose custom thermal-fan-control on NVidia cards without using AfterBurner?

I'm more puzzled now than I was before . . . . o_O:(
 

Alpha0mega

So having Afterburner start with Windows has solved that problem? Woohoo! For at least getting this far!

However, I am a little confused by something. You are saying that you were using AB to set a custom fan profile, yet you were not having it start with Windows before? This is surprising to me, since AFAIK, Afterburner has to be running for its fan profile to be in effect. I haven't tried it with my Maxwell cards, but that was the case with all my previous ones. If I turned off Afterburner (or didn't have it load with Windows), any fan profile would not work.
 

BonzaiDuck

Alpha0mega said:
So having Afterburner start with Windows has solved that problem? Woohoo! For at least getting this far!

However, I am a little confused by something. You are saying that you were using AB to set a custom fan profile, yet you were not having it start with Windows before? This is surprising to me, since AFAIK, Afterburner has to be running for its fan profile to be in effect. I haven't tried it with my Maxwell cards, but that was the case with all my previous ones. If I turned off Afterburner (or didn't have it load with Windows), any fan profile would not work.

Well, that had been my perception! But what does the graphics-obsessive NVidia user do after booting to the Windows desktop? He starts Afterburner for the monitoring. EDIT: What I DO know: If I unchecked the "apply OC at startup" option, and then I OPENED Afterburner after restart/reboot, it would show the fans running at 0%. Opening "settings," I'd find the "user custom fan-control" option unchecked, and checking it would then enforce my custom-configured fan curve.

I will have to test again on that particular issue. But it was my perception that I lost the fan control when I unchecked the green-button option "Apply OC settings at startup." And at that time, Afterburner wasn't loading its system tray F-22-icon, because the other items in the "settings" menu were unchecked.

I suppose this really boils down to "KISS" and what you actually need to over-clock or custom-configure any hardware. For instance, with CPU-overclocking, the prevailing advice of veterans to noobs is: "Go into the BIOS to do it."

But with dGPU's, the most available option is a GUI interface. If there is a way of changing graphics card BIOS' without such an interface, I don't know what it is. At that point, the question remains: "What changes are effective at bootup without starting the GUI software, and what changes require the software?"

If this were some terrain in the wild, it's like saying "I see there is shade over there, there is fresh water here . . . " But you don't know your GPS coordinates. I feel a bit confused about the connection between Afterburner and the cards with which it was bundled.

Perhaps a different slant on it follows here. Do you make changes with Afterburner which are permanently set in graphics BIOS until you change them again? Are the fan-control settings making changes to the BIOS? What changes require loading AB or similar software -- no matter what? How does all that work? Is there some "profile" saved by Afterburner which enforces OC and fan-settings? Or are the changes left as-is in the graphics BIOS -- as I said -- until the next time you change something in Afterburner?

My enthusiast-GPS is going crazy right now.
 