DirectX 12 mainly helps with CPU efficiency. It may help GPU efficiency a little, but I think only if the fps is locked.
At the same frame rate as DirectX 11, DirectX 12 halves power consumption by reducing GPU and CPU load. If you can cut CPU load by a factor of ~2-4, the difference in a 4.5W chassis will not be negligible. You're right that it doesn't impact the GPU as meaningfully, but combined with the CPU it does impact gaming efficiency.
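To put rough, purely illustrative numbers on that (the 2W CPU share below is an assumption, not a measured figure):

```python
# Illustrative only: assumed power split for a 4.5W SoC running a game.
SOC_BUDGET = 4.5  # W, total package budget (from the post above)
CPU_DX11 = 2.0    # W, assumed CPU share under DirectX 11 (hypothetical)

for reduction in (2, 4):          # "cut CPU load by a factor of ~2-4"
    cpu_dx12 = CPU_DX11 / reduction
    freed = CPU_DX11 - cpu_dx12   # watts freed for the GPU or for battery life
    print(f"{reduction}x lower CPU load frees {freed:.2f} W "
          f"({freed / SOC_BUDGET:.0%} of the 4.5 W budget)")
```

In a chassis this small, even one freed watt is a double-digit share of the whole budget, which is why CPU-side savings matter for gaming efficiency.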
@antihelten - Although I'm not sure about the "50% improvement", there are architectural improvements for reducing power consumption. Intel is doing duty cycle control for GPU clocks (mentioned in Ryan's article), which helps reduce power during idle. They are also making the GPU wider (going from 2 to 3 sub-slices) while increasing EUs by only 20%, like they did for HD 5000 vs. HD 4400. That means more opportunities for power gating when not at full load.
Though keep in mind that the scenario Intel presented likely exaggerates the effect. I'm not saying it won't make a nice difference in efficiency, just that I wouldn't expect that much of a gain across the board.
> That is an efficiency improvement, although not at load. As far as load efficiency goes, the re-balancing of resources should help substantially. It's difficult to assess at this point, from an end user's standpoint, whether Broadwell is a bigger improvement than Ivy Bridge was over Sandy Bridge, but the fact that it is an improvement over Gen 7.5 is very clear.

While DCC should certainly lower power consumption, it will only do so when the GPU is idling, and as such won't really affect efficiency.
> Idle is not zero.

No it's not. Efficiency is performance/watt, and when idling you're obviously not doing any work, so your performance is zero, and by extension your efficiency is also zero. It doesn't matter how low your power usage goes, your efficiency is still zero, since zero divided by x is always zero, no matter how low x goes.
DCC helps with idle power usage, not efficiency.
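In formula form, that argument is just:

$$\mathrm{efficiency} = \frac{\mathrm{performance}}{\mathrm{power}}, \qquad \mathrm{performance}_{\mathrm{idle}} = 0 \implies \mathrm{efficiency}_{\mathrm{idle}} = \frac{0}{P} = 0 \quad \text{for any } P > 0.$$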
Well, you are technically correct, I suppose, depending on what you mean by idle. One could argue that there is in fact some work being done at "idle" if you mean having the Windows desktop open. In any case, using less power at idle leaves more power available for when you want to do work, so that is the same as being more "efficient".
> But you are not correct in any useful way.

And technically correct is the best kind of correct.
> When a feature involves shutting off the GPU, that doesn't mean it doesn't improve efficiency just because the GPU is off (and it can't do anything while it's off).

Anyway, with DCC the GPU is shut off completely for up to 87.5% of the time, with just the display controller running. So for that 87.5% of the time, the GPU is certainly not doing any useful work.

> If you save 50J by shutting off the GPU during idle, you could spend those 50J when you really need the performance (so you end up with the same battery life).

As for your second statement, I really don't follow the logic here. Having more power available doesn't affect efficiency; if that were the case, higher-TDP CPUs would automatically be more efficient than lower-TDP ones, which obviously isn't the case.
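For reference, the duty-cycle arithmetic behind that 87.5% figure: with the GPU gated a fraction $1-d$ of the time,

$$f_{\mathrm{eff}} = d \cdot f_{\mathrm{on}}, \qquad d = 1 - 0.875 = 0.125 = \tfrac{1}{8},$$

so at the deepest duty cycle the effective (averaged) clock is one eighth of the on-clock.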
Wrong.
[slide]
Duty Cycle Control improves efficiency at low frequencies.
When a feature involves shutting off the GPU, that doesn't mean it doesn't improve efficiency just because the GPU is off (and it can't do anything while it's off).
But DCC does improve efficiency, and not only because the GPU is turned off. For example, if you need an effective 150MHz, you can run the GPU at 150MHz, or you can get the same effective 150MHz by running the GPU at 300MHz for half the time and shutting it off for the other half. That's the idea of DCC.
If you save 50J by shutting off the GPU during idle, you could spend those 50J when you really need the performance (so you end up with the same battery life).
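A toy model of that 150MHz-vs-300MHz-at-half-duty example (all constants below are invented for illustration; real capacitance and leakage figures aren't public):

```python
# Toy power model: P = dynamic + leakage, with voltage pinned at Vmin
# because we're below the efficient voltage/frequency scaling range.
C = 1.0        # arbitrary switched-capacitance constant (made up)
V_MIN = 0.65   # V, assumed minimum stable voltage (made up)
P_LEAK = 0.30  # W, assumed leakage while the GPU is powered (made up)

def p_dynamic(f_mhz, v=V_MIN):
    return C * v**2 * (f_mhz / 1000.0)  # ~ C * V^2 * f

# Option 1: run continuously at 150 MHz (voltage already stuck at Vmin).
p_continuous = p_dynamic(150) + P_LEAK

# Option 2 (DCC): run at 300 MHz half the time, power-gate the other half.
duty = 0.5
p_dcc = duty * (p_dynamic(300) + P_LEAK)  # the gated half draws ~0 W

print(f"150 MHz continuous : {p_continuous:.3f} W")
print(f"300 MHz @ 50% duty : {p_dcc:.3f} W")
# Same average clock and the same work done, but only half the leakage
# energy -- which is where a low-frequency efficiency gain would come from.
```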
> No it's not. Efficiency is performance/watt, and when idling you're obviously not doing any work, so your performance is zero, and by extension your efficiency is also zero.
>
> DCC helps with idle power usage, not efficiency.
Okay, so we basically agree. What I don't agree with is that it doesn't improve efficiency. This feature is advertised to decrease power at the same frequency, thus improving efficiency. If you don't need such clock speeds, you can clock the GPU down as low as you want and the idle power is also reduced. DCC then further reduces power.
You say it won't do anything for gaming, since gaming uses those higher clock speeds (apparently only gaming is included in what you consider for efficiency). I can't comment on that, because I only have this conceptual slide about DCC, so I don't know at which frequencies it becomes useful, but I could see it being used in Android games.
Lastly, your battery comparison is wrong because a larger battery doesn't reduce the amount of power you need for a certain amount of performance.
Of course I only consider gaming for efficiency, since that was what started this whole discussion (or to be more exact, running a graphics workload using DX12 was what started it), with you speculating about a 50% improvement in efficiency from architectural improvements (in this post: http://forums.anandtech.com/showpost.php?p=36621037&postcount=106).
Neither does DCC in and of itself; instead it lowers the amount of power you use when you don't need any performance, but don't want to put the GPU into sleep mode (since you might be needing performance soon, and can't afford the latency penalty). And since idling usually involves periods of not needing any performance with intermittent periods of needing a small amount of performance, a method for rapidly powering the GPU on and off, like what DCC does, is useful.
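A sketch of that intermittent-idle pattern (the task interval, busy time, and wattages below are invented for illustration):

```python
# Hypothetical idle pattern: a tiny GPU task every 20 ms (cursor, UI tick).
PERIOD_MS = 20.0   # assumed interval between small tasks
BUSY_MS = 1.0      # assumed GPU-active time per task
P_ON = 0.40        # W while powered at a low clock (made-up figure)
P_GATED = 0.01     # W while power-gated (made-up figure)

# Stay powered the whole time: no wake-up latency, but constant leakage.
p_always_on = P_ON

# Rapidly gate between tasks, as DCC-style fast on/off switching allows.
duty = BUSY_MS / PERIOD_MS
p_gated = duty * P_ON + (1 - duty) * P_GATED

print(f"always on : {p_always_on:.3f} W")
print(f"fast gated: {p_gated:.3f} W  (duty cycle {duty:.1%})")
```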
Okay, so let me get this straight. If you read from my post onward, which indeed initiated this discussion about efficiency, you'll see that I never responded to the DCC discussion until this point, so I never claimed that DCC will be one of the things that improve Gen8's efficiency (BTW, although I started this because of the DirectX 12 demo, Gen8 doesn't only apply to gaming workloads, although those would miss the DirectX improvement). I made what I consider a reasonable assumption, based on what I've seen, that Gen8 will be 50% more efficient, and we discussed that.

DCC was first brought up by bullzz, who said exactly the same as you; he never claimed improvements in gaming (I do speculate it might improve efficiency in certain games). You quoted him, basically reiterating his claim. Homeles then responded with a claim I agree with, and the discussion got started, with you posting the posts (which I quoted) in which you deny that DCC gives an improvement in performance per watt.
> When you don't need the GPU, any SoC that is well designed will power gate it. DCC is a legitimate way to reduce power at a certain (averaged) range of clock speeds.
>
> Not only at idle, but also at other clock speeds in the "inefficient" part of the frequency range...

It is my impression from Intel's slides that DCC only comes into play when you've already gone down to threshold voltage, which I would imagine would only occur at idle (or very close to it).
> True, but even when idling you generally still need the GPU, if for nothing else then at least to drive the display controller.

I don't think Intel is talking about 2D clocks when talking about DCC, like when driving the display. Separate 2D and 3D blocks have existed since ages ago, and the 3D blocks can be completely power gated in that case. Perf/watt issues were never about 2D; they were always about 3D, because that's where the real demand is. I can tell you that with the display idling, even my Sandy Bridge iGPU is power gated and never active.

> instead it lowers the amount of power you use when you don't need any performance, but don't want to put the GPU into sleep mode (since you might be needing performance soon, and can't afford the latency penalty).

You gotta remember that since Haswell, Intel can change the frequency very fast thanks to FIVR (think in between frames), so even in most 3D workloads it can scale down clocks in comparatively less intensive scenarios to boost performance in more demanding ones. Now with Broadwell, when they need even less GPU power, they can scale down clocks, leakage be damned.
Based on that slide about "Inefficient Region/Efficient Region" and previous slides I've seen, that's not the case.
A similar slide was shown before when describing the benefits of GT3 at lower clocks vs. GT2 at higher clocks: basically, HD 4400 @ 15W vs. HD 5000 @ 15W. The latter performs 10-20% better than the former at the same power. Those are rather high clocks, with the GT2 at ~1GHz and the GT3 at 600-700MHz or so (based on some user/professional reviews). We can assume from that that the 600-700MHz needed to run GT3 at 10-20% better performance than GT2 is close to that inefficient/efficient boundary.

Remember, they never said "threshold/below threshold", but "efficient/inefficient". Below that region, all the way down to threshold voltage, leakage dominates, so scaling frequency down below that level becomes inefficient in terms of perf/watt.

DCC makes "efficient" scaling possible without messing around with voltage, because not all 3D applications need the GPU at 100%.
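To make that wide-and-slow vs. narrow-and-fast comparison concrete (the EU counts are real; the clocks are the rough figures above, and the voltages are my guesses, not Intel data):

```python
# Rough perf/watt comparison of a narrow-fast vs. a wide-slow GPU config.
# Perf ~ EUs * clock; dynamic power ~ EUs * f * V^2. Leakage of the
# 2x-size GPU is ignored here, which is what eats part of the gain in practice.
def perf(eus, f_ghz):
    return eus * f_ghz

def dyn_power(eus, f_ghz, volts):
    return eus * f_ghz * volts ** 2

gt2 = dict(eus=20, f_ghz=1.00, volts=0.90)  # "GT2 @ ~1GHz" (HD 4400-like)
gt3 = dict(eus=40, f_ghz=0.60, volts=0.75)  # "GT3 @ 600-700MHz" (HD 5000-like)

for name, c in (("GT2", gt2), ("GT3", gt3)):
    p = perf(c["eus"], c["f_ghz"])
    w = dyn_power(c["eus"], c["f_ghz"], c["volts"])
    print(f"{name}: perf={p:.1f}  dyn power={w:.2f}  perf/watt={p / w:.2f}")
# With these made-up voltages, GT3 lands ~20% faster at lower dynamic power,
# roughly matching the 10-20%-better-at-the-same-15W picture once the extra
# leakage of the larger GPU is added back in.
```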
Now obviously Vmin is not quite the same as threshold voltage, but even then Vmin should still only come into play when idling.
Let me take you to another presentation about Vmin: Spring IDF 2013, titled "The HD Graphics Architecture in the New World of Low Power Computing".
[slide 1]

[slide 2]
The first slide gives a brief explanation of what "Vmin" really is. The second slide explicitly refers to graphics, "1x size" vs. "2x size". There's yet another slide about using "Fmax @ Vmin" and race-to-halt to save power.
It doesn't seem ambiguous to me that GT3 is the "2x size @ Vmin" and GT2 is "1x size".
Also, GPUs don't have only two states, "full load" and "zero load"; we are way beyond that. FIVR allows frequency switching fast enough that in cases where a frame is more CPU-bound and less GPU-bound, you can ramp the GPU clocks down so the CPU can use the extra power. And if the frequency needed is below "Fmax @ Vmin", with Broadwell you can now save real power rather than achieving nothing.
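A sketch of why running at "Fmax @ Vmin" and then halting beats simply clocking lower (the voltage, clock, and leakage constants below are invented):

```python
# Below Vmin the voltage can't drop further, so running slower no longer
# saves energy per unit of work -- you just stretch out the leakage.
C = 1.0                # arbitrary capacitance constant (made up)
V_MIN = 0.65           # V, assumed minimum operating voltage (made up)
F_MAX_AT_VMIN = 300.0  # MHz, assumed highest stable clock at Vmin (made up)
P_LEAK = 0.30          # W of leakage while powered (made up)

def energy_for_work(f_mhz, work_mcycles=300.0):
    """Energy to finish a fixed chunk of work at clock f (voltage = Vmin)."""
    t_s = work_mcycles / f_mhz               # seconds to finish the work
    p_dyn = C * V_MIN**2 * (f_mhz / 1000.0)  # ~ C * V^2 * f
    return (p_dyn + P_LEAK) * t_s            # joules

# Race-to-halt: finish at Fmax@Vmin, then power-gate (leakage ~0 afterwards).
e_race = energy_for_work(F_MAX_AT_VMIN)
# Crawl: run at half that clock; dynamic energy is identical, leakage doubles.
e_crawl = energy_for_work(F_MAX_AT_VMIN / 2)

print(f"race-to-halt @ {F_MAX_AT_VMIN:.0f} MHz: {e_race:.3f} J")
print(f"crawl @ {F_MAX_AT_VMIN / 2:.0f} MHz   : {e_crawl:.3f} J")
```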
You missed the part "Also, GPUs don't have only two states, 'full load' and 'zero load'".
Look at the power vs. performance graph in the second image of the second slide.