Video Card Core Overclocking Scaling Analysis (AMD and Nvidia)

Termie

Diamond Member
Aug 17, 2005
7,949
48
91
www.techbuyersguru.com
Just wanted to pass along a scaling analysis I've done on video card core overclocking. It's a follow-up to my findings on VRAM overclocking, which generated some interest here: http://forums.anandtech.com/showthread.php?t=2301435.

Enjoy and discuss!

Introduction

Test Bench: Intel i7-3770K@4.4GHz, Asus Maximus V Gene Motherboard, 16GB DDR3@1866MHz, NVidia GeForce Driver 314.22, AMD Catalyst Driver 13.4

In this test, we conduct overclocking scaling analysis on an EVGA GTX670 FTW and a Sapphire Radeon HD7870 OC. As it turns out, both of the video cards in this test have roughly the same amount of overclocking "headroom," about 15 percent, but that is entirely by chance. Even if you took a specific brand and model of video card, you could likely find up to 10 percent variability in overclocking headroom. Ultimately, that is not what this article is about - what we're looking at is what you get for each incremental increase in frequency. Our results can be roughly scaled up or down based on how lucky you are with your particular card.

For each of our video cards, we tested four different core frequencies, at intervals of exactly 5%. For our Nvidia card, we are referring to the "Boost" frequency, which is higher than the published core frequency. We carefully monitored this boost frequency during testing to make sure it remained constant (Nvidia's Boost feature can vary with temperature and load, which makes it somewhat difficult to benchmark). Note that for both cards, a VRAM overclock of approximately 15 percent was also applied, so as to minimize the extent to which memory bandwidth limited scaling of the core overclock. While both cards were factory-overclocked, we simply ignored those overclocks for purposes of our testing, using a baseline frequency at which reference GTX670 and HD7870 video cards would operate. One additional note - every benchmark below was tested three times to minimize the effects of test-to-test variability. We report the averaged results.
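For anyone who wants to set up a similar test matrix, here is a rough sketch of the procedure described above: four core clocks in 5 percent steps, with repeated runs averaged at each step. The 1000 MHz baseline and the frame rates are placeholder values I made up for illustration, not measurements from this test, and the function names are my own.

```python
from statistics import mean

def clock_steps(baseline_mhz, step_pct=5, steps=4):
    """Return core clocks at 0, 5, 10 and 15 percent over the baseline."""
    return [round(baseline_mhz * (100 + step_pct * i) / 100) for i in range(steps)]

def average_runs(runs_fps):
    """Average the repeated runs of one benchmark at one clock setting."""
    return mean(runs_fps)

# Hypothetical baseline clock, for illustration only.
clocks = clock_steps(1000)
print(clocks)  # [1000, 1050, 1100, 1150]

# Hypothetical frame rates from three runs at one setting.
print(average_runs([60.0, 61.0, 62.0]))  # 61.0
```

On the Nvidia card, the baseline you plug in would be the observed Boost clock rather than the published core clock.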

3DMark Fire Strike Performance Preset

Our first test is the recently released 3DMark Fire Strike benchmark. We analyze only the Graphics Score, as the overall score is affected significantly by other components in the system, and here we're trying to isolate the video card as much as possible. This benchmark utilizes all of the latest DX11 features and runs internally at 1920x1080; the output is then scaled to the resolution being used on the test system.

3DMarkFSCore670.jpg

3DMarkFSCore7870.jpg


Perhaps not surprisingly, the synthetic 3DMark Fire Strike benchmark shows some of the best overclocking results of all the benchmarks we ran, with a 12.1 percent boost on the GTX670 and an 8.7 percent boost on the HD7870. Also of note, the scaling curve is relatively linear, indicating that the core is likely the limiting factor in this test, especially given the pre-overclocked VRAM on our test cards.

Metro2033 Frontline Benchmark (1920x1080, 4x Anti-Aliasing, Maximum Settings, No Nvidia-Specific PhysX)

For our game tests, we'll go chronologically by age of the game, in case we see a pattern related to the vintage of the game engine being used in relation to core overclock scaling. Metro2033, released in March 2010, happens to be our most taxing benchmark, likely in part due to the inefficiency of its game engine. It uses very complex lighting, blur, and fog effects, but it's possible that its early DX11-based game engine was just slightly ahead of its time. There is no question that it is not the best looking game in this test, despite the strain it puts on our cards.

Metro2033Core670.jpg

Metro2033Core7870.jpg


Metro2033's built-in benchmark demonstrated scaling that was starkly different from 3DMark. It was the worst of the six benchmarks we ran on the GTX670, and among the worst on the HD7870, with 6.1 percent faster performance on the GTX670 and 6.0 percent faster performance on the HD7870 when both are overclocked 15 percent. We previously found that this game responded very positively to VRAM overclocks, so our assumption is that the graphics engine is simply more bandwidth-constrained than the typical game engine.

Just Cause 2 Dark Tower Benchmark (1920x1080, Maximum Settings, 8x Anti-Aliasing, No Nvidia-Specific Water Effects)

Our next game test is Just Cause 2, which was also released in March of 2010. While it doesn't have quite the same number of effects as Metro2033, it arguably looks at least as good artistically. As opposed to Metro2033, Just Cause 2 is by far the least taxing of the game engines tested here.

JC2Core670.jpg

JC2Core7870.jpg


Just Cause 2 provides the starkest example of how overclock scaling can vary from architecture to architecture. While the GTX670 runs 10.8 percent faster with a 15 percent overclock, the HD7870 only runs 5.1 percent faster. Interestingly, both cards demonstrate a clear plateau in overclock effectiveness, perhaps because the game engine is running into a bandwidth limitation caused by the 8xAA used in the test. Other than that graphical element, the graphics are the least demanding of all of the benchmarks in this article.

Continued below...
 

Termie

Diamond Member
Aug 17, 2005
7,949
48
91
www.techbuyersguru.com
Battlefield 3 Swordbreaker Single-Player Run-Through (1920x1080, Ultra, 4x Anti-Aliasing)

Our next game is the ever-popular online shooter, Battlefield 3, released in October 2011. One of the first AAA games to employ DX11 effects, Battlefield 3 still looks great, although it will soon be eclipsed by Battlefield 4, scheduled to appear this fall. Because it is the only game we test here without a built-in benchmark, we conducted three 60-second run-throughs on the single-player Swordbreaker level for each overclock, collecting average framerates via FRAPS. Note that while overall frames-per-second would likely be lower in multi-player rounds, the overclock scaling should be the same. Furthermore, there is far too much variability in a multi-player match to accurately measure the effects of incremental overclocks, so we stick with what works for purposes of this test.

BF3Core670.jpg

BF3Core7870.jpg


The HD7870 gets the scaling win here, with 9.4 percent higher performance at a 15 percent overclock, versus 8.5 percent for the GTX670. Note that unlike most of the other benchmarks, this one seems to scale better the higher the overclock goes.

Hitman: Absolution (1920x1080, Ultra, 4x Anti-Aliasing)

Released in November 2012, Hitman: Absolution is a single-player stealth-action game. It employs sophisticated DX11 effects, and is arguably the best looking game released in 2012. Interestingly, we found test-to-test variability on the GTX670 to be quite high, potentially due to CPU-bottlenecking that has been documented in this benchmark. We found repeated tests at the same core clock yielded results differing by as much as 4 percent, making the benchmark less useful than others at demonstrating overclock scaling. We actually repeated each benchmark five times on the GTX670 to ensure that our average accurately reflected the effect of each overclock setting. We'll likely be retiring this benchmark from our test suite after this article due to its inconsistency.
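To put a number on that kind of test-to-test variability, one simple approach is to express the gap between the best and worst run as a percentage of the average. This is only a sketch of one possible definition (the spread formula and the frame rates below are my own assumptions for illustration, not the data from our Hitman runs).

```python
def spread_pct(runs_fps):
    """Gap between the best and worst run, as a percent of the mean."""
    avg = sum(runs_fps) / len(runs_fps)
    return (max(runs_fps) - min(runs_fps)) / avg * 100

# Hypothetical frame rates for five runs at one clock setting.
runs = [49.0, 50.0, 51.0, 50.0, 50.0]
print(round(spread_pct(runs), 1))  # 4.0
```

If the spread at a single clock is close to the gain you expect from a 5 percent clock step, extra runs are needed before the averages mean much.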

HitmanCore670.jpg

HitmanCore7870.jpg


Despite those benchmarking issues, we still have results to report. We can score another win for the HD7870 in the scaling contest, as it gains 7.6 percent versus 7.3 percent for the GTX670 when both are overclocked by 15 percent. This game performs particularly well on the AMD architecture, and this is the closest the HD7870 gets to catching our GTX670 in any of the benchmarks tested here.

Tomb Raider (1920x1080, Ultimate Preset, FXAA)

Released in March of 2013, the reboot of the Tomb Raider franchise brought new life to Lara Croft, while also providing gamers some of the best graphics yet seen on a PC. In partnership with AMD, Crystal Dynamics, the developer of Tomb Raider, introduced a new hair modeling system called TressFX, which, while not without its flaws, truly revolutionizes the way we see characters in games. Our tests, performed using the Ultimate Preset, have TressFX enabled, but use FXAA rather than the more demanding multi-sampling anti-aliasing. We use the built-in benchmark, which, while not necessarily employing all of the DX11 effects available in the game, presents a significant load for our video cards.

TombRaiderCore670.jpg

TombRaiderCore7870.jpg


This turns out to be the test with the best scaling results on both video cards - we see the GTX670 netting a 13.2 percent improvement with a 15 percent overclock, while the HD7870 improves by 10.2 percent. We'd conclude that other than 3DMark Fire Strike, this is the closest benchmark we have to a pure GPU benchmark. It clearly requires a lot of processing power, and because we aren't using MSAA, which was used in all of the other game tests, the memory bandwidth doesn't get in the way. It's also quite consistent. These traits make Tomb Raider a great benchmark to test overclocks and compare various cards.

Conclusion

So, what have we found? Well, we've certainly shown that GPU overclocks never scale at 100%, although in some games, under some circumstances, they can come close. Across our six benchmarks, overclock scaling averaged 64 percent on our GTX670 and 52 percent on our HD7870. Thus, with a 15 percent core overclock, the GTX670 gained 9.7 percent in performance on average, and the HD7870 gained 7.8 percent. As we've already shown, VRAM overclocking yields additional benefits. The HD7870, which has only about 80 percent of the memory bandwidth of the GTX670, definitely needs a VRAM overclock to perform at its best, and appears to run out of bandwidth in some of the benchmarks we ran even with a VRAM overclock, leaving the core running at below 100 percent capacity. If we were to offer a generalization, we'd say that the older the game engine, the more likely a memory bandwidth bottleneck will come into play. You can always add on additional levels of anti-aliasing, but you can't make an old engine use more sophisticated graphical effects, which typically require more raw processing power rather than bandwidth.
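For those who want to check the math, the averages above can be reproduced from the per-benchmark gains reported in each section (in order: 3DMark Fire Strike, Metro2033, Just Cause 2, Battlefield 3, Hitman: Absolution, Tomb Raider). The sketch below treats "scaling" as the average performance gain divided by the 15 percent overclock; the function and variable names are just for illustration.

```python
# Percent gains at a 15 percent core overclock, as reported for each benchmark.
gains = {
    "GTX670": [12.1, 6.1, 10.8, 8.5, 7.3, 13.2],
    "HD7870": [8.7, 6.0, 5.1, 9.4, 7.6, 10.2],
}
OVERCLOCK_PCT = 15

def summarize(card_gains, overclock_pct=OVERCLOCK_PCT):
    """Return (average gain in percent, scaling as a percent of the overclock)."""
    avg_gain = sum(card_gains) / len(card_gains)
    scaling = avg_gain / overclock_pct * 100
    return round(avg_gain, 1), round(scaling)

for card, card_gains in gains.items():
    print(card, summarize(card_gains))
# GTX670 (9.7, 64)
# HD7870 (7.8, 52)
```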

Ultimately, we'd say that a combined 15 percent core overclock and 15 percent VRAM overclock will yield gains of about 15 percent, but most games will respond more positively to one or the other, based on which component of the video card is most taxed. If we were to recommend a best course of action for GPU overclocking enthusiasts, it would be to overclock the core and the VRAM separately to a level that is stable in several benchmarks, and then set both about 5 percent below the maximum overclock achieved for each. This way, you get the benefits of overclocking regardless of whether a game is hungry for more core processing power or more memory bandwidth, and you limit the degree to which you might find instability due to a combined overclock that is too close to the limit.
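As a quick sketch of that recommendation, the function below backs a clock off from its maximum stable value. It assumes "5 percent below" means 5 percentage points of overclock relative to stock (so a 15 percent maximum becomes a 10 percent daily setting); that reading, the function name, and the clocks in the example are assumptions for illustration.

```python
def daily_clock(stock_mhz, max_stable_mhz, backoff_pct=5):
    """Back off the maximum stable overclock by backoff_pct percentage points.

    Never returns less than the stock clock.
    """
    max_oc_pct = (max_stable_mhz / stock_mhz - 1) * 100
    target_pct = max(max_oc_pct - backoff_pct, 0)
    return round(stock_mhz * (1 + target_pct / 100))

# Hypothetical clocks: 1000 MHz stock, 1150 MHz maximum stable.
print(daily_clock(1000, 1150))  # 1100
```

The same function would be applied once to the core clock and once to the VRAM clock, each using its own separately tested maximum.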

One more note on headroom: both of our cards had similar overclocking headroom, even though the common wisdom is that in the current generation of cards, AMD has a slight advantage in overclocking headroom, at least on video cards with overvolting capabilities (which are becoming increasingly rare). While our sample size is certainly too small to pass judgment on this, we don't think you should base your purchasing decision on which card has a reputation for great overclocking headroom, as every card will overclock differently. Furthermore, neither GPU manufacturer has a "special sauce" that makes its overclocks magically yield gains in a 1:1 ratio.
 

96Firebird

Diamond Member
Nov 8, 2010
5,738
334
126
Nice job Termie, very thorough. I took a quick look at the findings, but I will give it a full read later today.

And thanks for keeping the percentage scale the same for both graphs.
 

Jimzz

Diamond Member
Oct 23, 2012
4,399
190
106
Notice the gtx670's vram is at 6800mhz, which gives GK104 the bandwidth it needs to scale with core overclocks.


This.

If you want to test the cores set the memory at the same speed. Both the 670 and 7870 use 256bit memory bus.

Instead all this showed was that memory bandwidth helps a lot as the Mhz of the core is increased. duh
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
This.

If you want to test the cores set the memory at the same speed. Both the 670 and 7870 use 256bit memory bus.

Instead all this showed was that memory bandwidth helps a lot as the Mhz of the core is increased. duh

It's common knowledge that GK104 is much more bandwidth strapped than Pitcairn.
 

wand3r3r

Diamond Member
May 16, 2008
3,180
0
0
Very interesting and well done! I think you found a great niche that isn't covered (at least not widely nor in such a way). I hope to see a 7950 or 7970 as true competitors to the 670 instead of the slightly more cut down chip, I wonder if they gain better with clocks.

Another great Termie investigation!
 

ICDP

Senior member
Nov 15, 2012
707
0
0
Weird, I was under the impression that GK104 scaled poorly with clock speed.

Thanks for sharing!

As TViceman pointed out, the GK104 needs a good VRAM overclock to really shine. Scaling with just the core overclock is not that great at all due to the memory bandwidth limitation.

Thanks for sharing these test results Termie.
 

notty22

Diamond Member
Jan 1, 2010
3,375
0
0

wand3r3r

Diamond Member
May 16, 2008
3,180
0
0
It's probably what he had on hand. Check the site in his sig, there are some reviews of memory OCing and crossfire etc.
 

Termie

Diamond Member
Aug 17, 2005
7,949
48
91
www.techbuyersguru.com
Why are you using the 7870?

Fx1 if you want to send him a free 7970 I'm sure he won't mind.

LOL, I'm game for that!

Yes, I only have these two cards to test with right now.

By the way, on the issue of the VRAM overclock, I applied the maximum stable VRAM overclock to both cards, by which I mean the highest overclock that achieved positive scaling. In the case of the GTX670, it was about 13.5%, in the case of the HD7870, it was about 16%. I simply intended to give the cores as much potential to scale as possible.
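To give a sense of what those VRAM overclocks mean in bandwidth terms, here is a rough calculation. It assumes a 256-bit bus on both cards (as noted earlier in the thread) and reference effective memory clocks of 6008 MHz for the GTX670 and 4800 MHz for the HD7870; those reference clocks are my assumption for the sketch and may not match the stock memory clocks of these particular cards.

```python
def bandwidth_gbs(bus_width_bits, effective_mhz):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * effective_mhz / 1000

# Reference effective memory clocks are assumed, not taken from this thread.
cards = {
    "GTX670": {"ref_mhz": 6008, "oc_pct": 13.5},
    "HD7870": {"ref_mhz": 4800, "oc_pct": 16.0},
}

for name, card in cards.items():
    oc_mhz = card["ref_mhz"] * (1 + card["oc_pct"] / 100)
    stock_bw = bandwidth_gbs(256, card["ref_mhz"])
    oc_bw = bandwidth_gbs(256, oc_mhz)
    print(f"{name}: {stock_bw:.1f} GB/s stock, {oc_bw:.1f} GB/s overclocked")
```

Under these assumptions the HD7870 starts with roughly 80 percent of the GTX670's bandwidth, which lines up with the figure in the conclusion.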
 
Feb 19, 2009
10,457
10
76
Do the numbers look different if one starts the 78xx benchmark with both VRAM and core at default, then ups both at 5% intervals?

Because starting with a 15% VRAM OC would already give an "OC" performance result, and thus the overall scaling % would be lower.

Refer to your own prior testing: http://www.techbuyersguru.com/VRAMocing.php

Both of these cards are vram bandwidth limited as the scaling is great in games with faster memory.
 

Termie

Diamond Member
Aug 17, 2005
7,949
48
91
www.techbuyersguru.com
The images are not showing up for me

Sorry, fixed it.

Do the numbers look different if one starts the 78xx benchmark with both VRAM and core at default, then ups both at 5% intervals?

Because starting with a 15% VRAM OC would already give an "OC" performance result, and thus the overall scaling % would be lower.

Refer to your own prior testing: http://www.techbuyersguru.com/VRAMocing.php

Both of these cards are vram bandwidth limited as the scaling is great in games with faster memory.

I was intending to isolate the effect of core overclocking with this analysis. Varying both core and VRAM at the same time would certainly lead to different results, but it would be harder to draw conclusions based on those results. You are right that scaling would be worse had I not OC'd the memory, but showing such results would not give a good indication of what's possible with a comprehensive overclock.

And yes, I agree that both cards are relatively bandwidth-starved. In fact, in Metro2033, VRAM overclocking scales much better than core overclocking. But it is game-dependent.
 
Feb 19, 2009
10,457
10
76
No, I meant that if you bench the card at stock core and vram, then set THAT as the baseline, then up the core and vram, you would see BETTER scaling %.. because what you've done, is started with an OC scenario with the vram, your baseline is HIGHER to start with, so any core increase will appear to scale less as a %.

While doing it your way with cards that are not bandwidth starved is fine, on these cards it's not, because upping VRAM alone already accounted for a significant rise in performance.
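A quick illustration of the baseline arithmetic being argued here, using made-up frame rates (none of these numbers come from the tests in this thread). It only shows how the choice of baseline changes the reported percentage; it says nothing about how the core would scale without the VRAM overclock.

```python
def pct_gain(new_fps, baseline_fps):
    """Percent improvement of new_fps over baseline_fps."""
    return (new_fps / baseline_fps - 1) * 100

# Hypothetical frame rates, for illustration only.
stock = 50.0    # stock core, stock VRAM
vram_oc = 55.0  # stock core, overclocked VRAM
both_oc = 60.0  # overclocked core and overclocked VRAM

# Gain from the core overclock, measured against the VRAM-overclocked baseline.
print(round(pct_gain(both_oc, vram_oc), 1))  # 9.1

# Gain from the combined overclock, measured against the fully stock baseline.
print(round(pct_gain(both_oc, stock), 1))    # 20.0
```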
 

Jaydip

Diamond Member
Mar 29, 2010
3,691
21
81
Great job, Termie. Just one thing I am interested in: were both cards at 100% load?
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
LOL, I'm game for that!

Yes, I only have these two cards to test with right now.

By the way, on the issue of the VRAM overclock, I applied the maximum stable VRAM overclock to both cards, by which I mean the highest overclock that achieved positive scaling. In the case of the GTX670, it was about 13.5%, in the case of the HD7870, it was about 16%. I simply intended to give the cores as much potential to scale as possible.

I understand you didn't have one, but it's too bad you couldn't compare the 670 to the 7950.

Edit: As always, good work. :thumbsup:
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
This review as a comprehensive article is pointless precisely because the total performance increase from stock is not visible.

On average, we found overclock scaling to average 64 percent on our GTX670 and 52 percent on our HD7870. Thus, with a 15 percent core overclock, the GTX670 gained 9.7 percent in performance, and the HD7870 gained 7.8 percent.

If the 7870 gained a greater amount of fps from the vram overclock then these numbers mean nothing.
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
should have used a 7950.... or a gtx680
the 7870 is the top end of the pitcairn sku...
 
Feb 19, 2009
10,457
10
76
This review as a comprehensive article is pointless precisely because the total performance increase from stock is not visible.

If the 7870 gained a greater amount of fps from the vram overclock then these numbers mean nothing.

That's what I meant, since these SKUs are bandwidth starved, starting with the vram OC as baseline makes their scaling % much lower..

Look at many reviews (and from forum goer results), the scaling above stock for core/vram combined OC is stellar.
 

Termie

Diamond Member
Aug 17, 2005
7,949
48
91
www.techbuyersguru.com
Great job, Termie. Just one thing I am interested in: were both cards at 100% load?

The only time I saw below 99% load was in Just Cause 2, when it dipped to 96%.

I understand you didn't have one, but it's too bad you couldn't compare the 670 to the 7950.

Edit: As always, good work. :thumbsup:

I wish AMD had a refresh coming out, as I'd definitely grab the 7950 replacement.