HD7000 Series VRAM Overclocking Benchmarks - a "memory hole" at 1300MHz?

Termie

Diamond Member
Aug 17, 2005
7,949
48
91
www.techbuyersguru.com
UPDATE: I've published an article on my website exploring the issue of VRAM overclocking. It's available here: http://www.techbuyersguru.com/VRAMocing.php

--------------

This is a topic I haven't seen discussed much in the forums, and I thought it might give people some food for thought. Just as background, most current video cards use error correction for the VRAM, which means that if the memory is overclocked too high, errors will occur that need to be corrected. That leads to slower performance, but not necessarily a video card crash.

So, while testing my HD7870 using 3DMark's new Fire Strike benchmark, I discovered a quirk in its behavior. With a minor memory overclock to 1300MHz, performance became worse, but with additional overclocking, performance began scaling positively again. I decided to test it with several other programs (3DMark11 and Hitman: Absolution), and the "memory hole" showed up in exactly the same place; once more, performance picked up with additional overclocking. The scaling continues through 1450MHz, but the card crashes at 1500MHz. Here are the results:

hd7870memoryscalinganal.jpg


capturebacjc.jpg



So, three big questions for everyone to ponder:
(1) is the dip at 1300MHz due to memory error correction?
(2) is the increase in performance after 1350MHz due to the memory overclock being fast enough to compensate for the errors, or are no errors occurring?
(3) could forcing error correction through overclocking lead to damage to the card over the long-term even if performance is in fact better right now?

EDIT: There seems to be a growing consensus that the card may be automatically loosening the timings of the VRAM under overclocking conditions. Another theory proposed below is that it may have to do with whether the core and VRAM clocks are in sync.

UPDATE: data from user hyrule4927's HD7950 tested in Unigine Valley 1.0

GraphsBetter_zpseef699fc.png

-------------------------------------------------------------------------------------------------------------------------------------

As a counterpoint, I've done similar testing on my GTX670. It doesn't show any dips, but it does have interesting plateaus at 6300, 6600, and 6900, and then really begins to cap out after 7000.

capturecdc.jpg
 

tigersty1e

Golden Member
Dec 13, 2004
1,963
0
76
that is awesome.

as you continue increasing your memory clocks past 1450, does your card crash or do your performance numbers dip again?
 

Elcs

Diamond Member
Apr 27, 2002
6,278
6
81
This is something I have never heard of before and very interesting. I wish I could help answer some of your questions but I'm afraid I'm just going to have to be an observer.

Thank you for bringing this to light.
 

Termie

that is awesome.

as you continue increasing your memory clocks past 1450, does your card crash or do your performance numbers dip again?

Thanks for the question. I have not tried anything above 1450 because that's the limit in CCC and Afterburner. I'll need to figure out the software unlock to push higher. I think that before I do that, I'll test a few more games at 1200 through 1450 to see if any start dropping again at the 1450 limit.
 

Rikard

Senior member
Apr 25, 2012
428
0
0
Thanks for the question. I have not tried anything above 1450 because that's the limit in CCC and Afterburner. I'll need to figure out the software unlock to push higher. I think that before I do that, I'll test a few more games at 1200 through 1450 to see if any start dropping again at the 1450 limit.
You should be able to go higher in Sapphire Trixx. I personally started seeing artifacts at 1500 MHz so I did not push it until it crashed.

I think this is really interesting but like other posters I really cannot give good input since I lack the required knowledge on this topic. It would be nice to see this verified with other cards though. You have a 670, could you see if there is a similar dip there? Also, could you please include VRM temps as another graph?
 

Termie

You should be able to go higher in Sapphire Trixx. I personally started seeing artifacts at 1500 MHz so I did not push it until it crashed.

I think this is really interesting but like other posters I really cannot give good input since I lack the required knowledge on this topic. It would be nice to see this verified with other cards though. You have a 670, could you see if there is a similar dip there? Also, could you please include VRM temps as another graph?

Good idea - I'll try Trixx. I actually have it installed already.

Do you know a way to log VRM temps? Perhaps GPU-z? I've just never tried it. I know the GPU temp hasn't exceeded 67C in my testing.

I have a few more benchmarks I'm going to try on my HD7870, and then I'll get around to testing my GTX670 more thoroughly. From what I've seen so far, there is no drop in performance between 6000MHz and 6800MHz on the VRAM, but rather a plateau in performance after 6500MHz, which I have assumed could be error correction kicking in.

I have never actually witnessed any artifacts due to memory overclocking so far on either card, but then again I haven't pushed them past where I'd seen other reviewers OC the memory to.
 

Solomutt

Junior Member
May 18, 2012
11
0
0
Just a guess, but the card may relax timings past a certain clock, automatically (think driver talking to the HW).....
 

felang

Senior member
Feb 17, 2007
594
1
81
Very interesting. I'll test my 680; I've been meaning to find my max overclock now that I'm on water cooling anyway. What benchmark do you guys think is the most consistent to test mem overclock?
 

Zanovar

Diamond Member
Jan 21, 2011
3,446
232
106
Hmm, good stuff Termie, and great questions. Anyone have a clue? I'm away from my desktop for months at a time; will mess about when back.
 

Termie

Just a guess, but the card may relax timings past a certain clock, automatically (think driver talking to the HW).....

Relaxed timings would do it, but I don't see how the card would automatically adjust them - that would be pretty sophisticated. It's thought that looser timings from the factory are the reason the GTX670 4GB is slower than the GTX670 2GB in most benchmarks, but those are fixed timings, not dynamic timings.

Hmm, good stuff Termie, and great questions. Anyone have a clue? I'm away from my desktop for months at a time; will mess about when back.

I'm going to run two or three different benchmarks tonight if I get the chance (Batman: AC, GTA: IV, and Unigine Heaven 4.0) to see if I can pick up the same trend.

Just to be clear, I haven't seen this yet on my GTX670, so I'm not expecting others to necessarily find it on their cards. I just thought it was interesting, and may at least put people on notice to watch their memory overclock scaling carefully.
 

BoFox

Senior member
May 10, 2008
689
0
0
Excellent find, Termie! That's a hefty overclock on memory you got there - a 1GHz (effective) increase with further gains.

Error correcting usually comes in much later on, and would give diminishing returns, where everything just becomes slower and slower until the GPU can no longer handle the number of errors - usually resulting in a hard freeze/corruption. It seems that some driver revisions allow for more error correcting to take place before a "crash" occurs (allowing it to be clocked further).

It certainly does seem as if it's the timings that are automatically relaxed, to allow for further overclocking. Logically, it has to be, since you're able to overclock so much further than that. If it were error correction already kicking in, then the number of errors would have increased exponentially as you moved the clocks further up. There's just no way that performance could have continued to increase that way if errors caused such a slowdown to begin with, given the exponential increase in errors caused by the hardware limitation (memory clock/timings and voltage) as you pushed it that far.
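To make that argument concrete, here is a toy model in Python. It is only a sketch: the onset clock, the starting error rate, and the growth rate are all made-up assumptions, and real GPUs may handle memory errors quite differently. It treats useful throughput as the memory clock multiplied by the share of transfers that do not need a retry.

```python
import math

# Toy model only: every constant here is a made-up assumption, not a measurement.
ERROR_ONSET_MHZ = 1300      # hypothetical clock where errors first appear
ERROR_RATE_AT_ONSET = 0.05  # hypothetical fraction of transfers retried at onset
GROWTH_PER_MHZ = 0.02       # hypothetical exponential growth rate of the error rate


def error_rate(clock_mhz):
    """Fraction of transfers that must be retried at a given memory clock."""
    if clock_mhz < ERROR_ONSET_MHZ:
        return 0.0
    rate = ERROR_RATE_AT_ONSET * math.exp(
        GROWTH_PER_MHZ * (clock_mhz - ERROR_ONSET_MHZ)
    )
    return min(rate, 1.0)  # a rate above 100% makes no sense, so cap it


def useful_throughput(clock_mhz):
    """Clock scaled by the share of transfers that succeed (arbitrary units)."""
    return clock_mhz * (1.0 - error_rate(clock_mhz))


for clock in range(1200, 1451, 50):
    print(clock, round(useful_throughput(clock), 1))
```

With these assumed numbers, throughput drops at 1300 and keeps dropping at every step after that. It never recovers, which is not the shape seen in the benchmarks. Different constants change the picture: a much gentler growth rate would let throughput rise for a while after the onset, so the model illustrates the argument rather than proving it.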

Hope this helps, man. I remember X1950XTX with GDDR4 memory having slightly different timings than X1900XTX with GDDR3. Can't remember all other cards though, but I think I do remember one case where overclocking the memory automatically relaxed the timings. Anybody else remember an article about it? Just wondering.. maybe Google's my real good friend today...
 

KingFatty

Diamond Member
Dec 29, 2010
3,034
1
81
I'm curious whether it starts to decrease when you continue overclocking the RAM to higher frequencies. That's typically what you'd expect; maybe you still have room for higher performance and haven't found the ceiling yet?
 

chimaxi83

Diamond Member
May 18, 2003
5,457
63
101
If your card's monitoring chip supports it, you can monitor VRM temps with HWiNFO64.
 

Termie

Hey all - I added two more benchmarks, Batman:AC and Just Cause 2. They show exactly the same pattern. I also tested GTA:IV but quickly realized that it's a totally CPU-limited game engine, so the results were worthless.

I think I'm going to stop here for now - it seems as if it's just a quirk in the memory subsystem, and maybe not actual error correction, as perhaps explained below.

If your card's monitoring chip supports it, you can monitor VRM temps with HWiNFO64.

I'll give that a try, although I don't think overheating is causing this.

I'm curious whether it starts to decrease when you continue overclocking the RAM to higher frequencies. That's typically what you'd expect; maybe you still have room for higher performance and haven't found the ceiling yet?

Yeah, I haven't really found the ceiling yet, although I've definitely found a pattern.

Excellent find, Termie! That's a hefty overclock on memory you got there - a 1GHz (effective) increase with further gains.

Error correcting usually comes in much later on, and would give diminishing returns, where everything just becomes slower and slower until the GPU can no longer handle the number of errors - usually resulting in a hard freeze/corruption. It seems that some driver revisions allow for more error correcting to take place before a "crash" occurs (allowing it to be clocked further).

It certainly does seem as if it's the timings that are automatically relaxed, to allow for further overclocking. Logically, it has to be, since you're able to overclock so much further than that. If it were error correction already kicking in, then the number of errors would have increased exponentially as you moved the clocks further up. There's just no way that performance could have continued to increase that way if errors caused such a slowdown to begin with, given the exponential increase in errors caused by the hardware limitation (memory clock/timings and voltage) as you pushed it that far.

Hope this helps, man. I remember X1950XTX with GDDR4 memory having slightly different timings than X1900XTX with GDDR3. Can't remember all other cards though, but I think I do remember one case where overclocking the memory automatically relaxed the timings. Anybody else remember an article about it? Just wondering.. maybe Google's my real good friend today...

It would be pretty surprising that memory timings change, but it seems as if there's a precedent for it.
 

Rikard

I ran some tests using Unigine Valley 1.0. I kept the voltage at 0.95 V in Trixx (which reads as about 1.03 V in GPU-Z), and I kept the core clock at 1175 MHz and PowerTune at +20%. Since I did not see any artifacts, I took the memory as high as it would go, and it "crashed" at 1675 MHz.
[Images: graphs of score, average FPS, min FPS, and max FPS vs. memory clock]

Min and max are suffering too much from random fluctuations, but the score and average FPS reproduce the OP's dip at 1300 MHz. The region around 1500 MHz is very interesting: I have a peak at 1500, but it dips immediately at 1525 MHz. It does not really plateau until I get very close to the maximum frequency.
 

Termie

I ran some tests using Unigine Valley 1.0. I kept the voltage at 0.95 V in Trixx (which reads as about 1.03 V in GPU-Z), and I kept the core clock at 1175 MHz and PowerTune at +20%. Since I did not see any artifacts, I took the memory as high as it would go, and it "crashed" at 1675 MHz.


...

Min and max are suffering too much from random fluctuations, but the score and average FPS reproduce the OP's dip at 1300 MHz. The region around 1500 MHz is very interesting: I have a peak at 1500, but it dips immediately at 1525 MHz. It does not really plateau until I get very close to the maximum frequency.

Thank you for posting this. I really appreciate having more data to share with the overclocking community. I hope you don't mind that I've posted your Avg FPS graph to the OP, so people see it right away. By the way, I would just throw out the min and max data - I find it completely erratic. But as you note, the average data is quite repeatable, and in my opinion, reliable.

What were the increments you used to test the memory? The graph's x-axis increments make it hard to tell, but I assume it was every 50MHz from 1200MHz to 1650MHz.

Now, what do people think about this finding? Two separate testers, two different cards, numerous different benchmarks, all showing the same pattern. At 1300MHz, performance drops. That is just uncanny - and it occurs despite the different memory subsystem on the HD7870 and HD7950. It seems there is something hard-coded into the memory chips themselves.
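For anyone who wants to put the two memory subsystems on the same scale, here is a rough Python sketch that converts a memory clock into theoretical peak bandwidth. The bus widths (256-bit for the HD7870, 384-bit for the HD7950) come from the published card specs rather than from this thread, and the function name is just illustrative.

```python
GDDR5_TRANSFERS_PER_CLOCK = 4  # GDDR5 moves 4 bits per pin per reported memory clock


def bandwidth_gb_s(mem_clock_mhz, bus_width_bits):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    transfers_per_s = mem_clock_mhz * 1e6 * GDDR5_TRANSFERS_PER_CLOCK
    return transfers_per_s * bus_width_bits / 8 / 1e9


# Bus widths are assumptions taken from published specs, not from this thread.
CARDS = {"HD7870": 256, "HD7950": 384}

for card, bus_width in CARDS.items():
    for clock in (1200, 1300, 1450):
        print(f"{card} @ {clock}MHz: {bandwidth_gb_s(clock, bus_width):.1f} GB/s")
```

The same 1300MHz setting means very different absolute bandwidth on the two cards, which is part of why a dip at the same clock on both is surprising.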

By the way, I tried to push my memory to 1500MHz, and it crashed, so perhaps "error correction" kicks in around 1450 on my card, where there is a very slight plateau, and then just won't run when pushed higher.
 

Rikard

Thank you for posting this. I really appreciate having more data to share with the overclocking community. I hope you don't mind that I've posted your Avg FPS graph to the OP, so people see it right away. By the way, I would just throw out the min and max data - I find it completely erratic. But as you note, the average data is quite repeatable, and in my opinion, reliable.

What were the increments you used to test the memory? The graph's x-axis increments make it hard to tell, but I assume it was every 50MHz from 1200MHz to 1650MHz.
I do not mind at all! The reason why min and max are not very reliable is that it just needs one extreme case to influence the graph, while the average is made out of lots of points.
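A quick way to see this is the Python sketch below. The samples are hypothetical, not measured data, and averaging per-frame FPS values is a simplification of how a benchmark computes its average. The two runs differ in exactly one frame.

```python
from statistics import mean

# Hypothetical per-frame FPS samples for one benchmark run (not measured data).
steady_run = [40.0] * 999 + [20.0]  # one slow frame
hiccup_run = [40.0] * 999 + [10.0]  # same run, but the one slow frame is worse

for name, run in (("steady", steady_run), ("hiccup", hiccup_run)):
    print(f"{name}: min={min(run):.1f} avg={mean(run):.2f}")
```

One different frame cuts the minimum in half, while the average barely moves.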

These are the raw data:
Memory [MHz]   Score   Average FPS   Min FPS   Max FPS
1250           1627    38.9          19.3      74.7
1300           1617    38.7          18.1      74.0
1350           1646    39.3          19.6      76.0
1400           1659    39.7          19.8      76.9
1450           1679    40.1          18.5      78.7
1475           1686    40.3          19.9      77.2
1500           1703    40.7          20.0      80.4
1525           1687    40.3          20.0      78.4
1550           1695    40.5          20.4      77.9
1575           1704    40.7          20.3      78.8
1600           1711    40.9          20.3      78.3
1625           1725    41.2          20.8      80.4
1650           1724    41.2          20.4      80.0
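If anyone wants to check their own runs the same way, here is a minimal Python sketch that flags every step where the score fell relative to the previous memory clock. The scores are the ones listed above; the function names and the 5-point threshold for ignoring run-to-run noise are assumptions chosen for illustration.

```python
# Unigine Valley 1.0 results from the table above: (memory MHz, score).
RESULTS = [
    (1250, 1627), (1300, 1617), (1350, 1646), (1400, 1659), (1450, 1679),
    (1475, 1686), (1500, 1703), (1525, 1687), (1550, 1695), (1575, 1704),
    (1600, 1711), (1625, 1725), (1650, 1724),
]


def find_dips(results, min_drop=5):
    """Return (clock, drop) pairs where the score fell by at least min_drop
    points compared with the previous (lower) memory clock.
    min_drop is an arbitrary noise threshold, not a measured value."""
    dips = []
    for (_, prev_score), (clock, score) in zip(results, results[1:]):
        drop = prev_score - score
        if drop >= min_drop:
            dips.append((clock, drop))
    return dips


def gain_percent(first, last):
    """Percentage change from first to last."""
    return 100.0 * (last - first) / first


print(find_dips(RESULTS))
print(f"clock gain: {gain_percent(RESULTS[0][0], RESULTS[-1][0]):.1f}%")
print(f"score gain: {gain_percent(RESULTS[0][1], RESULTS[-1][1]):.1f}%")
```

On this data it flags 1300 and 1525 MHz and treats the 1-point drop at 1650 as noise. It also shows how weakly the score scales: a 32% memory overclock buys about a 6% higher score.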
 

arh2o

Member
Jan 22, 2005
54
0
66
Hmm, so I guess it looks like around 1375 is the sweet spot for a semi-conservative memory OC.
 

hyrule4927

Senior member
Feb 9, 2012
359
1
76
Very interesting. I've barely pushed my memory because of the performance drop around 1350, maybe I should see what happens when I try a higher speed.

Of course it is also possible that I just have badly overclocking memory. :p
 

Rhezuss

Diamond Member
Jan 31, 2006
4,118
34
91
Damn great post Termie! This is very useful information for OCers.
Thanks for taking the time to test this and post results for us!
 

EXCellR8

Diamond Member
Sep 1, 2010
4,005
848
136
Interesting finds indeed. I haven't tried overclocking my 7970 at all, but it's of reference design and I didn't feel that I needed a boost. This has me thinking, though: if I could find and get past a 'hole' with the memory, maybe I could squeeze some additional power out of it. I would also be interested in comparing results vs. a card with the GHz Edition BIOS.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Wasn't there discussion a long time ago about memory ratios?

If I recall it showed that performance dropped at certain points because the memory clock was out of sync with the GPU clock. It was found at the time that performance gains from memory overclocking went in steps - i.e. 1200-1250 would all show the same performance, and there would be a drop at 1250 - 1300. Then 1300 - 1350 would all be the same performance again.

Note those are just numbers I made up for illustration purposes.

If I recall correctly this was found in a discussion of Nvidia cards if somebody wants to try to find it.
 

Termie

Wasn't there discussion a long time ago about memory ratios?

If I recall it showed that performance dropped at certain points because the memory clock was out of sync with the GPU clock. It was found at the time that performance gains from memory overclocking went in steps - i.e. 1200-1250 would all show the same performance, and there would be a drop at 1250 - 1300. Then 1300 - 1350 would all be the same performance again.

Note those are just numbers I made up for illustration purposes.

If I recall correctly this was found in a discussion of Nvidia cards if somebody wants to try to find it.

Very interesting theory, but would that mean that the dip in memory performance would change with core clock? If so, we're not seeing that in our benches. The dip on my card is at 1300MHz VRAM, at both 1000MHz and 1150MHz core clocks. And Rikard has an 1175MHz core and finds the same dip...although he appears to have two dips, which I might also if I had more headroom to OC the memory.
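The check is simple arithmetic, sketched here in Python with the core clocks mentioned in this thread (the labels are just for readability). If the dip were tied to a memory-to-core ratio, the ratio at the dip should be about the same in every configuration.

```python
DIP_MEM_MHZ = 1300  # memory clock where the dip shows up in every run

# (label, core clock in MHz) for each configuration reported in this thread.
CONFIGS = [
    ("Termie, stock core", 1000),
    ("Termie, overclocked core", 1150),
    ("Rikard", 1175),
]

# Memory-to-core ratio at the dip for each configuration.
ratios = {label: DIP_MEM_MHZ / core for label, core in CONFIGS}

for label, ratio in ratios.items():
    print(f"{label}: memory/core = {ratio:.3f}")

print(f"spread: {max(ratios.values()) - min(ratios.values()):.3f}")
```

The ratios range from about 1.11 to 1.30 while the dip stays put at 1300MHz, which is hard to square with a pure ratio explanation, at least for these three configurations.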

Still, I am intrigued. I might try this on my GTX670 this weekend, since your theory would suggest that this is not limited to HD7000 cards or the specific memory chips used on them.
 

tigersty1e

you should also test frame latency (stuttering).

the avg fps may be going up, but at an expense in smoothness.
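Here is a small Python sketch of what that looks like in numbers. The two frame-time traces are hypothetical, not captured from any card, and the nearest-rank percentile is just one common way to summarize frame latency.

```python
def avg_fps(frame_times_ms):
    """Average FPS over a run: frames rendered divided by total seconds."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling division
    return ordered[rank - 1]


# Hypothetical frame times in milliseconds (not measured data).
smooth = [25.0] * 100                 # every frame takes 25 ms
stuttery = [20.0] * 95 + [120.0] * 5  # mostly faster, with 5 long frames

for name, trace in (("smooth", smooth), ("stuttery", stuttery)):
    print(
        f"{name}: avg {avg_fps(trace):.1f} FPS, "
        f"99th percentile frame time {percentile(trace, 99):.1f} ms"
    )
```

Both traces average the same FPS, but the 99th percentile frame time is far worse in the second one, so logging frame times alongside the average would show whether a higher memory clock costs smoothness.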