
WCCftech: Memory allocation problem with GTX 970 [UPDATE] PCPer: NVidia response

Status
Not open for further replies.
I haven't read this very closely, but if there's a benchmark with varying memory usage, then you should see about the same variance on the 970s and 980s. Is that happening? Does anyone see the same benchmark using between 3.5 and 4GB on a 970 and a 980 with the same settings across multiple runs?
 
I haven't read this very closely, but if there's a benchmark with varying memory usage, then you should see about the same variance on the 970s and 980s. Is that happening? Does anyone see the same benchmark using between 3.5 and 4GB on a 970 and a 980 with the same settings across multiple runs?

That's the thing. The claim is that the 970 isn't 256-bit and therefore can't use the full 4GB. They wanted to use a synthetic benchmark to prove it isn't 256-bit, yet ignored the fact that a 980 has higher performance at stock settings, which accounts for the performance differential. The benchmark also has a 6GHz memory speed limit in place. Then we have claims that games can't load to 4GB on the 970. Clearly they can, but it requires higher than 1080p for me to hit that mark. A user on a forum or comment section claiming their 980 is using 4GB at 1080p, when I need to go above 1440p to get that same usage, doesn't add up to me. Especially so when the supposed proof shows two differing scenes. It's not a benchmark that runs the exact same scene; it's a dynamic setting that often isn't even captured at the same spot.
 
970 is different from 980 in terms of memory bandwidth.

http://techreport.com/blog/27143/here-another-reason-the-geforce-gtx-970-is-slower-than-the-gtx-980

[Image: 3dm-color.gif, 3DMark color fill rate results]


Then, last week, an email hit my inbox from Damien Triolet at Hardware.fr, one of the best GPU reviewers in the business. He offered a clear and concise explanation for these results—and in the process, he politely pointed out why our numbers for GPU fill rates have essentially been wrong for a while. Damien graciously agreed to let me publish his explanation:

For a while, I've thought I should drop you an email about some pixel fillrate numbers you use in the peak rates tables for GPUs. Actually, most people got those numbers wrong, as Nvidia is not crystal clear about those kinds of details unless you ask very specifically.

The pixel fillrate can be linked to the number of ROPs for some GPUs, but it's been limited elsewhere for years on many Nvidia GPUs. Basically, there are 3 levels that can limit the peak fillrate:

The number of rasterizers
The number of SMs
The number of ROPs
On both Kepler and Maxwell, each SM appears to use a 128-bit datapath to transfer pixel color data to the ROPs. Those appear to be converted from FP32 to the actual pixel format before being transferred to the ROPs. With classic INT8 rendering (32-bit per pixel), that means each SM has a throughput of 4 pixels/clock. With HDR FP16 (64-bit per pixel), each SM has a throughput of 2 pixels/clock.

On Kepler each rasterizer can output up to 8 pixels/clock. With Maxwell, the rate goes up to 16 pixels/clock (at least with the currently released Maxwell GPUs).

So the actual pixels/cycle peak rate, when you look at all the limits (rasterizers/SMs/ROPs), would be:

GTX 750 : 16/16/16
GTX 750 Ti : 16/20/16
GTX 760 : 32/24/32 or 24/24/32 (as there are 2 die configuration options)
GTX 770 : 32/32/32
GTX 780 : 40/48/48 or 32/48/48 (as there are 2 die configuration options)
GTX 780 Ti : 40/60/48
GTX 970 : 64/52/64
GTX 980 : 64/64/64

Extra ROPs are still useful for better efficiency with MSAA and the like, but they don't contribute to the peak pixel fillrate.

That's in part what explains the significant fillrate delta between the GTX 980 and the GTX 970 (as you measured it in 3DMark Vantage). There is another reason, which seems to be that unevenly configured GPCs are less efficient at splitting huge triangles (as is usually the case in fillrate tests).

So the GTX 970's peak potential pixel fill rate isn't as high as the GTX 980's, in spite of the fact that they share the same ROP count, because the key limitation resides elsewhere. When Nvidia hobbles the GTX 970 by disabling SMs, the effective pixel fill rate suffers.

This is likely what is being seen.
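The three limits Damien describes can be sanity-checked with a small sketch. This is only an illustration under his stated assumptions (4 px/clock per SM for INT8, 16 px/clock per Maxwell rasterizer); the unit counts are the public specs for GM204:

```python
# Sketch of the per-card pixel fill limits quoted above (assumptions:
# 4 px/clk per SM for INT8 via the 128-bit SM datapath, and
# 16 px/clk per rasterizer on Maxwell). The minimum of the three governs.
def fill_limits(rasterizers, sms, rops, px_per_rasterizer=16):
    """Return (rasterizer, SM, ROP) pixels/clock limits."""
    return (rasterizers * px_per_rasterizer, sms * 4, rops)

gtx_980 = fill_limits(4, 16, 64)   # GM204, fully enabled
gtx_970 = fill_limits(4, 13, 64)   # GM204 with 3 SMs disabled

print(gtx_980, "->", min(gtx_980))  # (64, 64, 64) -> 64
print(gtx_970, "->", min(gtx_970))  # (64, 52, 64) -> 52
```

This reproduces the 64/52/64 vs 64/64/64 rows from Damien's table: both cards have 64 ROPs, but the 970's 13 enabled SMs cap it at 52 pixels/clock.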
 
Ok yes, that is obvious... performance is cut on purpose because it is a lower-cost part. However, overclocking can reduce the advantage the 980 has in real-world usage. Aside from that, this doesn't address the "GTX 970 cannot use all of its 4GB of memory" question.

My thing is that if this were a big problem, AnandTech, HardOCP, Guru3D, TechPowerUp, and users on this very forum would be making vocal complaints about games stuttering and hitching when they hit the VRAM wall. That wall would sit below 4GB if the 970 couldn't use the entirety of its available memory, and clearly that isn't the case.
 
Just out of curiosity, what is your VRAM usage at 1080p 8x MSAA and max details in Far Cry 4? It should match my experience with my 970 which places usage well below 4GB, but above 3GB.

3450MB, give or take. But some 300MB of that is Aero.
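One caveat worth making explicit: OSD tools report total VRAM in use, not just the game's share, so the desktop compositor's overhead has to be subtracted before comparing numbers across machines. Using the rough figures quoted above:

```python
# The OSD reports total VRAM in use, not just the game's allocation,
# so compositor (Aero) overhead should be subtracted before comparing.
reported_mb = 3450  # total usage quoted above, give or take
aero_mb = 300       # rough Aero overhead quoted above

game_mb = reported_mb - aero_mb
print(game_mb)  # ~3150 MB actually attributable to the game
```

That is one reason two screenshots from different PCs can show different "usage" for the same scene and settings.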
 
[Image: oUjAyS8.jpg, benchmark screenshot with OSD readout]


Quick note though: for this test my graphics memory clock only went up to 6000MHz, then bumped up to 7000MHz in the last couple of seconds.

You don't think underclocking memory is a flaw with the test?
It's probably running the 980/970 in the P02 clock state, which is 6GHz versus 7GHz for P00.

I believe the same thing happens during folding; memory runs at 6GHz.
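For reference, the raw bandwidth difference between those two clock states works out as follows. This is a back-of-the-envelope sketch assuming the advertised 256-bit bus, with "7k"/"6k" read as 7/6 Gbps effective per pin:

```python
# Back-of-envelope GDDR5 bandwidth: bus width / 8 gives bytes per
# transfer, multiplied by the effective per-pin transfer rate (Gbps).
def bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    return bus_width_bits / 8 * gbps_per_pin  # GB/s

print(bandwidth_gb_s(256, 7.0))  # 224.0 GB/s at the P00 (7k) state
print(bandwidth_gb_s(256, 6.0))  # 192.0 GB/s at the P02 (6k) state
```

So a benchmark stuck in P02 would be measuring roughly 14% less memory bandwidth than the card's rated peak on either card.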
 
The GTX 970 actually is slower than the 980 in memory bandwidth, but not because it lacks a 256-bit memory interface (that claim is pretty stupid); it's because of how it's designed.

So yeah, you are right that the GTX 970 is slower than a GTX 980 in the memory department at the same settings, resolution, and clocks. This isn't due to a sub-256-bit bus, but due to how the Maxwell architecture is designed.
 
The GTX 970 actually is slower than the 980 in memory bandwidth, but not because it lacks a 256-bit memory interface (that claim is pretty stupid); it's because of how it's designed.

So yeah, you are right that the GTX 970 is slower than a GTX 980 in the memory department at the same settings, resolution, and clocks. This isn't due to a sub-256-bit bus, but due to how the Maxwell architecture is designed.
Ok, but why does the GTX 970 use less memory in the same game scene with the same settings vs the GTX 980?
Look at Far Cry 4.
 
Ok, but why does the GTX 970 use less memory in the same game scene with the same settings vs the GTX 980?
Look at Far Cry 4.

Your link shows no numbers for the 970, if you are referring to post 60. cmdrdredd and I tested it with the same settings on a GTX 970 and a GTX 980: same memory usage.
 
Are you blind?
Look at the upper right corner:
Scene 1: GTX 980 3918MB, GTX 970 3517MB
Scene 2: GTX 980 4035MB, GTX 970 3488MB
 
Are you blind?
Look at the upper right corner.

Oh, the white text in the sky. Are you comparing apples and oranges? Seems it's a topic you are very dedicated to on several forums.

Everything on max, I get around 3450MB usage, and I have a GTX 980.
 
You didn't post a single piece of proof.
I posted a screenshot comparison of the same scene with the same settings.
End of discussion.
And of course it's version 1.6.
 
You didn't post a single piece of proof.
I posted a screenshot comparison of the same scene with the same settings.
End of discussion.
And of course it's version 1.6.
I don't buy what you're selling to begin with, but I also don't know how you're trying to convince us that these screenshots are using the same settings:

http://i.imgur.com/5Q1yW6k.jpg
http://i.imgur.com/QkdreNg.jpg

I've never played Watch Dogs, so it may be possible that they're using the same settings, but it's clear that the shadows in the environment are completely different, meaning it's not a direct comparison.
 
Is my GTX 980 broken too, according to you? Or are you trying to compare different PCs with different setups, Aero being one of the differences? And I assume all the tests you ran were done with the 1.6 patch?

If Aero were using the rest of the memory, the total would show on the OSD.

In my opinion, and maybe I'm wrong, if the game is not using the full 4GB, it's because it doesn't need it.

This should be tested at 4K, not 1080p.
 
If Aero were using the rest of the memory, the total would show on the OSD.

In my opinion, and maybe I'm wrong, if the game is not using the full 4GB, it's because it doesn't need it.

This should be tested at 4K, not 1080p.

You're right. If the game needs it, it loads it up. Someone is obviously running a card above 1080p and claiming otherwise, in my view. ShintaiDK and I both noted similar usage with our cards, and no site I found has reported wonky memory usage in the numerous GTX 970 reviews out there. If I go above 1440p, I can approach 4GB of memory usage. Other than that, the games simply don't need it, and someone is trying to pull one over on everyone to illustrate a problem that isn't there.
 