WCCftech: Memory allocation problem with GTX 970 [UPDATE] PCPer: NVidia response

Status
Not open for further replies.

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
But given that VRAM usage DOES show the same results we're seeing...this makes me suspicious. Can anyone comment on what section of VRAM Windows would use first? Would it allocate off the bottom of the VRAM stack?
VRAM allocation at the application level is virtualized. The GPU drivers will give Windows (or any other application) its own memory space, and then allocate physical RAM based on their own algorithms. So Windows can (and does) end up anywhere.

The reason that the program in question always shows memory bandwidth falling near the end is because it's filling up its memory allocation chunk by chunk. It has to fill the 3GB+ before the physical VRAM is maxed out and spills over to system RAM.
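The chunk-by-chunk behaviour described above can be modelled in a few lines. This is only a rough sketch: the 128 MiB chunk size, the 3584 MiB of free VRAM, and the function name `classify_chunks` are illustrative assumptions, and the real driver places memory according to its own algorithms.

```python
def classify_chunks(request_mib, free_vram_mib, chunk_mib=128):
    """Label each chunk of a request as landing in VRAM or spilling to system RAM.

    Simplified model (assumption): chunks are placed in order, and a chunk
    spills to system RAM as soon as it no longer fits in the free VRAM.
    """
    placements = []
    remaining = free_vram_mib
    for start in range(0, request_mib, chunk_mib):
        size = min(chunk_mib, request_mib - start)
        if size <= remaining:
            placements.append((start, "vram"))
            remaining -= size
        else:
            placements.append((start, "system"))
    return placements


# Hypothetical case: a 4 GiB request with only 3.5 GiB of VRAM actually free.
layout = classify_chunks(request_mib=4096, free_vram_mib=3584)
spilled = [start for start, where in layout if where == "system"]
print(len(layout), "chunks,", len(spilled), "spilled, first at", spilled[0], "MiB")
```

In this model only the last chunks land in system RAM, which is why the slowdown would show up at the end of a run rather than at the start.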
 

jj109

Senior member
Dec 17, 2013
391
59
91
This benchmark is adding ten ones to a large contiguous array of 4x1 floating point vectors which are initialized to 0. This benchmark is SMM bottlenecked until the GTX 970 hits the upper quarter of its address space.
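As a rough illustration of the operation described above, here is a CPU-side model in plain Python. It is not the benchmark's CUDA kernel; the function name and the assumption that each 16-byte vector is read once and written once are mine.

```python
def run_benchmark_model(num_vectors, adds_per_element=10):
    """Add 1.0 ten times to every lane of zero-initialised 4-float vectors."""
    data = [[0.0, 0.0, 0.0, 0.0] for _ in range(num_vectors)]
    for vec in data:
        for _ in range(adds_per_element):
            for lane in range(4):
                vec[lane] += 1.0

    # One floating-point add per lane per pass.
    flops = num_vectors * 4 * adds_per_element
    # Assumption: each vector (4 floats x 4 bytes) is read once and written once.
    bytes_moved = num_vectors * 4 * 4 * 2
    return data, flops, bytes_moved


data, flops, bytes_moved = run_benchmark_model(num_vectors=8)
print(data[0], flops, bytes_moved)
```

Under those assumptions the work amounts to 40 adds per 32 bytes moved, so a figure derived from elapsed time can be read either as a byte rate or as an operation rate.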

The GB/s designation from the benchmark should really be GFLOPS. The GTX 970 gets ~3.75 GFLOPS until the upper quarter, and the GTX 980 gets ~4.8 GFLOPS. This is close enough to the theoretical maximum that we can conclude that it's shader bottlenecked.

The best way to figure out what's going on is to look at a CUDA profiler to see why the addition is taking so much more time for the upper GB of the address space. I also don't have a GTX 970, so if any of you wants to install the CUDA toolkit and run a profiler on the benchmark, that'd be great. Either way I think we'll get an answer soon since some poor CUDA dev at Nvidia is probably working weekends because of this :hmm:
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
A) People don't know what this benchmark does. Does it test the VRAM like a game would? Can it overload the bus? Why is memory bandwidth lower than what the cards (not just the 970) should have in the early stages of the test? Why does the L2 cache suffer as well, when it's not part of the VRAM and memory bus at all and sits isolated on the die? Ton of unknowns here.

B) No reviewers found anything wrong with the GTX 970 when testing the card at various resolutions that would easily go past 3GB usage. We are talking many, many reviews. Why didn't they notice anything? They undoubtedly would have if the bandwidth went down to 22GB/s.

C) No users experienced any problems gaming with the GTX 970 until this rumor started and people started LOOKING for it, with a benchmark they don't know anything about.

D) GTX 770, GTX Titan, GTX 980, GTX 970, many cards, not just GTX 970, get bad "results" with this benchmark. Which means many Nvidia cards are broken?
Right....
 

RampantAndroid

Diamond Member
Jun 27, 2004
6,591
3
81
VRAM allocation at the application level is virtualized. The GPU drivers will give Windows (or any other application) its own memory space, and then allocate physical RAM based on their own algorithms. So Windows can (and does) end up anywhere.

The reason that the program in question always shows memory bandwidth falling near the end is because it's filling up its memory allocation chunk by chunk. It has to fill the 3GB+ before the physical VRAM is maxed out and spills over to system RAM.

So are you suggesting the slowdown is due to system memory being allocated as virtual VRAM (and hence the slowdown)? Is there a tool from Nvidia (or anyone) to view the current VRAM that has been allocated?
 

Abwx

Lifer
Apr 2, 2011
11,885
4,873
136
B) No reviewers found anything wrong with the GTX 970 when testing the card at various resolutions that would easily go past 3GB usage. We are talking many, many reviews. Why didn't they notice anything? They undoubtedly would have if the bandwidth went down to 22GB/s.

And yet they knew but kept silent while waiting for Nvidia's answer. Once someone posted about the issue at hardware.fr, the site's CPU reviewer stated that he knew about it and that some time ago he had asked their GPU reviewer to check it with Nvidia. So far they have had no answer...
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
And yet they knew but kept silent while waiting for Nvidia's answer. Once someone posted about the issue at hardware.fr, the site's CPU reviewer stated that he knew about it and that some time ago he had asked their GPU reviewer to check it with Nvidia. So far they have had no answer...

That would show up in their review regardless, but it didn't. Bandwidth down to 22GB/s would mean a massive drop in performance.
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
So are you suggesting the slowdown is due to system memory being allocated as virtual VRAM (and hence the slowdown)? Is there a tool from Nvidia (or anyone) to view the current VRAM that has been allocated?
Correct. If this tool is trying to allocate 4GB but only 3.5GB of physical VRAM is available for any given reason, then the last 512MB would need to spill out. It's clearly not blocked by physical VRAM size, as otherwise the program would hard fail on cards smaller than 4GB*.

* There are ways in CUDA to disallow a program from spilling over. NVIDIA's BandwidthTest sample does this, for example.
 

RaulF

Senior member
Jan 18, 2008
844
1
81
That would show up in their review regardless, but it didn't. Bandwidth down to 22GB/s would mean a massive drop in performance.

I would not bet on it.

You know there are plenty of reviews out there that will put a product (either game or hardware) on a pedestal. But when the end user gets their hands on it, it is far from what the review stated. And most of the time they are just testing FPS numbers and image quality. Very rarely will they cover instability.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Remember that on a normal GTX 980, running in normal mode, blocks 27-29 also "fail".

A fresh run here on GTX980 using 347.09 driver:
[Image: rec4.jpg]
 

caswow

Senior member
Sep 18, 2013
525
136
116
If my GPU says 4GB at 255GB/s, then I want 255GB/s all the way up to 4GB. Of course, things have to be proven right.
 

psolord

Platinum Member
Sep 16, 2009
2,142
1,265
136
Can I use this test on my 570s to see if it works correctly, if it hasn't been done already?

What must I download?
 

Deders

Platinum Member
Oct 14, 2012
2,401
1
91
Has anyone tried the 'net stop Themes' command to disable Aero completely? I did, and on my 780 it gave me a complete run at over 333 GB/s, whereas before the last 2 chunks were massively limited.

'net start Themes' will get Aero back. This frees up much more memory than simply disabling Glass.

You can either make a batch file for each command or run it from a command prompt.
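For anyone who would rather script the two commands than keep separate batch files, here is a small sketch in Python. The helper names are made up for illustration, and it assumes a Windows machine with the Themes service and a prompt with enough rights to stop services.

```python
import platform
import subprocess


def themes_command(action):
    """Build the 'net stop Themes' / 'net start Themes' command line."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    return ["net", action, "Themes"]


def set_themes_service(action):
    """Run the command; only meaningful on Windows (assumption: elevated prompt)."""
    if platform.system() != "Windows":
        raise RuntimeError("the Themes service only exists on Windows")
    return subprocess.run(themes_command(action), check=True)


# Typical use around a benchmark run:
#   set_themes_service("stop")   # disable Aero
#   ... run the benchmark ...
#   set_themes_service("start")  # bring Aero back
```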
 

cmdrdredd

Lifer
Dec 12, 2001
27,052
357
126
Not if you stick to the guidelines for reviewing the card ;)

Geez people, remove the tinfoil hat. If memory bandwidth were dropping like a rock during gaming, the fps results would clearly show a problem. Reviewers didn't hide the numbers like you're suggesting. Hell, one site did frame pacing tests (I forget which; I was browsing around) and noted that in SLI the 970 was actually smoother than other single cards at times. If there was a memory issue, that wouldn't happen either, because you'd have fps drops out the wazoo.
 

mikk

Diamond Member
May 15, 2012
4,308
2,395
136
Remember that on a normal GTX 980, running in normal mode, blocks 27-29 also "fail".

A fresh run here on GTX980 using 347.09 driver:


Your ignorance is annoying. If you don't get full speed over the entire VRAM on a GTX 980 then you tested it wrong. Use your iGPU, for goodness' sake, and stop your nonsense. The GTX 980 is not affected, my goodness...

Warning issued for personal attack.
-- stahlhart
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Your ignorance is annoying. If you don't get full speed over the entire VRAM on a GTX 980 then you tested it wrong. Use your iGPU, for goodness' sake, and stop your nonsense. The GTX 980 is not affected, my goodness...

Take your bad attitude somewhere else.

Even with the IGP I can still get up to 2 blocks "failed", depending on the test run.

You can try running Process Explorer and watching its GPU graph for the rec.exe process while the benchmark runs.
 

stahlhart

Super Moderator Graphics Cards
Dec 21, 2010
4,273
77
91
Keep the debate in this thread civil. Agree to disagree otherwise.
-- stahlhart
 

Spanners

Senior member
Mar 16, 2014
325
1
0
A) People don't know what this benchmark does. Does it test the VRAM like a game would? Can it overload the bus? Why is memory bandwidth lower than what the cards (not just the 970) should have in the early stages of the test? Why does the L2 cache suffer as well, when it's not part of the VRAM and memory bus at all and sits isolated on the die? Ton of unknowns here.

B) No reviewers found anything wrong with the GTX 970 when testing the card at various resolutions that would easily go past 3GB usage. We are talking many, many reviews. Why didn't they notice anything? They undoubtedly would have if the bandwidth went down to 22GB/s.

C) No users experienced any problems gaming with the GTX 970 until this rumor started and people started LOOKING for it, with a benchmark they don't know anything about.

D) GTX 770, GTX Titan, GTX 980, GTX 970, many cards, not just GTX 970, get bad "results" with this benchmark. Which means many Nvidia cards are broken?
Right....

A) The post above yours explains what the benchmark does pretty concisely. The source code is here.

B) The bandwidth drop-off is at around 3300MiB, which is roughly 3.5GB. Maybe the drivers are allocating RAM differently for the 970. It would explain why people were seeing different usage at the same settings/resolution vs the 980. Maybe it has few real-world consequences outside of some very limited scenarios.

C) I've seen a number of posts in the Nvidia forums stating that stuttering occurs at higher RAM usage. Can't say if these are accurate or related though.

D) People have been running this benchmark incorrectly; it's easy to create a false result. I've yet to see a single 970 that doesn't exhibit this behavior though. If the benchmark was that erratic, surely we'd see some results that were "good" for the 970? Nobody said the cards were broken.
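On the figure in point B, the MiB-to-GB conversion is easy to check. A quick sketch (the helper names are just for illustration):

```python
MIB = 1024 ** 2   # bytes in a mebibyte
GIB = 1024 ** 3   # bytes in a gibibyte
GB = 1000 ** 3    # bytes in a decimal gigabyte


def mib_to_gb(mib):
    """Convert MiB to decimal gigabytes."""
    return mib * MIB / GB


def mib_to_gib(mib):
    """Convert MiB to binary gibibytes."""
    return mib * MIB / GIB


dropoff_mib = 3300
print(f"{dropoff_mib} MiB = {mib_to_gb(dropoff_mib):.2f} GB = {mib_to_gib(dropoff_mib):.2f} GiB")
```

So 3300MiB is close to 3.5GB only in decimal units; in binary units it is nearer 3.2GiB, which matters when comparing against a "3.5GB" figure whose units are not stated.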

Lots of potential outcomes from the extremely mundane to some juicy drama.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
And it has nothing to do with the problem.

A GTX 980M can only process 48 pixels/clock, fewer than a GTX 970. However, in the CUDA benchmark all of its memory partitions are available at full speed.
 