What exactly is the relationship between a GPU's memory clock and memory bus width?

ZootAllures91

Junior Member
Dec 7, 2013
10
0
66
For example, the GTX 780 has a 6000 MHz memory clock and a 384-bit memory bus. The GTX 980 has a 7000 MHz memory clock, but only a 256-bit memory bus. By all accounts, the GTX 980 performs objectively better (as would be expected for a newer card). I'm just curious as to WHY exactly this is the case. What makes 6000 MHz and 384 bits faster than 7000 MHz and 256 bits?

Thanks!
 

el etro

Golden Member
Jul 21, 2013
1,584
14
81
6000 MHz GDDR5 on a 384-bit bus will always have more raw bandwidth than 7000 MHz GDDR5 on only a 256-bit bus (bandwidth is determined by the memory clock in MHz x the width of the memory bus in bits). But Maxwell makes better use of the bandwidth it has, and is faster regardless of the memory subsystem's performance. Memory performance helps GPU performance, but it isn't everything, and certain games make better use of the available bandwidth than others.
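As a quick sketch of that formula (assuming the "MHz" figures quoted in this thread are the effective GDDR5 data rates, as spec sheets usually list them), the raw bandwidth of the two cards works out like this in Python:

# Raw bandwidth = effective data rate (MT/s) x bus width (bits) / 8 bytes,
# using the spec-sheet numbers quoted in this thread.
def bandwidth_gb_s(effective_rate_mt_s, bus_width_bits):
    """Raw memory bandwidth in GB/s (1 GB/s = 1000 MB/s here)."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_rate_mt_s * bytes_per_transfer / 1000

print("GTX 780:", bandwidth_gb_s(6000, 384), "GB/s")  # 288.0 GB/s
print("GTX 980:", bandwidth_gb_s(7000, 256), "GB/s")  # 224.0 GB/s

So despite the lower memory clock, the 780's wider bus gives it more raw bandwidth on paper.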
 

sheh

Senior member
Jul 25, 2005
247
8
81
Usually you should just look at the bandwidth (speed x width) to quickly gauge which card is faster, especially within the same model or series. Between different models there can be more differences; for example, compression can be used to save bandwidth. But still, a card with 50 GB/s of bandwidth is generally unlikely to be faster than a card with 100 GB/s, unless they're many years apart.
 

rgallant

Golden Member
Apr 14, 2007
1,361
11
81
For example, the GTX 780 has a 6000 MHz memory clock and a 384-bit memory bus

but it can run at 7000+ MHz if you want to OC. I run mine at 7100 for benches.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
You can't cross-compare like that, since the GTX 980 is more efficient with its memory. Even though the GTX 780 has faster memory (more raw bandwidth), the GTX 980's GPU is much faster.

Also, memory is only one part of the performance picture. Memory speed by itself doesn't tell you much, and there is no golden rule.
 

SunburstLP

Member
Jun 15, 2014
86
20
81
I seem to recall that years back there was a site (was it B3D?) that, when benching cards, would look at core vs. memory speed via clocking, e.g. what happens to performance when I drop the core clock and OC the memory, versus dropping the memory and increasing the core? It's an interesting insight into bandwidth requirements and efficiency of design. Does anyone still do that?
 

amenx

Diamond Member
Dec 17, 2004
4,406
2,726
136
It's not just Maxwell utilizing memory bandwidth better on a narrower bus than Kepler. That's the way it's been with almost every new generation of cards, whether Nvidia or AMD: newer cards with narrower buses outperform older ones. It's just that Nvidia has been doing it a bit more with its upper-end cards than AMD lately.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
For example, the GTX 780 has a 6000 MHz memory clock and a 384-bit memory bus. The GTX 980 has a 7000 MHz memory clock, but only a 256-bit memory bus. By all accounts, the GTX 980 performs objectively better (as would be expected for a newer card). I'm just curious as to WHY exactly this is the case. What makes 6000 MHz and 384 bits faster than 7000 MHz and 256 bits?

Thanks!
For one thing, Geforces use compression for (A)RGB, and have improved upon it each generation. There's also the issue that there is limited speed for data transfers within the GPU. There are many small buses and links, and any of them can be a bottleneck. Improving that, without making a hot monster like the GTX 480, is going to take some work.

On top of even that, like CPUs, GPUs have caches, and they get more effective, more efficient, and often larger, each generation. What hits a cache does not need to hit the DRAM, and what can be read from cache can be worked on quickly. Small cache performance improvements can yield large reductions of memory bandwidth needed. Maxwell adds a dedicated 'normal' L1 in front of the shared memory, for example, and quadrupled the L2s. In the case of GPUs, cache helps make SMT more efficient, and should allow better memory write coalescing.
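As a toy illustration of that last point (the hit rates below are made up, purely to show the arithmetic):

# Toy model: every byte served from cache never touches DRAM, so DRAM traffic
# scales with the miss rate. The numbers are hypothetical.
total_traffic_gb = 100.0  # data the shaders want per unit of time (hypothetical)

for hit_rate in (0.50, 0.60, 0.70):
    dram_traffic = total_traffic_gb * (1 - hit_rate)
    print(f"hit rate {hit_rate:.0%} -> {dram_traffic:.0f} GB of DRAM traffic")

Going from a 50% to a 70% hit rate cuts the DRAM bandwidth needed by 40%, so a modest improvement in cache effectiveness buys an outsized reduction in memory traffic.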

If you trust nVidia's own marketing, they've reduced the bandwidth needed for the same pixel pushing from Kepler to Maxwell by about 25% on average, while generally having 75-80% of the bandwidth available. So, again, if the marketing numbers for bandwidth reduction can be trusted (I doubt it, but they probably aren't too far off), the high-end Maxwells have about the same effective bandwidth as their predecessors.
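To put rough numbers on that (a back-of-the-envelope sketch; the 25% saving is just the marketing figure cited above, not a measurement):

# If Maxwell needs ~25% less bandwidth for the same work, its raw bandwidth
# effectively stretches by a factor of 1 / (1 - 0.25).
raw_bw_gtx780 = 288.0   # GB/s (6000 MT/s x 384-bit)
raw_bw_gtx980 = 224.0   # GB/s (7000 MT/s x 256-bit)
claimed_savings = 0.25  # nVidia's claimed Kepler -> Maxwell reduction (assumed)

effective_bw_gtx980 = raw_bw_gtx980 / (1 - claimed_savings)
print(f"GTX 980: {raw_bw_gtx980:.0f} GB/s raw, ~{effective_bw_gtx980:.0f} GB/s effective")
print(f"GTX 780: {raw_bw_gtx780:.0f} GB/s raw")

That lands the 980 at roughly 299 GB/s effective against the 780's 288 GB/s raw, i.e. about a wash, which matches the conclusion above.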
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
For one thing, Geforces use compression for (A)RGB, and have improved upon it each generation. There's also the issue that there is limited speed for data transfers within the GPU. There are many small buses and links, and any of them can be a bottleneck. Improving that, without making a hot monster like the GTX 480, is going to take some work.

On top of even that, like CPUs, GPUs have caches, and they get more effective, more efficient, and often larger, each generation. What hits a cache does not need to hit the DRAM, and what can be read from cache can be worked on quickly. Small cache performance improvements can yield large reductions of memory bandwidth needed. Maxwell adds a dedicated 'normal' L1 in front of the shared memory, for example, and quadrupled the L2s. In the case of GPUs, cache helps make SMT more efficient, and should allow better memory write coalescing.

If you trust nVidia's own marketing, they've reduced the bandwidth needed for the same pixel pushing from Kepler to Maxwell by about 25% on average, while generally having 75-80% of the bandwidth available. So, again, if the marketing numbers for bandwidth reduction can be trusted (I doubt it, but they probably aren't too far off), the high-end Maxwells have about the same effective bandwidth as their predecessors.

In effect, Maxwell is almost twice as efficient as Kepler with regard to bandwidth: 336 GB/s vs. 224 GB/s for the 780 Ti vs. the 980, and the 980 is about 20% faster. The 850M with DDR3 screams along on just 32 GB/s of bandwidth compared to previous mobile cards (750M DDR3).
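A quick sanity check on "almost twice as efficient", taking the thread's figures at face value (the 20% performance gap is an assumption from the sentence above, not a benchmark):

# Performance per GB/s of bandwidth, 780 Ti vs. 980, using the figures in this thread.
bw_780ti, bw_980 = 336.0, 224.0    # GB/s
perf_780ti, perf_980 = 1.00, 1.20  # 980 assumed ~20% faster

eff_780ti = perf_780ti / bw_780ti
eff_980 = perf_980 / bw_980
print(f"The 980 does ~{eff_980 / eff_780ti:.2f}x the work per GB/s of the 780 Ti")

That comes out to about 1.8x, i.e. close to twice the work per unit of bandwidth.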

Another factor may be latency (more important for compute). Not sure how GDDR5 timings work.