For one thing, GeForces compress (A)RGB color data, and that compression has improved with each generation. There's also the issue that data transfers within the GPU have limited speed: there are many small buses and links, and any of them can be a bottleneck. Improving that without making a hot monster like the GTX 480 is going to take some work.
On top of that, GPUs, like CPUs, have caches, and they get more effective, more efficient, and often larger each generation. What hits a cache does not need to hit the DRAM, and what can be read from cache can be worked on quickly. Small improvements in cache performance can yield large reductions in the memory bandwidth needed. Maxwell, for example, adds a dedicated 'normal' L1 in front of the shared memory and quadruples the L2 size. In the case of GPUs, cache helps make SMT more efficient, and should allow better memory write coalescing.
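To put rough numbers on the cache point: DRAM traffic scales with the miss rate, not the hit rate, so a small hit-rate gain can be a big traffic cut. Here is a quick Python sketch of that arithmetic. The 90% and 95% hit rates and the 1000 GB/s request rate are made-up values for illustration, not measurements from any GPU, and the model ignores writes, compression, and everything else.

```python
def dram_traffic(requested_gb_s, hit_rate):
    """Bandwidth that falls through to DRAM for a given cache hit rate in [0, 1]."""
    return requested_gb_s * (1.0 - hit_rate)

# Hypothetical numbers, for illustration only.
requested = 1000.0  # GB/s of memory requests issued by the shader cores
before = dram_traffic(requested, 0.90)
after = dram_traffic(requested, 0.95)

# Fraction of DRAM traffic removed by the hit-rate improvement.
reduction = 1.0 - after / before
print(f"DRAM traffic: {before:.0f} -> {after:.0f} GB/s ({reduction:.0%} less)")
```

In this toy model, a five-point gain in hit rate halves the traffic that reaches DRAM, because the miss rate drops from 10% to 5%.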
If you trust nVidia's own marketing, they've reduced the bandwidth needed for the same pixel pushing by about 25% on average from Kepler to Maxwell, while the Maxwell cards generally have 75-80% of their predecessors' raw bandwidth. So, again, if the marketing numbers for bandwidth reduction can be trusted (I doubt it, but they probably aren't too far off), the high-end Maxwells have about the same effective bandwidth as their predecessors.
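The arithmetic behind "about the same" can be checked in a few lines of Python. This is only a sketch: it takes the 25% figure at face value, and `effective_bandwidth` is a name I made up for the calculation, not anything from nVidia.

```python
def effective_bandwidth(raw_ratio, demand_reduction):
    """Raw bandwidth relative to the predecessor, scaled up by how much
    less traffic the same work is claimed to need."""
    return raw_ratio / (1.0 - demand_reduction)

demand_reduction = 0.25  # claimed average reduction, Kepler to Maxwell

# Maxwell raw bandwidth as a fraction of the Kepler predecessor's.
for raw_ratio in (0.75, 0.80):
    eff = effective_bandwidth(raw_ratio, demand_reduction)
    print(f"raw {raw_ratio:.0%} of Kepler -> effective {eff:.2f}x")
```

With 75% of the raw bandwidth the result is exactly 1.00x, and with 80% it is about 1.07x, so the claim holds only as well as the 25% figure does.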