128bit cards:
HD7750 = 512 shaders (Cape Verde)
HD7770 = 640 shaders (Cape Verde)
HD7790 = 260X = 896 shaders (Bonaire; the plain 260 is cut down to 768)
256bit cards:
HD7850 = 265 = 1024 shaders (Pitcairn)
HD7870 = 270 & 270x = 1280 shaders (Pitcairn)
384bit cards:
HD7950 = 280 = 1792 shaders (Tahiti)
HD7970 = 280x = 2048 shaders (Tahiti)
512bit cards:
R290 = 2560 shaders (Hawaii)
R290x = 2816 shaders (Hawaii)
Only the Bonaire and Hawaii cores are GCN1.1 (if you're curious, read the 290x review from the front page); the rest are 1.0.
This is a little oversimplified, but:
A 256bit bus has twice the throughput of a 128bit bus (generally speaking) at the same clocks. Bus width is pretty well correlated with the number of shaders when comparing the same generation of cards from the same company. If you want a GPU that's twice as large, you'll need to double everything feeding it. Note how the 7770 is basically 1/2 an R270x both in shader count and in bus width, and the R270x is in turn half of an R290.
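To see how closely shader count tracks bus width, here is a quick sketch using only the numbers from the list at the top. The function name and the dictionary layout are just for illustration.

```python
# Shader counts and bus widths (in bits) taken from the card list above.
cards = {
    "HD7770": (640, 128),
    "R270X": (1280, 256),
    "R290": (2560, 512),
}


def shaders_per_bus_bit(shaders, bus_bits):
    """Number of shaders fed by each bit of memory bus width."""
    return shaders / bus_bits


ratios = {name: shaders_per_bus_bit(*spec) for name, spec in cards.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} shaders per bus bit")
```

If the ratio comes out the same for all three, the bus is being scaled in step with the core.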
With the 7790 vs 7770, both use a 128bit bus but the 7790 has 40% more shaders. AMD clocked the 7790's memory at 1500MHz (6000 effective) vs 1125MHz (4500 effective) on the 7770, giving a 33% increase in memory bandwidth to go with it. 6000MHz is getting close(r) to the maximum clocks you can get with GDDR5, so to move up to a larger core they needed to double the bus, which also let them drop clocks - the 7850 has a memory clock of 1200MHz (4800 effective).
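For anyone who wants to check the bandwidth figures, a rough sketch: peak bandwidth is the bus width in bytes times the effective transfer rate. This is the theoretical peak only, not what the memory controller actually delivers, and the function name is made up for the example.

```python
def bandwidth_gb_s(bus_bits, effective_mhz):
    """Theoretical peak memory bandwidth in GB/s.

    bus_bits / 8 is the number of bytes moved per transfer;
    effective_mhz is millions of transfers per second.
    """
    return bus_bits / 8 * effective_mhz / 1000


hd7770 = bandwidth_gb_s(128, 4500)
hd7790 = bandwidth_gb_s(128, 6000)
hd7850 = bandwidth_gb_s(256, 4800)

print(f"HD7770: {hd7770:.1f} GB/s")
print(f"HD7790: {hd7790:.1f} GB/s")
print(f"HD7850: {hd7850:.1f} GB/s")
print(f"7790 over 7770: {(hd7790 / hd7770 - 1) * 100:.0f}% more bandwidth")
```

Doubling the bus lets the 7850 run slower memory than the 7790 and still come out well ahead on raw bandwidth.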
My testing on my 7850 suggests that, in general, 1200MHz is plenty to feed the GPU. Increasing memory clocks by ~33% to 1600MHz gives an average framerate increase of less than 5%, while increasing the core clock from 860MHz to 1125MHz (~31%) gives a nearly proportional improvement in framerates. So even at 1200MHz there's bandwidth to spare for most use cases, which suggests that once you have enough memory bandwidth to feed the GPU, extra doesn't do much. Even at 1200MHz, the 7850 has around 40% more memory bandwidth per shader than the 7790 due to the larger bus.
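The overclock percentages and the per-shader comparison are easy to recompute. A minimal sketch, using only the clocks, bus widths and shader counts already given in this post (helper names are hypothetical, and the bandwidth is theoretical peak):

```python
def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100


def bandwidth_per_shader(bus_bits, effective_mhz, shaders):
    """Theoretical peak memory bandwidth per shader, in GB/s."""
    return bus_bits / 8 * effective_mhz / 1000 / shaders


mem_gain = pct_increase(1200, 1600)    # memory overclock on the 7850
core_gain = pct_increase(860, 1125)    # core overclock on the 7850

hd7850 = bandwidth_per_shader(256, 4800, 1024)
hd7790 = bandwidth_per_shader(128, 6000, 896)
advantage = pct_increase(hd7790, hd7850)

print(f"Memory clock increase: {mem_gain:.1f}%")
print(f"Core clock increase: {core_gain:.1f}%")
print(f"7850 bandwidth per shader vs 7790: +{advantage:.1f}%")
```

This only covers the clock and bandwidth arithmetic; the framerate results come from actual testing and can't be derived this way.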
You can't exactly compare memory buses between AMD and nVidia, because their GPU designs have different memory bandwidth requirements and each company's memory controllers extract a different amount of usable bandwidth from a given clock and bus width. It's often still fairly close - a GTX660Ti is competitive with a 7870 (R270) while only having a 192bit memory bus compared to the 7870's 256bit bus, but the GTX has higher memory clocks to compensate.