nVidia makes a good showing with the revised GF104 architecture. Where could they take it? Could they make a flagship-level product that's more powerful and more efficient than the GTX 480?
There are some very smart people here. It would be great to hear what nVidia's options may be. Is a speedy revision possible, how would they design it, what would the power profile look like, etc.? We know ATI is working on their next release. What's nVidia's next move? For reference, here are segments from AT's review. Thanks, AT, for the excellent review:
"On GF100, there were 4 GPCs each containing a Raster Engine and 4 SMs. In turn each SM contained 32 CUDA cores, 16 load/store units, 4 special function units, 4 texture units, 2 warp schedulers with 1 dispatch unit each, 1 Polymorph unit (containing NVIDIA’s tessellator) and then the L1 cache, registers, and other glue that brought an SM together."
"GF104 in turn contains 2 GPCs, which are effectively the same as a GF100 GPC. Each GPC contains 4 SMs and a Raster Engine. However when we get to GF104’s SMs, we find something that has all the same parts as a GF100 SM, but in much different numbers. NVIDIA beefed up the number of various execution units per SM. The 32 CUDA cores from GF100 are now 48 CUDA cores, while the number of SFUs went from 4 to 8 along with the texture units. As a result, per SM GF104 has more compute and more texturing power than a GF100 SM. This is how a “full” GF104 GPU has 384 CUDA cores even though it only has half the number of SMs as GF100."
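To put numbers on the quoted description, here's a quick back-of-the-envelope sketch in Python. The GF100 and GF104 per-SM figures come straight from the excerpts above. The `Chip` class and the "scaled-up GF104" entry are my own illustration: the latter just assumes nVidia kept GF104-style SMs and went back to 4 GPCs, which is speculation and not an announced part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Unit counts for a full (uncut) die, per the quoted review text."""
    name: str
    gpcs: int
    sms_per_gpc: int
    cores_per_sm: int
    sfus_per_sm: int
    tex_per_sm: int

    @property
    def sms(self) -> int:
        return self.gpcs * self.sms_per_gpc

    def totals(self) -> dict:
        # Chip-wide totals are just per-SM counts times the number of SMs.
        return {
            "sms": self.sms,
            "cuda_cores": self.sms * self.cores_per_sm,
            "sfus": self.sms * self.sfus_per_sm,
            "texture_units": self.sms * self.tex_per_sm,
        }


# Figures taken from the AT excerpts quoted above.
GF100 = Chip("GF100", gpcs=4, sms_per_gpc=4, cores_per_sm=32, sfus_per_sm=4, tex_per_sm=4)
GF104 = Chip("GF104", gpcs=2, sms_per_gpc=4, cores_per_sm=48, sfus_per_sm=8, tex_per_sm=8)

# HYPOTHETICAL: GF104-style SMs in a GF100-style 4-GPC layout. Pure speculation.
SCALED = Chip("scaled-up GF104 (hypothetical)", gpcs=4, sms_per_gpc=4,
              cores_per_sm=48, sfus_per_sm=8, tex_per_sm=8)

if __name__ == "__main__":
    for chip in (GF100, GF104, SCALED):
        print(chip.name, chip.totals())
```

The interesting bit is that a full GF104 matches a full GF100 on SFUs and texture units despite having half the SMs, and only trails on CUDA cores (384 vs. 512). Whether a 4-GPC version would fit in a sane die size and power budget is exactly the open question.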
