Stumbled across this post in the GPU forum - thought I'd share and elicit some conversation. Make sure you follow the link and read the stuff there.
Originally posted by: Scali
Graphics and GPGPU aren't the same thing. nVidia's G80 pretty much rewrote the book on GPGPU by adding a large shared cache to its shader processors.
This has absolutely no use for graphics, because D3D and OpenGL are designed in a way that each vertex and each pixel is completely independent by definition, and there is no sharing of any data between shaders, ever.
However, when doing GPGPU tasks, you can use the shared memory to have multiple threads communicate with each other efficiently.
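To make the "threads communicating through shared memory" idea concrete, here is a rough sketch of a block-level sum. It is plain Python that simulates the threads of one block sequentially, not real GPU code, and the function name `block_sum` and the power-of-two block size are assumptions made for the illustration. Each simulated thread reads a partial result that a different thread wrote in the previous step, which is exactly the kind of sharing that independent pixel or vertex shaders never do.

```python
def block_sum(values):
    """Simulate one thread block summing `values` through a shared buffer.

    Hypothetical helper for illustration only; a real GPU would run the
    inner loop's iterations in parallel, one per thread.
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "block size must be a power of two"

    # Each simulated thread copies its own input into the block's shared buffer.
    shared = list(values)

    # Tree reduction: at every step, the first `stride` threads each add in a
    # value that a different thread wrote during the previous step.
    stride = n // 2
    while stride > 0:
        for tid in range(stride):  # threads with tid < stride are active
            shared[tid] += shared[tid + stride]
        # On real hardware a block-wide barrier would be needed here
        # (for example __syncthreads() in CUDA) before the next step.
        stride //= 2

    # Thread 0's slot now holds the sum for the whole block.
    return shared[0]


if __name__ == "__main__":
    print(block_sum([1, 2, 3, 4, 5, 6, 7, 8]))  # 36
```

Without a shared buffer, each thread would have to push its partial sum out to slow global memory and read the others' back in, or recompute what its neighbours already know.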
Prior to the 4000-series, ATi GPUs had no shared memory at all. They added it in the 4000-series, but the size is rather limited (boils down to about 128 bytes per thread, compared to nVidia's 512 bytes), as is the bandwidth (about 544 GB/s on RV790, compared to 1,417 GB/s on GT200b).
Then I believe there is another limitation in ATi's design... namely that only one thread in every block can write to the shared memory, while the others have read-only access.
All this combined means that ATi cards do have some limitations in GPGPU compared to nVidia. This is also apparent in Folding@home.
Read this thread, for example:
http://foldingforum.org/viewto...p?f=51&t=10442&start=0
It includes comments from people like mhouston, who work for AMD on the Folding@home client. Basically they're saying that they calculate certain values multiple times because on ATi hardware this is faster than using the shared memory (LDS, the Local Data Share).