What are CUDA/stream processors?

Nukemann

Junior Member
Mar 29, 2014
Pardon my n00bish question, but what exactly is CUDA or stream processing? Doesn't it sound a bit similar to Intel's Hyper-Threading, where the CPU can use virtual cores? Does a GPU with CUDA/stream processors have hundreds or thousands of "cores" or processing units? Is each CUDA core on a GPU like a "lane" on an HT processor, which is supposed to make more use of the processor? The difference I see is that GPU stream processors use a proprietary API, like CUDA, but what's stopping Intel or AMD from making an x86 CPU that has hundreds of those "cores" and runs at several times the clock speed?
 
Feb 25, 2011
Yes, a GPU has hundreds or thousands of individual processing units, but they are much smaller and more limited than the general-purpose cores in a CPU.

Intel tried exactly that with something called the Xeon Phi (something like 60 or so cores on a card), but the Phi cores are also cut-down, limited-purpose cores.

You couldn't just bolt a hundred full-size Haswell cores together and run them at full speed, because they're big, complex cores, and they'd pull more power than your entire house.

Quality vs. Quantity.
 

suszterpatt

Senior member
Jun 17, 2005
A bit more technical explanation: stream processors are very good at embarrassingly parallel processing, where you have a large amount of data, and you need to perform the same operations on each element. However, as different cores start branching to different execution paths, the parallel performance quickly drops. In CUDA, all cores operate in exact lockstep, always receiving the same instruction as all the others. If you only need to perform an instruction on a subset of cores, the other cores will still spend time executing that instruction, and then throw away the result. This makes them essentially useless for multitasking.
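The cost of branching described above can be sketched with a toy model. This is plain Python, not real GPU code: `run_lockstep`, the masks, and the "issue slot" counter are all made up for illustration, and the model assumes a branch side is skipped only when no lane needs it.

```python
# Toy model of lockstep execution; not real GPU code.
# Each "lane" holds one data element. Every instruction is issued to all
# lanes, and a mask decides which lanes keep the result.

def run_lockstep(data):
    """Apply x*2 if x is even, else x+1. Returns (results, issue_slots)."""
    lanes = len(data)
    results = list(data)
    issue_slots = 0  # lane-cycles spent, including masked-off lanes

    even_mask = [x % 2 == 0 for x in data]
    odd_mask = [not m for m in even_mask]

    # Both sides of the branch are issued in turn; a side is skipped
    # entirely only when no lane needs it (an assumption of this model).
    for mask, op in ((even_mask, lambda x: x * 2), (odd_mask, lambda x: x + 1)):
        if not any(mask):
            continue
        for i in range(lanes):
            value = op(data[i])      # every lane does the work...
            if mask[i]:
                results[i] = value   # ...but only active lanes keep it
        issue_slots += lanes
    return results, issue_slots


uniform = [2, 4, 6, 8]      # all lanes take the same path
divergent = [1, 2, 3, 4]    # lanes split between both paths
uniform_out, uniform_cost = run_lockstep(uniform)
divergent_out, divergent_cost = run_lockstep(divergent)
print(uniform_out, uniform_cost)
print(divergent_out, divergent_cost)
```

With the same number of elements, the divergent input costs twice as many lane-cycles in this model, because half of the work on each side of the branch is thrown away.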
 

mfenn

Elite Member
Jan 17, 2010
www.mfenn.com
"In CUDA, all cores operate in exact lockstep, always receiving the same instruction as all the others."

A modern GPU isn't as bad as that when it comes to multiple instruction, multiple data (MIMD) work. The cores are organized into higher-level groups of a few dozen cores (exact counts depend on the specific architecture), and different groups can execute different instruction streams.
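To make the grouping point concrete, here is a rough sketch in plain Python (again a toy model, not GPU code). `GROUP_SIZE`, `group_cost`, and `total_cost` are hypothetical names, and the group size of 4 is chosen only to keep the example small. The model assumes that only lanes inside the same group have to share an instruction stream.

```python
# Toy model: lanes are split into groups, and each group follows its own
# instruction stream. GROUP_SIZE is a made-up number for illustration.
GROUP_SIZE = 4


def group_cost(values):
    """Lane-cycles for one group running: x*2 if x is even, else x+1."""
    paths = {v % 2 for v in values}   # distinct branch paths in this group
    return len(paths) * len(values)   # each path is issued to every lane


def total_cost(data):
    """Sum the cost of each group; groups do not slow each other down."""
    groups = [data[i:i + GROUP_SIZE] for i in range(0, len(data), GROUP_SIZE)]
    return sum(group_cost(g) for g in groups)


mixed = [1, 2, 3, 4, 5, 6, 7, 8]           # every group sees both paths
sorted_by_path = [1, 3, 5, 7, 2, 4, 6, 8]  # each group sees only one path
mixed_total = total_cost(mixed)
grouped_total = total_cost(sorted_by_path)
print(mixed_total, grouped_total)
```

In this model the same eight elements cost half as much once the data is arranged so that each group takes a single path, which is the sense in which branching only hurts inside a group.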

OP, in a way stream processors/CUDA cores are the opposite of Hyper-Threading. The purpose of HT is to take more advantage of your limited CPU cores by letting multiple different instruction streams (threads) share the same hardware, so one can use execution resources the other leaves idle. GPUs do the opposite: they execute a small number of instruction streams across a lot of different hardware.