GPUs Will Have 20 Teraflops of Performance by 2015 - Nvidia Chief Scientist.

Kakkoii

Senior member
Jun 5, 2009
379
0
0
http://www.xbitlabs.com/news/v...a_Chief_Scientist.html

I think this helps reinforce the rumor that Nvidia's GT300 series are going to use MIMD instead of SIMD. But as always, we'll have to wait and see.

William Dally, chief scientist of Nvidia Corp., expects graphics processors to become more general purpose and to reach rather extreme performance levels in the coming years. In addition, Mr. Dally emphasized the importance of further parallelism in graphics processors and implied that in the future graphics processing units should transition to a MIMD architecture.

The head of research at Nvidia said in an interview with the Hardware.fr web-site that graphics chips will reach 20 TFLOPS of performance in 2015, about twenty times higher than modern graphics processors such as the ATI Radeon HD 4890 or Nvidia GeForce GTX 285. Roughly speaking, this means graphics processors doubling their performance about every two years, a relatively modest pace considering that in recent years GPUs have doubled their floating-point performance every year.

In addition, Mr. Dally expects future GPUs to be more effective in terms of parallelism: they will be able to execute multiple instructions at once asynchronously, a feature that is not available on today's GPUs but is supported by Intel's forthcoming Larrabee. In fact, there have been reports that Nvidia's next-generation graphics chips (NV60, GT300, etc.) already feature a MIMD (Multiple Instruction stream, Multiple Data stream) architecture.

Unfortunately, Mr. Dally, who joined Nvidia in January '09, did not reveal much regarding Nvidia's future plans or products, but again emphasized the importance of the CUDA (Compute Unified Device Architecture) programming model and of general-purpose computing on graphics processing units (GPGPU).
 

Idontcare

Elite Member
Oct 10, 1999
21,118
57
91
Sounds great. Now if someone could just get cost-viable 3D hologram display technology ported to the consumer segment we could get a little closer to those VR holodecks we all know we want.
 

poohbear

Platinum Member
Mar 11, 2003
2,284
5
81
I love threads that speculate about the future. I hope my vid card cooks for me and gives me free massages by 2015, but I won't bet on it.

In all honesty, if development continues at the current pace, I doubt that'll be true. We've had refreshes of the 8800GT for the past 2 years, and performance has not doubled at the 8800GT price point (when they were released they were $250, and a $250 graphics card today isn't going to give you double the performance). As the article states, they used to double in performance every year.

Is MIMD a fancy way of saying GPUs will be multithreaded like current dual/quad core CPUs?
 

OCGuy

Lifer
Jul 12, 2000
27,227
36
91
Originally posted by: poohbear
I love threads that speculate about the future. I hope my vid card cooks for me and gives me free massages by 2015, but I won't bet on it.

In all honesty, if development continues at the current pace, I doubt that'll be true. We've had refreshes of the 8800GT for the past 2 years, and performance has not doubled at the 8800GT price point (when they were released they were $250, and a $250 graphics card today isn't going to give you double the performance). As the article states, they used to double in performance every year.

Is MIMD a fancy way of saying GPUs will be multithreaded like current dual/quad core CPUs?


Well, rumors have us going from ~1 TFLOP ----> ~2 TFLOPS for this next gen......
 

DaveSimmons

Elite Member
Aug 12, 2001
40,730
670
126
6 years = 3-4 refreshes = 8-16 times the transistor count (assuming each refresh roughly doubles it) = 8-16 times the raw power for video cards, since (unlike general CPUs) their tasks are massively parallel by nature.
 

imported_Shaq

Senior member
Sep 24, 2004
731
0
0
I don't know about this. It sounds like the 10 GHz goal that Intel had. Cut that number in half and you will be close to where it really will be... about 10 TFLOPS. This fall a single GPU will be just over 2 TFLOPS according to rumors, so I highly doubt there will be a 1000% increase in the next 6 years. Unless they are counting a GX2 as a single GPU, in which case we will be close to 4 TFLOPS next year. However, I would still bet against 20.
 

Kakkoii

Senior member
Jun 5, 2009
379
0
0
Originally posted by: poohbear

Is MIMD a fancy way of saying GPUs will be multithreaded like current dual/quad core CPUs?


http://en.wikipedia.org/wiki/SIMD

http://en.wikipedia.org/wiki/MIMD

As Dally says in the article: "execute multiple instructions at once asynchronously".

This is something that CPUs can do, but it's not exactly the same thing as multithreading. Threads and instructions are two different things, but I think multithreading enables multiple instructions to be in flight as well.
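
To put it in more concrete terms, here's a minimal CUDA sketch of how I understand the difference (the kernels and sizes are made up purely for illustration). In the SIMD-style case, every thread runs the same instruction stream on different data; launching two different kernels asynchronously on separate streams is closer to the MIMD idea of multiple independent instruction streams being in flight at once.

Code:
#include <cuda_runtime.h>

// SIMD/SIMT flavour: every thread executes the SAME instruction stream,
// just on a different element of the data.
__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// A second, unrelated instruction stream.
__global__ void offset(float* data, float bias, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += bias;
}

int main() {
    const int n = 1 << 20;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));
    cudaMemset(a, 0, n * sizeof(float));
    cudaMemset(b, 0, n * sizeof(float));

    // MIMD-ish flavour: two DIFFERENT instruction streams issued
    // asynchronously on independent data. Whether they actually run
    // concurrently depends on what the hardware supports.
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);
    scale<<<(n + 255) / 256, 256, 0, s1>>>(a, 2.0f, n);
    offset<<<(n + 255) / 256, 256, 0, s2>>>(b, 1.0f, n);
    cudaDeviceSynchronize();

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    return 0;
}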
 

Wreckage

Banned
Jul 1, 2005
5,529
0
0
I'm starting to wonder if the GT300 will be another shocker like the G80 was. It sounds like MIMD could be the next Unified Shader tech, although ATI has been all but silent on their plans.
 

dguy6789

Diamond Member
Dec 9, 2002
8,558
3
76
Originally posted by: poohbear
I love threads that speculate about the future. I hope my vid card cooks for me and gives me free massages by 2015, but I won't bet on it.

In all honesty, if development continues at the current pace, I doubt that'll be true. We've had refreshes of the 8800GT for the past 2 years, and performance has not doubled at the 8800GT price point (when they were released they were $250, and a $250 graphics card today isn't going to give you double the performance). As the article states, they used to double in performance every year.

Is MIMD a fancy way of saying GPUs will be multithreaded like current dual/quad core CPUs?

http://www.newegg.com/Product/...x?Item=N82E16814102809
 

Scali

Banned
Dec 3, 2004
2,495
0
0
Originally posted by: Wreckage
I'm starting to wonder if the GT300 will be another shocker like the G80 was. It sounds like MIMD could be the next Unified Shader tech, although ATI has been all but silent on their plans.

I wonder what their implementation will be like, exactly.
Currently, their GPUs already contain multiple of what they call "multiprocessors". Each multiprocessor can perform one instruction on 32 threads in parallel (SIMD); that group of 32 threads is what nVidia calls a "warp".

As far as I know, up to now all multiprocessors were executing the same threads (same code, different instances). This may have been a hardware limitation. Perhaps what they mean by MIMD is that now each multiprocessor can run its own thread, so you can combine multiple workloads more easily.

I can't really imagine them going completely in the other direction, where there's no SIMD at all and each multiprocessor can run 32 individual threads (each with different code). I think that would require an incredible amount of extra logic and complexity. Not even Intel chose that route with Larrabee: Larrabee still has 16-wide SIMD units, which have to provide most of the 'grunt', but each SIMD unit is part of an x86 core, which is similar to nVidia's multiprocessor concept.
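
Here's a tiny, purely illustrative CUDA kernel showing what that per-warp SIMD behaviour means in practice: because the 32 threads of a warp share one instruction stream, a branch that splits the warp forces both sides to be executed one after the other with inactive lanes masked off. A fully MIMD design could, in principle, let each thread follow its own path at full speed.

Code:
#include <cuda_runtime.h>

// Illustrative only -- not meant to represent any specific chip.
__global__ void divergent(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    // Threads 0..31 of a block form one warp and share one instruction
    // stream. This branch splits every warp down the middle, so the warp
    // runs BOTH sides serially, masking off the lanes that didn't take
    // the current side.
    if (threadIdx.x % 2 == 0) {
        data[i] = data[i] * 2.0f;   // even lanes
    } else {
        data[i] = data[i] + 1.0f;   // odd lanes
    }
}

int main() {
    const int n = 1024;
    float* d;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));
    divergent<<<(n + 255) / 256, 256>>>(d, n);
    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}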
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
MIMD could dethrone x86 easily... since multiple cores on MIMD scale 100% for a single thread without the need for programming-level parallelism... it is BOUND to happen, and the troubles with pushing process technology are a strong driving force (not to mention the patents, and Nvidia being cut out).

If Nvidia can make it, we could soon see a total blurring of the lines between CPU and GPU.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
Originally posted by: taltamir
MIMD could dethrone x86 easily... since multiple cores on MIMD scale 100% for a single thread without the need for programming-level parallelism...

I don't see what you mean there.
There's always programming-level parallelism. You have to design your code in a way that lets you spawn hundreds of threads, or there's no point in even bothering to write GPGPU code in the first place. For this very reason, a large number of computational problems are never going to be candidates for GPGPU.

A single thread on a GPGPU isn't very fast. A CPU is much faster at this sort of thing. I don't think this will ever change either... You can only get the level of single-threaded performance and flexibility of a CPU by making compromises to your parallelism.

What I DO see happening is the GPGPU replacing multicore x86 processors in many of the current multithreaded problems. In most cases, if you can speed up your problem by throwing more x86 cores at it, it can be sped up MORE by a GPGPU. The most obvious example is of course 3D rendering itself, which hasn't been done on CPUs in over a decade, at least in games and related environments.
Then we have things like Folding@home, video encoding, Photoshop filters and such, which GPGPU is moving in on. Traditionally such tasks benefited greatly from multicore processors or multi-processor systems. Now they're being outperformed by GPUs, and the gap will likely only grow over the years, as GPUs have consistently evolved at a much greater rate than CPUs. And judging by this statement from nVidia, the pace isn't going to drop anytime soon.
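
To make the "spawn hundreds of threads" point concrete, here's the standard minimal CUDA vector-add example (nothing GT300-specific about it). Each individual GPU thread does a trivial amount of work and would be hopeless on its own; all of the speedup comes from launching roughly a million of them at once, which is exactly the kind of decomposition many problems simply don't allow.

Code:
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Each thread adds exactly ONE pair of elements. A single GPU thread is
// far slower than a CPU core; the win comes from the sheer thread count.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                        // ~1M elements
    std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n);

    float *da, *db, *dc;
    cudaMalloc(&da, n * sizeof(float));
    cudaMalloc(&db, n * sizeof(float));
    cudaMalloc(&dc, n * sizeof(float));
    cudaMemcpy(da, ha.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // One thread per element: the problem has to be decomposed this way
    // up front -- that is the "programming-level parallelism".
    vecAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);

    cudaMemcpy(hc.data(), dc, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", hc[0]);                 // expect 3.0

    cudaFree(da); cudaFree(db); cudaFree(dc);
    return 0;
}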
 

Kakkoii

Senior member
Jun 5, 2009
379
0
0
Originally posted by: Scali
Originally posted by: taltamir
MIMD could dethrone x86 easily... since multiple cores on MIMD scale 100% for a single thread without the need for programming-level parallelism...

I don't see what you mean there.
There's always programming-level parallelism. You have to design your code in a way that lets you spawn hundreds of threads, or there's no point in even bothering to write GPGPU code in the first place. For this very reason, a large number of computational problems are never going to be candidates for GPGPU.

A single thread on a GPGPU isn't very fast. A CPU is much faster at this sort of thing. I don't think this will ever change either... You can only get the level of single-threaded performance and flexibility of a CPU by making compromises to your parallelism.

What I DO see happening is the GPGPU replacing multicore x86 processors in many of the current multithreaded problems. In most cases, if you can speed up your problem by throwing more x86 cores at it, it can be sped up MORE by a GPGPU. The most obvious example is of course 3D rendering itself, which hasn't been done on CPUs in over a decade, at least in games and related environments.
Then we have things like Folding@home, video encoding, Photoshop filters and such, which GPGPU is moving in on. Traditionally such tasks benefited greatly from multicore processors or multi-processor systems. Now they're being outperformed by GPUs, and the gap will likely only grow over the years, as GPUs have consistently evolved at a much greater rate than CPUs. And judging by this statement from nVidia, the pace isn't going to drop anytime soon.

I think you're confusing multithreading with instructions. MIMD doesn't need to be specifically programmed for, since all a MIMD configuration does is allow for more than 1 instruction at a time. You would automatically see a performance increase with existing games/code.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
Originally posted by: Kakkoii
I think you're confusing multithreading with instructions. MIMD doesn't need to be specifically programmed for, since all a MIMD configuration does is allow for more than 1 instruction at a time. You would automatically see a performance increase with existing games/code.

What do you mean by "more than 1 instruction at a time"? That could mean any number of things.
Are you perhaps confusing MIMD with what is known as a superscalar architecture, or maybe out-of-order execution? MIMD deals with thread-level parallelism, whereas superscalar/OoOE deals with instruction-level parallelism.

I stand by what I said, I don't see any confusion in my post. I do see confusion in yours, so I'll have to ask you to explain more clearly what you mean.
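
To spell out the distinction with a quick illustrative sketch (nothing here is specific to any real GPU): instruction-level parallelism is about independent operations inside one thread's instruction stream, which a superscalar/out-of-order core can overlap on its own; thread-level parallelism, which is what the SIMD/MIMD classification describes, is about how many independent instruction streams the hardware runs side by side.

Code:
#include <cuda_runtime.h>

// Instruction-level parallelism (superscalar / out-of-order territory):
// x and y below don't depend on each other, so one core can have both
// in flight at once WITHIN a single instruction stream.
__host__ __device__ float ilp_example(float a, float b, float c) {
    float x = a * b;   // independent of...
    float y = a + c;   // ...this, so the two can overlap
    return x - y;      // only this needs both results
}

// Thread-level parallelism (what SIMD vs. MIMD is about): many threads
// at once. On today's GPUs the 32 threads of a warp share one
// instruction stream (SIMD); MIMD would make those streams independent.
__global__ void tlp_example(const float* a, const float* b,
                            const float* c, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = ilp_example(a[i], b[i], c[i]);
}

int main() {
    const int n = 1 << 16;
    float *a, *b, *c, *out;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));
    cudaMalloc(&c, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemset(a, 0, n * sizeof(float));
    cudaMemset(b, 0, n * sizeof(float));
    cudaMemset(c, 0, n * sizeof(float));
    tlp_example<<<(n + 255) / 256, 256>>>(a, b, c, out, n);
    cudaDeviceSynchronize();
    cudaFree(a); cudaFree(b); cudaFree(c); cudaFree(out);
    return 0;
}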
 

scooterlibby

Senior member
Feb 28, 2009
752
0
0
Guys guys, let's not fight. MIMD relies on instruction sets that are massively parallel, yet infinitely unitary. Think of a MIMD as a random walk with drift and time trend model, the command function kaleidoscopes DIMM and SATA units for in-order reprieves, demanding superscalar multithreading on GPGPU + SOI processes. It's similar to the Hessian matrix, except you take the first third derivative to maximize core output. Now, I am an industry insider, and you may be thinking "This is all BS, what do SATA and DIMM have to do with anything ever?" Well, I assure you that my inside knowledge may render the explanation useless, but take comfort in it's indisputable truisms.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
57
91
Originally posted by: scooterlibby
Guys guys, let's not fight. MIMD relies on instruction sets that are massively parallel, yet infinitely unitary. Think of a MIMD as a random walk with drift and time trend model, the command function kaleidoscopes DIMM and SATA units for in-order reprieves, demanding superscalar multithreading on GPGPU + SOI processes. It's similar to the Hessian matrix, except you take the first third derivative to maximize core output. Now, I am an industry insider, and you may be thinking "This is all BS, what do SATA and DIMM have to do with anything ever?" Well, I assure you that my inside knowledge may render the explanation useless, but take comfort in it's indisputable truisms.

Based on insider info, I know this to be true firsthand. The value of the jacobian transformation cannot be understated when it comes to leveraging MIMD to optimal effect.
 

wrangler

Senior member
Nov 13, 1999
539
0
71
Originally posted by: Idontcare
Originally posted by: scooterlibby
Guys guys, let's not fight. MIMD relies on instruction sets that are massively parallel, yet infinitely unitary. Think of a MIMD as a random walk with drift and time trend model, the command function kaleidoscopes DIMM and SATA units for in-order reprieves, demanding superscalar multithreading on GPGPU + SOI processes. It's similar to the Hessian matrix, except you take the first third derivative to maximize core output. Now, I am an industry insider, and you may be thinking "This is all BS, what do SATA and DIMM have to do with anything ever?" Well, I assure you that my inside knowledge may render the explanation useless, but take comfort in it's indisputable truisms.

Based on insider info, I know this to be true firsthand. The value of the jacobian transformation cannot be understated when it comes to leveraging MIMD to optimal effect.

Haha



Funny stuff.
 

Kakkoii

Senior member
Jun 5, 2009
379
0
0
Originally posted by: Idontcare
Originally posted by: scooterlibby
Guys guys, let's not fight. MIMD relies on instruction sets that are massively parallel, yet infinitely unitary. Think of a MIMD as a random walk with drift and time trend model, the command function kaleidoscopes DIMM and SATA units for in-order reprieves, demanding superscalar multithreading on GPGPU + SOI processes. It's similar to the Hessian matrix, except you take the first third derivative to maximize core output. Now, I am an industry insider, and you may be thinking "This is all BS, what do SATA and DIMM have to do with anything ever?" Well, I assure you that my inside knowledge may render the explanation useless, but take comfort in it's indisputable truisms.

Based on insider info, I know this to be true firsthand. The value of the jacobian transformation cannot be understated when it comes to leveraging MIMD to optimal effect.

Think you could explain MIMD in a simplified/dumbed down way for the rest of us? :p
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
Originally posted by: Kakkoii
Originally posted by: Idontcare
Originally posted by: scooterlibby
Guys guys, let's not fight. MIMD relies on instruction sets that are massively parallel, yet infinitely unitary. Think of a MIMD as a random walk with drift and time trend model, the command function kaleidoscopes DIMM and SATA units for in-order reprieves, demanding superscalar multithreading on GPGPU + SOI processes. It's similar to the Hessian matrix, except you take the first third derivative to maximize core output. Now, I am an industry insider, and you may be thinking "This is all BS, what do SATA and DIMM have to do with anything ever?" Well, I assure you that my inside knowledge may render the explanation useless, but take comfort in it's indisputable truisms.

Based on insider info, I know this to be true firsthand. The value of the jacobian transformation cannot be understated when it comes to leveraging MIMD to optimal effect.

Think you could explain MIMD in a simplified/dumbed down way for the rest of us? :p

My guess is that MIMD applies multiple instructions simultaneously to multiple data packets. Like add, multiply, and move all at once.
 

Kakkoii

Senior member
Jun 5, 2009
379
0
0
Originally posted by: Fox5
Originally posted by: Kakkoii
Originally posted by: Idontcare
Originally posted by: scooterlibby
Guys guys, let's not fight. MIMD relies on instruction sets that are massively parallel, yet infinitely unitary. Think of a MIMD as a random walk with drift and time trend model, the command function kaleidoscopes DIMM and SATA units for in-order reprieves, demanding superscalar multithreading on GPGPU + SOI processes. It's similar to the Hessian matrix, except you take the first third derivative to maximize core output. Now, I am an industry insider, and you may be thinking "This is all BS, what do SATA and DIMM have to do with anything ever?" Well, I assure you that my inside knowledge may render the explanation useless, but take comfort in it's indisputable truisms.

Based on insider info, I know this to be true firsthand. The value of the jacobian transformation cannot be understated when it comes to leveraging MIMD to optimal effect.

Think you could explain MIMD in a simplified/dumbed down way for the rest of us? :p

My guess is that MIMD applies multiple instructions simultaneously to multiple data packets. Like add, multiply, and move all at once.

Yeah, I know that. That's pretty much what the abbreviation MIMD stands for (Multiple Instruction stream, Multiple Data stream).

I was just asking them to explain fully what they're talking about, in a dumbed-down way for the rest of the people in this thread, so it's easier to understand why this is such a big deal.
 

alcoholbob

Diamond Member
May 24, 2005
6,271
322
126
Considering it took Nvidia about 3 years to go from 500 GFLOPS to 900 GFLOPS, even assuming an exponential curve we'd be EXTREMELY generous to say 8 TFLOPS by 2015. In fact, if we stay on a linear path, 4 TFLOPS for a single GPU core is more likely.
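
As a back-of-the-envelope check, here's the compound-growth arithmetic using those 500 -> 900 GFLOPS-in-3-years figures (the numbers and dates are rough). Carried forward at the same rate, you land around 3 TFLOPS in 2015, nowhere near 20:

Code:
// Plain host code, no GPU needed -- just the extrapolation arithmetic.
#include <cstdio>
#include <cmath>

int main() {
    const double start_gflops = 500.0;  // roughly where we were ~3 years ago
    const double now_gflops   = 900.0;  // roughly where we are today
    const double years_so_far = 3.0;
    const double years_left   = 6.0;    // 2009 -> 2015

    // Annual growth rate implied by 500 -> 900 GFLOPS over 3 years.
    double rate = pow(now_gflops / start_gflops, 1.0 / years_so_far);

    // Extrapolate that same rate forward another 6 years.
    double projected = now_gflops * pow(rate, years_left);

    printf("implied growth: %.2fx per year\n", rate);            // ~1.22x
    printf("projected 2015: %.0f GFLOPS (~%.1f TFLOPS)\n",
           projected, projected / 1000.0);                       // ~2.9 TFLOPS
    return 0;
}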
 

Wreckage

Banned
Jul 1, 2005
5,529
0
0
If they pull it off, this could give them a huge increase in both gaming and GPGPU performance.
 

OCGuy

Lifer
Jul 12, 2000
27,227
36
91
Originally posted by: Astrallite
Considering it took Nvidia about 3 years to go from 500 GFLOPS to 900 GFLOPS, even assuming an exponential curve we'd be EXTREMELY generous to say 8 TFLOPS by 2015. In fact, if we stay on a linear path, 4 TFLOPS for a single GPU core is more likely.

Apparently you missed my first post.
 
