Intel Larrabee is capable of 2 TFLOPS


taltamir

Lifer
Mar 21, 2004
The target audience for that is scientific processing, real-life physics...
Almost every single one of those can be run on a practically infinite number of cores (as exemplified by distributed computing, which runs them on millions of separate PCs).

So it should rip Tesla a new one. Well, if it is released shortly, that is.
 

BenSkywalker

Diamond Member
Oct 9, 1999
The target audience for that is scientific processing, real-life physics...
Almost every single one of those can be run on a practically infinite number of cores (as exemplified by distributed computing, which runs them on millions of separate PCs).

First off, keep in mind that it is still half of the theoretical performance of Tesla's highest offering atm. As for the comparison to distributed computing: it is FAR easier to split a workload up to run on thousands of machines than it is to get thousands of threads running on one.

Based on the information we have seen on Intel's latest, it will need to complete 20 FLOPs per clock per core; that would mean they would need around 200 GiB/sec just to handle the data flow (Tesla has more than this) - that is one of the hurdles they need to face. Also, 20 FLOPs being retired per core per clock would indicate that the chip relies VERY heavily on SIMD - if the code does not lend itself to SIMD, performance could suffer a fairly catastrophic fall-off (which would not be uncommon in HPC applications). Would Intel spend billions developing a chip that has such obvious and huge pitfalls associated with it? They already have.
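For the curious, rough back-of-the-envelope math on those figures - a sketch in Python, using only the numbers floating around this thread (32 cores is the rumored count; none of this is confirmed Intel spec):

# Rough arithmetic behind the numbers above; all inputs come from this
# thread or are illustrative assumptions, not official Larrabee specs.
cores = 32                     # rumored core count
flops_per_clock_per_core = 20  # figure quoted above
peak_flops = 2.0e12            # the 2 TFLOPS headline number

# Clock speed implied by those three numbers together:
clock_hz = peak_flops / (cores * flops_per_clock_per_core)
print(f"Implied clock: {clock_hz / 1e9:.3f} GHz")       # 3.125 GHz

# How little memory traffic each FLOP can afford at 200 GiB/sec:
bandwidth = 200 * 2**30        # the 200 GiB/sec figure from above
print(f"Bytes per FLOP: {bandwidth / peak_flops:.3f}")  # ~0.107 bytes/FLOP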
 

Janooo

Golden Member
Aug 22, 2005
Originally posted by: BenSkywalker
The reason 2 TFLOPS is important is that it is almost double everything else on the market: the RV770 = 1.2 TFLOPS, and the GTX 280 is 0.9 TFLOPS.

RV770 is 0 TFLOPS in anything resembling 754R compliance; this may change with their next-generation part, though (pointing that out as it removes them from a LOT of viable HPC applications).

The problem with a 2 TFLOPS theoretical peak is: how are they going to extract that much parallelism from x86 code? They are moving away from OoO support with this architecture, so even the existing library of code that can operate in such a manner on x86 architectures is likely going to require a recompile at the very least (more than likely some hand-tweaking of the existing code first, as a minimum).

This has always been Intel's problem with this market, and this seems to do nothing to help that situation in the least.
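To make the SIMD point concrete, here is a toy contrast between element-at-a-time code and the same work pushed through a vector path (Python with numpy standing in for a wide vector unit - purely illustrative, not a Larrabee benchmark):

# The same reduction, scalar vs. vectorized. The throughput only shows
# up when the code is (re)written in a vector-friendly form, which is
# exactly the recompile/hand-tweak problem described above.
import time
import numpy as np

a = np.random.rand(2_000_000).astype(np.float32)

t0 = time.perf_counter()
total = 0.0
for x in a:                 # one element per step: SIMD-hostile
    total += x
t1 = time.perf_counter()

total_vec = float(a.sum())  # whole array through the vector path
t2 = time.perf_counter()

print(f"scalar loop: {t1 - t0:.3f}s   vectorized: {t2 - t1:.5f}s")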

When was 754R approved? Do you have some HPC application examples?
 

SunnyD

Belgian Waffler
Jan 2, 2001
www.neftastic.com
Originally posted by: Viditor
Originally posted by: AmberClad
Originally posted by: keysplayr2003
Situation? Ok, this is getting ridiculous. Something needs to be done here. You guys cannot keep doing this over and over again amidst every discussion we have.

OT: Intel and I go waaaaay back, and they're still my fav. Fearing that Intel would best Nvidia or ATI wasn't even crossing my mind. But you both certainly thought so without truly knowing.
I'll ask you to stop doing my thinking for me and just carry on the discussion normally, thanks in advance.
If I misunderstood your thoughts as far as Larrabee, then my apologies.

I've gone on record here at AT saying it'd be a "minor miracle" if Larrabee turned out to be more than epic fail, given Intel's track record. But unlike some people, I wouldn't mind if Intel pulled it off. ATI manages to get it right once in a while, but they've proven to be...unreliable...as far as providing Nvidia with close competition on a consistent basis.

So, alluding to what geoffry mentioned -- the more the merrier. Even if it sucks, maybe it'll be a good enough value at the low end to put some pressure on. It just seems like some people around here are hoping for them to fail. Sorry if I'm wrongly lumping you in with those people.


Again, there will be no low end, mid end or other end...

Larrabee is meant for HPC computers, to be used as a CTM GPU only...
So to answer the inevitable questions, it will never even be able to play Crysis, let alone at good frame rates...


Edit: Let me try and explain better...
It's like everyone is a hot car enthusiast and is very excited about Intel's new very powerful type of engine.
Unfortunately, what most don't get is that the name of this new engine is Saturn V, and it doesn't really work very well in a car...

Sure, except a Saturn V is much, much better optimized for things such as ray tracing rather than rasterizing...

Wait, Ray Tracing - isn't that the holy grail of high fidelity 3D graphics?

Wait again, Ray Tracing - isn't that what Intel is pushing and optimizing their architecture for currently? Real-time ray tracing?

Wait just one more time - Isn't ray tracing a graphics option?

Hmm... sorry Viditor, but while Larrabee may be a "HPC" part, it may just be "HPC" enough to push real-time ray tracing into the mainstream as a viable alternative to the current GPU. If we're seeing 8 cores push decent ray tracing frame rates at decent resolutions, what would 32 highly optimized cores do?
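Roughly speaking, core count maps so directly onto ray tracing because every pixel's primary ray is independent of its neighbors. A minimal sketch (the per-pixel function is a made-up stand-in, not real ray/scene intersection):

# The frame splits cleanly across however many cores you have, since no
# pixel needs another pixel's result. Swap 8 for 32 and nothing changes.
from multiprocessing import Pool

WIDTH, HEIGHT = 640, 480

def trace_pixel(x, y):
    return (x * 31 + y * 17) % 256  # placeholder for real shading work

def render_rows(rows):
    return [[trace_pixel(x, y) for x in range(WIDTH)] for y in rows]

if __name__ == "__main__":
    workers = 8  # or 32 - the partitioning is identical
    chunks = [range(i, HEIGHT, workers) for i in range(workers)]
    with Pool(workers) as pool:
        image_parts = pool.map(render_rows, chunks)  # rows, grouped by worker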

Couple that with its "HPC" happiness, and all of a sudden you have a part that can tackle ray tracing and push physics (keeping in mind Intel owns Havok) to a degree that NVIDIA could only dream of.

Finally, with that much muscle behind Larrabee, I'm sure it could "emulate" a decent enough GPU to perform well at rasterization and 3D triangle setup. It may not be the fastest, but an R700 is no slouch. One-generation-old performance on "old" technology while providing top-notch performance on new frontiers wouldn't be something I consider bad at all.

Besides, if you built a car around a Saturn V, it'd be one fucking fast car.

 

ajaidevsingh

Senior member
Mar 7, 2008
Originally posted by: BFG10K
FLOPs are all well and good, but the question is:

  • How fast does it run current and past games?
  • What AA/AF modes does it bring to the table, how's the IQ, and how fast do they run?
  • How's the driver compatibility?
In order for me to move to Intel they would have to offer me something substantially better than current vendors do, and a FLOPs figure doesn't tell us that.

One thing is for sure: more FLOPS does mean better ray tracing and radiosity, not to mention the sweet reflections!!
 

ajaidevsingh

Senior member
Mar 7, 2008
Originally posted by: SunnyD
*snip*

Wow, Intel is pushing for ray tracing, but like all techs, ray tracing is not perfect... Intel still needs to blend ray tracing and rasterizing....

I don't think Intel can beat ATI/Nvidia on rasterizing, but with 2 TFLOPS it can sure do something about ray tracing. Another thing: the R700 has been compared to this and termed equal in terms of TFLOPS, and there is also a catch, "microstutter"... Think about it: an R700 with 2 cores has a problem sharing a bus...

"Wait Larrabee is PCIE ohhh 32 CPU's shareing a 16X bus"

 

heyheybooboo

Diamond Member
Jun 29, 2007
Originally posted by: geoffry
Originally posted by: keysplayr2003

Who's talking about market share here? Straight-up performance is, I think, the center of discussion here. Intel could sell a billion Larrabees, but they could all suck just as badly as their IGPs.

While it may not be important to you for some reason, if they sell a decent amount of the initial cards, even if it isn't a monster, it would allow Intel to viably continue development of future cards and become even more competitive in the long term.

I would think you would be happier to see a third firm enter the discrete GPU market.

Intel's discrete video started (and ended) with the i740 AGP. The only thing it succeeded at was sucking. That doesn't mean Intel won't try to drive the market in some fashion, which I think is the basis for Larrabee anyway.
 

SunnyD

Belgian Waffler
Jan 2, 2001
www.neftastic.com
Originally posted by: ajaidevsingh
Wow, Intel is pushing for ray tracing, but like all techs, ray tracing is not perfect... Intel still needs to blend ray tracing and rasterizing....

I don't think Intel can beat ATI/Nvidia on rasterizing, but with 2 TFLOPS it can sure do something about ray tracing. Another thing: the R700 has been compared to this and termed equal in terms of TFLOPS, and there is also a catch, "microstutter"... Think about it: an R700 with 2 cores has a problem sharing a bus...

"Wait, Larrabee is PCIe... ohhh, 32 CPUs sharing an x16 bus."

Of course, there are solutions to that whole ordeal. Keep in mind, I am betting that the 4870X2 sharing a memory pool will alleviate a lot of problems. Instead of each core loading and fetching independently on a limited bus, which makes sense for a CPU, a GPU having a single bus-fetching mechanism to load the local memory, with the GPU cores each feeding locally, should alleviate a lot of pain.
 

SunnyD

Belgian Waffler
Jan 2, 2001
www.neftastic.com
Originally posted by: heyheybooboo
Intel's discrete video started (and ended) with the i740 AGP. The only thing it succeeded at was sucking. That doesn't mean Intel won't try to drive the market in some fashion, which I think is the basis for Larrabee anyway.

The i740 was a fine GPU in its day; however, it was introduced late and was quickly superseded by next-gen parts from ATI and NVIDIA and cheaper parts from S3 and Trident. At the time, Intel had nothing to gain, as mainstream parts (IGPs) were much more profitable and there were too many competitors in the marketplace back then (add in 3DLabs, Matrox, Cirrus Logic and all the other smaller players who now have virtually no presence outside niche markets).
 

taltamir

Lifer
Mar 21, 2004
Originally posted by: BenSkywalker
First off, keep in mind that it is still half of the theoretical performance of Tesla's highest offering atm. As for the comparison to distributed computing: it is FAR easier to split a workload up to run on thousands of machines than it is to get thousands of threads running on one.


This is twice the theoretical performance of one Tesla board; an $8000 Tesla system has 4 boards in it.
But why can't Intel fit 4 boards in one system as well? (Well, aside from power consumption, maybe.)
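Quick sanity check that the two posts actually agree once boards and systems are kept apart (numbers are the thread's, rounded - not official specs):

# "Twice one Tesla board" and "half of Tesla's highest offering" both
# hold if a board is ~1 TFLOPS and the $8000 system carries four.
tesla_board = 1.0               # TFLOPS per board (approximate)
tesla_system = 4 * tesla_board  # the 4-board system
larrabee = 2.0

print(larrabee / tesla_board)   # 2.0 -> twice one board
print(larrabee / tesla_system)  # 0.5 -> half the full system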
 

nismotigerwvu

Golden Member
May 13, 2004
Originally posted by: Quiksilver
-Entirely made of x86 Pentium P54C cores
-Has 32 processing cores

This is just silly. Why not use newer, more efficient processors, so you won't need so many cores to put out the same processing power, won't need 300W of power draw, and will likely run cooler as well.

I was thinking the same thing. If they were looking for a small, in-order x86 core, why not use Atom-derived technology instead of the original Pentium (which is what P54C refers to, correct?).
 

Fallen Kell

Diamond Member
Oct 9, 1999
Originally posted by: Intelman07
Originally posted by: RussianSensation
1. You can't measure performance through TFlops alone as GTX 280 is faster than HD4870.
2. Even if intel can bring you the fastest graphics card, their driver support has no proven track record whatsoever.

What does a TFLOP mean, then? If the HD 4870 has more, why shouldn't it be quicker? Is it drivers? Code tweaking?

Some Computer Science and Electrical and Computer Engineering 101 here...

TFlop = one trillion floating point operations per second.

Floating point operation = an arithmetic operation performed on floating point numbers

Floating point numbers = a number represented in floating point notation

Floating point notation = a radix numeration system in which the location of the decimal point is indicated by an exponent of the radix; in such a system, 0.0012 is represented as 0.12 × 10^-2, where -2 is the exponent.

Computer systems and processors represent all non-integer numbers in floating point notation. Sure, you can try to store 2/3 as an integer (whole number), but it wouldn't be correct, since 2/3 is not 1.

Also, we are usually talking about IEEE floating point, which is either 32-bit or 64-bit (short real float or long real float). In the IEEE 32-bit standard, the first bit is the sign bit (either 0 or 1, for positive or negative respectively), the next 8 bits are an unsigned integer storing the exponent with a bias of 127 (in other words, a true exponent of 3 is stored as 130, since 127 + 3 = 130), and the last 23 bits are the normalized mantissa (the leading 1 is implicit).

So, for instance, 5.375 is stored as 01000000101011000000000000000000 in IEEE 32-bit floating point format.
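You can check that bit pattern yourself - a one-liner sketch in Python:

# Pack 5.375 as an IEEE 754 single-precision float and print its bits.
import struct

bits = struct.unpack(">I", struct.pack(">f", 5.375))[0]
print(f"{bits:032b}")  # 01000000101011000000000000000000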
 

heyheybooboo

Diamond Member
Jun 29, 2007
Originally posted by: SunnyD
The i740 was a fine GPU in its day; however, it was introduced late and was quickly superseded by next-gen parts from ATI and NVIDIA and cheaper parts from S3 and Trident. At the time, Intel had nothing to gain, as mainstream parts (IGPs) were much more profitable and there were too many competitors in the marketplace back then (add in 3DLabs, Matrox, Cirrus Logic and all the other smaller players who now have virtually no presence outside niche markets).

I had a Diamond Stealth VLB card with half the RAM, and it kicked the Intel i740 AGP all over the playground - and stomped it for good measure. The Diamond InControl tools were sweet...
 

rgallant

Golden Member
Apr 14, 2007
A noob question, seeing how there are a lot of what-ifs in this thread:

- Nehalem running at 3.2 GHz, using 4-8 of its 8-16 cores to control the muscle of the Larrabee card (using it just as it would the system memory, maybe), over PCI-E 3.0 (x16 = 32 GB/s) - how would that compare to today's GPUs @ 700-800 MHz, as a total package?
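For reference, the lane math behind that x16 = 32 GB/s figure, assuming PCI-E 3.0 signaling rates (spec numbers, not anything measured):

# PCIe 3.0: 8 GT/s per lane with 128b/130b encoding, ~985 MB/s usable
# per lane per direction.
per_lane_gbs = 0.985
lanes = 16
per_direction = per_lane_gbs * lanes
print(f"{per_direction:.1f} GB/s each way, ~{2 * per_direction:.0f} GB/s aggregate")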
 

nismotigerwvu

Golden Member
May 13, 2004
rgallant,

You are making an apples-to-oranges comparison (I know you didn't mean to), which is pretty much impossible to give a definite answer for. Modern GPUs are very specific in what they are designed to do (far more generalized/capable than previous designs, but still purpose-built), and the Intel cores you are referring to are all general purpose. So you can throw clock rate out the window (even more so than in a pure clock-rate comparison between two GPUs or two CPUs). If you held a gun to my head, I would say the GPU. Try software rendering in any game and compare frame rates to anything using the GPU instead, and you will see why specialized hardware will always have an advantage.
 

bunnyfubbles

Lifer
Sep 3, 2001
Also, Nehalem is going to debut with 4 cores with HT (8 threads: 4 physical cores, 4 logical) and is still quite a ways away from 8 cores / 16 threads, at least outside of a multiprocessor system.
 

Phynaz

Lifer
Mar 13, 2006
Originally posted by: Viditor
Again, there will be no low end, mid end or other end...

Larrabee is meant for HPC computers, to be used as a CTM GPU only...
So to answer the inevitable questions, it will never even be able to play Crysis, let alone at good frame rates...

Edit: Let me try and explain better...
It's like everyone is a hot car enthusiast and is very excited about Intel's new very powerful type of engine.
Unfortunately, what most don't get is that the name of this new engine is Saturn V, and it doesn't really work very well in a car...

Discrete graphics cards using the Larrabee architecture will initially be aimed at the gaming market.

How's that foot taste?

 

Modelworks

Lifer
Feb 22, 2007
You left out that it is tile-based rendering, which for the workstation market absolutely sucks :(
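For anyone unfamiliar with the term: "tile based" means the screen is carved into small tiles and each triangle is binned to the tiles it touches before any pixel work starts. A minimal sketch of the binning step (the 64-pixel tile size is an assumption for illustration):

# Bin a triangle to screen tiles by its bounding box; real binners are
# smarter, but this is the core of the tile-based approach.
TILE = 64  # assumed tile edge, in pixels

def bin_triangle(tri, bins):
    xs = [v[0] for v in tri]
    ys = [v[1] for v in tri]
    for ty in range(min(ys) // TILE, max(ys) // TILE + 1):
        for tx in range(min(xs) // TILE, max(xs) // TILE + 1):
            bins.setdefault((tx, ty), []).append(tri)

bins = {}
bin_triangle([(10, 10), (200, 40), (90, 300)], bins)
print(sorted(bins))  # tiles this triangle contributes pixels to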
 

ajaidevsingh

Senior member
Mar 7, 2008
Hey, can't my 2x 4850s pump 2 TFLOPS? 2x 4850 ~ R700, so that means I bought the right cards for the future!!