OILFIELDTRASH
Wait, so BSN is only just now getting around to reporting the stuff you already posted about weeks ago!?
I assumed it was additional public benchmarking, not a rehash of the same old benchmark run.
My bad.
Do you know if this 48-core "Single-chip Cloud Computer" (SCC) is a Larrabee derivative or a wholly different design/architecture?
http://news.cnet.com/8301-1001_3-10407818-92.html
I hope I'm not dumbing it down too much (simply due to my lack of experience and knowledge in this field), but if Larrabee is that far behind in ray-tracing capability, is Intel just trying to sell mini-supercomputers (that hopefully scale very well)? What exactly would be the benefit to consumers? Encoding/compiling boosts? Or are they trying to go for broke and capture the scientific sector as well (are there any reports of double-precision performance)?
The numbers we have been hearing for GT300/Fermi for the past few months were 768 GigaFLOPS in double-precision floating-point and 1.5 TeraFLOPS in single-precision floating-point.
What Nvidia actually said in its presentation is 520 to 630 GigaFLOPS in double-precision floating-point. A quick trip to the calculator says 630/768 ≈ 0.82, or nearly a 20% clock miss.
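If anyone wants to redo the arithmetic, here it is as a couple of lines of Python; the 768 and 630 figures are just the rumored and presented numbers quoted above, nothing official:

```python
# Shortfall between the long-rumored GT300/Fermi DP figure and what
# Nvidia actually presented (numbers from the posts above, not a spec).
rumored_dp_gflops = 768      # double-precision figure rumored for months
presented_dp_gflops = 630    # top of the 520-630 range from the presentation

ratio = presented_dp_gflops / rumored_dp_gflops
print(f"{ratio:.2f} of the rumored figure, ~{(1 - ratio) * 100:.0f}% short")
# -> 0.82 of the rumored figure, ~18% short
```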
Larrabee should be a beast for GPGPU tasks, not so much on the graphics front. I think Intel was planning Larrabee as a defensive measure against nV/ATi moving into that sector; having a GPU market to offset the R&D costs for such a part makes for a very interesting market. The issue I'm seeing with their hardware at this point is that they don't have remotely close to enough power to be competitive with ATi's or nV's much older hardware. Their entire chip offers performance comparable to just the shader hardware on the other parts, and that ignores the fact that every bit of Larrabee's performance sits in one giant pool that must also handle all of the basic ops required for rasterization. Worse, due to the nature of the chip, even those simple operations become considerably more complex, because an emulation layer is needed to make the part act as a rasterizer at all.
The way it is looking at this point, Larrabee will be comparable to Fermi in the GPGPU space and to a ~9400GT in the graphics space, if they are lucky.
That's not better than Larrabee if the #1 quote is true, and slower if #2 is true. But since they are massively delayed, and the rule of delay (call it IntelUser's rule) says it'll end up slower, #2 is more likely. It's not just Charlie's numbers; I have seen that figure quoted somewhere else too.
Games are a different story, but we know nothing there.
Still though, if you can sell me an x86 ISA-compatible processor with compiler support that hits 1 TFLOPS for <$500, you will get my attention.
Due to the vector layout on Larrabee and the base P54C core it uses, a pixel block write (single-pixel writes are going to be far too expensive) is going to be a nigh-perfectly linear 1:1 takeaway from its raw FLOPS performance. If we task Larrabee with a paltry 163 MPixels, it is down to 837 GFLOPs. But those pixels also need texture samples taken and blended; using the simplest operation anyone is going to remotely consider, bilinear, you are looking at 5 ops per pixel. Oops: we don't have any ops left to handle geometry transformation, lighting, or any shader effects at all. That is the reality Larrabee is dealing with. 1 TFLOP of shader power on top of a decent rasterizer is quite decent. 1 TFLOP of total power available to a GPU emulator? Catastrophic.
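To put numbers on that, here is the back-of-the-envelope math as a few lines of Python. One caveat: the 1000 − 163 = 837 step above only balances if the 163 figure is read as GPixels/s at roughly one op per pixel written, so that's the reading the script assumes; every figure here is a thread assumption, not a measured Larrabee number.

```python
# Back-of-the-envelope version of the op-budget argument above.
# NOTE (assumption): the 1000 - 163 = 837 step only balances if the 163
# figure is read as 163 GPixels/s costing ~1 op per pixel written.
total_gflops = 1000.0        # the ~1 TFLOP pool the whole chip shares
gpixels_per_sec = 163.0      # fill-rate target (see note on units)
write_cost = 1               # ops per pixel just to write it out
bilinear_cost = 5            # ops per pixel for one bilinear texture sample

after_writes = total_gflops - gpixels_per_sec * write_cost
print(f"left after pixel writes: {after_writes:.0f} GFLOPs")    # -> 837

per_pixel = after_writes / gpixels_per_sec                      # ~5.13 ops/pixel
left = per_pixel - bilinear_cost
print(f"ops per pixel left after bilinear: {left:.2f}")         # -> ~0.13
```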
But isn't Larrabee a superscalar architecture?
Hey, you got your wish.
