
A little Haswell teaser

I mean, with AVX2 optimized software, I'm sure HSW will rock, just like SNB rips NHM to shreds with AVX optimized software, but I'd really like to see ISA-agnostic increases in performance.
You're looking at it all wrong. Why is the next generation of GPUs much faster than the previous one, even though that requires using many new features the games don't know about? That's because the driver takes care of abstracting these brand new features and using them optimally...

The same is happening with AVX2. The vast majority of applications won't use it directly. Instead, they'll use an SPMD framework, like OpenCL. And that's not all. Any programming language can use the SPMD programming model on loops with independent iterations! What this means is that it will no longer be required to finely tune every individual application for the new instruction set extensions. Static or dynamic compilation with auto-vectorization takes care of acting like a 'driver' so the application developer doesn't even have to know about AVX2 to benefit from it.

AVX2 is revolutionary because it finally provides vector equivalents of every scalar instruction.
 
AVX2 might offer double the theoretical throughput of SSE4, but if we're still stuck on low-end SSE, then what's the hope for AVX2 being leveraged in its first 1-2 years?
That's real easy: gain / cost ratio.

SSE4 only features a handful of instructions over SSE2. Taking advantage of them is hard, requiring manually written assembly code, and even in the best case the performance improvement is low. Plus it complicates QA and support. So obviously few developers have bothered using anything beyond SSE2.

Likewise, AVX1 is still severely handicapped. Even though it theoretically doubles floating-point performance, it lacks integer instructions and Intel didn't improve the cache bandwidth to sustain the higher throughput.

AVX2 fixes all of the shortcomings at once. It features 256-bit integer instructions, fused multiply-add, gather support, and vector-vector shift. And there's no doubt Haswell will have twice the cache bandwidth.

So the gains will be huge, while the cost for the developer will be low. Having vector equivalents of every scalar instruction means they can write scalar code and have it auto-vectorized by the compiler! Every major compiler is finalizing AVX2 support as we speak.

So developers are way more excited about AVX2 than about any previous ISA extension. It's going to be adopted much more quickly.
 
Yet it still requires recompiling, which developers have never been on time with. Sorry, man. You're putting wayyyyy too much faith in a mere instruction set. Regardless of its theoretical throughput it has to play by the same rules the others have played by, and that means years before we see anything come out of it. Gaming is JUST hitting SSE instruction sets yet you seem convinced that AVX2 will catch on like wildfire? Hand me some of what you're smoking. Unless AVX2 is featured in the next-gen consoles then don't even bother giving it to me, because I don't care and nor will the developers.

It's the same issue we face with threading. Unless it can be done passively, without requiring extra work, it's just not going to happen. Recompiling for new instruction sets is in the same boat. For AVX2 to catch on and be widely implemented shortly after its release would be the first time in the history of computing that an instruction set actually made an immediate difference.
 
Unless AVX2 is featured in the next-gen consoles then don't even bother giving it to me because I don't care and nor will the developers.

FMA support is already in the PS3 (Cell CPU). 🙂 The main problem with the ports to PC is that the PC never supported it before (other than GPUs).
 
If a sub-$400 CPU gives me performance that bests a 5 GHz Ivy (due to better scalability etc., since beyond a point the scaling from 4 to 5 GHz in current CPUs is pretty bad due to arch limitations) then I am more than happy. If it can really give that performance with nearly perfect scaling, that would be a true victory for Intel.

Like going from Pentium D to Core 2 Duo, something of that sort. Or even going from Core 2 Quad to Core i7, along those lines.

Something that major with just a 2.5 GHz CPU which also guarantees an OC past 3.2 GHz is going to be MAJOR, though I am just hoping and dreaming of course 😛
 
We are allowed to discuss hardware...even if under NDA.
But it's not like "haswell" has been kept away from the public:
http://www.youtube.com/watch?v=5pKleSdXHT4
It's really not hard to evaluate "haswell"...

And there will be no need for "In Before the Lock".
I didn't sign an NDA...did you?

technically, yes, I signed (for my company) an NDA with Intel for the information they share on the Intel Business Link (IBL)
derp on my part, not associating IBL with In Before the Lock...lol
 
Yet it still requires recompiling, which developers have never been on time with. Sorry, man. You're putting wayyyyy too much faith in a mere instruction set. Regardless of its theoretical throughput it has to play by the same rules the others have played by, and that means years before we see anything come out of it. Gaming is JUST hitting SSE instruction sets yet you seem convinced that AVX2 will catch on like wildfire?
Once again you're making the mistake of basing things on past experience with instruction set extensions. What you fail to see is that AVX2 is nothing like the previous ones. SSE only helped explicit vector math, while AVX2 is applicable to any code loop with independent iterations. Do you even grasp what this means? It's a monumental paradigm shift for CPUs.

If you still think the adoption of AVX2 will be slow, you're going to have to come up with a whole set of different arguments, because it just isn't comparable to anything we had before. Otherwise please tell me which extension offered an eightfold speedup of scalar code with no application developer effort whatsoever.

AVX2 builds on the same technology that makes GPUs so fast at throughput computing. So everything points to a really fast adoption. In fact, it already started a year ago: compilers are ready today, and we still have a year to go. So there's going to be plenty of AVX2 software on the day of Haswell's launch.
 
When BOINC projects recompile for AVX2 and a Haswell hyperthreaded dual core comes out on top of previous quads in throughput, then it will get my money.
 
When BOINC projects recompile for AVX2 and a Haswell hyperthreaded dual core comes out on top of previous quads in throughput, then it will get my money.

IMO it is quite possible that Haswell dual cores will beat first-gen i7s like the 920 and 860, at least in gaming and other stuff which isn't that multithreaded, and will probably come pretty close for medium-multithreaded stuff as well 😀
 