Knight's Landing, Skylake to unify instruction sets?

NTMBK

Lifer
Nov 14, 2011
10,419
5,712
136
Knight's Landing is listed as having AVX3.1, Skylake as having AVX3.2. Does this mean we will see SSE & AVX support in the next Phi, and Phi instruction set support in Skylake?

Intel-Roadmap-Post-Haswell-Rumour.png
 

NTMBK

Lifer
Nov 14, 2011
10,419
5,712
136
Might just be the AVX instructions and not any legacy.

So long as it still has nice scatter/gather and vector masking, I don't mind. Full compatibility with their existing vector instructions for CPUs would be more important than compatibility with the single generation of Phis before it, in my eyes.
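
For anyone unfamiliar with what masked scatter/gather buys you, here is a minimal Python sketch of the semantics only. The function names `masked_gather` and `masked_scatter` are made up for illustration and do not correspond to any real intrinsic; real hardware works on fixed-width vector registers rather than Python lists.

```python
def masked_gather(memory, indices, mask, passthrough):
    """Lane i receives memory[indices[i]] if mask[i] is set, else keeps passthrough[i]."""
    return [memory[idx] if m else old
            for idx, m, old in zip(indices, mask, passthrough)]


def masked_scatter(memory, indices, mask, values):
    """Lane i stores values[i] to memory[indices[i]] only if mask[i] is set."""
    for idx, m, val in zip(indices, mask, values):
        if m:
            memory[idx] = val


memory = [10, 11, 12, 13, 14, 15, 16, 17]
mask = [True, False, True, True]

# Gather from arbitrary (even repeated) addresses; lane 1 is masked off.
gathered = masked_gather(memory, [7, 0, 3, 3], mask, passthrough=[0, 0, 0, 0])
print(gathered)  # [17, 0, 13, 13]

# Scatter to arbitrary addresses; lane 1 is masked off, so memory[2] is untouched.
masked_scatter(memory, [1, 2, 4, 5], mask, [100, 200, 300, 400])
print(memory)  # [10, 100, 12, 13, 300, 400, 16, 17]
```

The point of the mask is that a loop with a condition in it can still be vectorised: lanes that fail the condition simply do nothing.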
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
So long as it still has nice scatter/gather and vector masking, I don't mind. Full compatibility with their existing vector instructions for CPUs would be more important than compatibility with the single generation of Phis before it, in my eyes.

One could also conclude that, since both are moving to (optionally) a socket and DDR4 (no GDDR5?), compatibility will be very high on the list.
 

cytg111

Lifer
Mar 17, 2008
25,710
15,188
136
Might even get exciting again... and give Moore some much-needed CPR
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,691
136
AVX3.2 is in Skylake and the core will be able to do 16 FLOPs per core per cycle (2x Haswell). This is presumably with the FMA part of the ISA.
 

2is

Diamond Member
Apr 8, 2012
4,281
131
106
AVX3.2 is in Skylake and the core will be able to do 16 FLOPs per core per cycle (2x Haswell). This is presumably with the FMA part of the ISA.

*Not available on K series processors

(if Intel's current business model is any indication)
 

NTMBK

Lifer
Nov 14, 2011
10,419
5,712
136
AVX3.2 is in Skylake and the core will be able to do 16 FLOPs per core per cycle (2x Haswell). This is presumably with the FMA part of the ISA.

Double the vector width then, not just executing a 512-bit ISA on 256-bit vectors? Yeesh. I wonder how the core counts and clock speeds of the Phi and Skylake will match up. Haswell EP is meant to have up to 15 cores, and will likely be close to 3GHz (if previous Xeons are anything to go by), which is not that far off Xeon Phi territory. Although the Phi does have 4-way SMT on its side.
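
To put rough numbers on the width question, here is a back-of-the-envelope Python sketch of the usual peak-throughput arithmetic. The helper names and the FMA unit counts are assumptions for illustration, not confirmed specs for any of these chips; the 15 cores and 3GHz are just the guesses from this post.

```python
def peak_flops_per_cycle(vector_bits, element_bits, fma_units):
    """Peak FLOPs per core per cycle, counting each FMA as 2 FLOPs per lane."""
    lanes = vector_bits // element_bits
    return lanes * 2 * fma_units


def peak_gflops(cores, ghz, flops_per_cycle):
    """Theoretical peak for the whole chip, ignoring memory and turbo effects."""
    return cores * ghz * flops_per_cycle


# Double precision (64-bit elements); unit counts are assumed, not confirmed.
print(peak_flops_per_cycle(256, 64, 2))  # 16: 256-bit vectors, two FMA units
print(peak_flops_per_cycle(512, 64, 1))  # 16: 512-bit vectors, one FMA unit
print(peak_flops_per_cycle(512, 64, 2))  # 32: 512-bit vectors, two FMA units

# Hypothetical 15-core part at 3 GHz doing 16 FLOPs per cycle per core.
print(peak_gflops(15, 3.0, 16))  # 720.0
```

Which of those lines the rumoured "16 FLOPs" figure corresponds to depends on the precision and on how many FMA units are being counted, which the roadmap doesn't say.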
 

jpiniero

Lifer
Oct 1, 2010
16,527
7,032
136
Would it really be practical for Intel to, say, remove the MMX/SSE/AVX units from the x86 core and throw some Phi cores on the die instead?
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Would it really be practical for Intel to, say, remove the MMX/SSE/AVX units from the x86 core and throw some Phi cores on the die instead?

At that point, you may as well throw away the entire core and stick with 100% Xeon Phi.
 

NTMBK

Lifer
Nov 14, 2011
10,419
5,712
136
Would it really be practical for Intel to, say, remove the MMX/SSE/AVX units from the x86 core and throw some Phi cores on the die instead?

Oh good gracious no. The #1 selling point of x86 (at least on Windows) is backwards compatibility. If you can't run MMX, SSE or AVX, the vast majority of applications from the last 10 years flat out won't run on your core. (Not to mention, AMD64 mandates a minimum of SSE2.)
 

jpiniero

Lifer
Oct 1, 2010
16,527
7,032
136
Oh good gracious no. The #1 selling point of x86 (at least on Windows) is backwards compatibility. If you can't run MMX, SSE or AVX, the vast majority of applications from the last 10 years flat out won't run on your core. (Not to mention, AMD64 mandates a minimum of SSE2.)

Call it "I can't believe it's not x86" then. Include software emulation to provide compatibility. "Force" people to upgrade their software to take advantage, and of course they would then buy new hardware. Lock AMD out :twisted:
 

NTMBK

Lifer
Nov 14, 2011
10,419
5,712
136
Call it "I can't believe it's not x86" then. Include software emulation to provide compatibility. "Force" people to upgrade their software to take advantage, and of course they would then buy new hardware. Lock AMD out :twisted:

The only thing which is even starting to go that way is x87 :p People bought into SSE to improve performance in apps that needed it, if Intel started tanking it intentionally they'd get a bit annoyed.

However, implementing SSE on the lower 128 bits of a 512-bit pipeline can't be that tricky, surely (I say naively). They already do it for 128-on-256 with Haswell.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
The only thing which is even starting to go that way is x87 :p People bought into SSE to improve performance in apps that needed it, if Intel started tanking it intentionally they'd get a bit annoyed.

However, implementing SSE on the lower 128 bits of a 512-bit pipeline can't be that tricky, surely (I say naively). They already do it for 128-on-256 with Haswell.

You're mostly right. It's not hard or expensive but there are some stupid mechanisms you still have to deal with to keep it architecturally correct.

http://software.intel.com/en-us/articles/avoiding-avx-sse-transition-penalties/

128-bit Intel® AVX instructions operate on the lower 128 bits of the YMM registers and zero the upper 128 bits. However, legacy Intel® SSE instructions operate on the XMM registers and have no knowledge of the upper 128 bits of the YMM registers. Because of this, the hardware saves the contents of the upper 128 bits of the YMM registers when transitioning from 256-bit Intel® AVX to legacy Intel® SSE, and then restores these values when transitioning back from Intel® SSE to Intel® AVX (256-bit or 128-bit).
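
A toy Python model of the two behaviours described in that quote may make the difference clearer. This is only a sketch of the architectural result; the `Ymm` class and the two add functions are made up for illustration, and the hardware save/restore (and its cost) is not modelled at all.

```python
class Ymm:
    """A 256-bit register modelled as two 128-bit halves of four 32-bit lanes."""

    def __init__(self, low, high):
        self.low = list(low)
        self.high = list(high)


def legacy_sse_add(dst, src):
    """Legacy SSE: updates the low 128 bits, knows nothing about the upper half."""
    dst.low = [a + b for a, b in zip(dst.low, src.low)]


def vex128_add(dst, a, b):
    """128-bit AVX (VEX-encoded): updates the low 128 bits and zeroes the upper half."""
    dst.low = [x + y for x, y in zip(a.low, b.low)]
    dst.high = [0] * len(dst.high)


ymm0 = Ymm([1, 2, 3, 4], [5, 6, 7, 8])
ymm1 = Ymm([10, 20, 30, 40], [0, 0, 0, 0])

legacy_sse_add(ymm0, ymm1)
print(ymm0.low, ymm0.high)  # [11, 22, 33, 44] [5, 6, 7, 8]

vex128_add(ymm0, ymm0, ymm1)
print(ymm0.low, ymm0.high)  # [21, 42, 63, 84] [0, 0, 0, 0]
```

Because the legacy form has to leave the upper half intact, the hardware has to keep that state around somewhere, which is where the "stupid mechanisms" come in.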
 

Ayah

Platinum Member
Jan 1, 2006
2,512
1
81
I'm kinda curious how a socketed Knight's Landing would be priced.
 

NTMBK

Lifer
Nov 14, 2011
10,419
5,712
136
I'm kinda curious how a socketed Knight's Landing would be priced.

Hopefully not too much more than the regular socketed Xeon E5s. Not including the cost of GDDR5 and the PCB should help a lot.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
Hopefully not too much more than the regular socketed Xeon E5s. Not including the cost of GDDR5 and the PCB should help a lot.

Currently, the Xeon Phi with 8GB of GDDR5 has a lower list price than most top-end EP Xeons.
 

sushiwarrior

Senior member
Mar 17, 2010
738
0
71
I wouldn't expect Xeon Phi to have many instruction sets. Adding just a single set means you have 50+ cores all adding the hardware for that set, which makes any additional sets extremely expensive to implement. In addition, Phi is about large numbers of SIMPLE cores - don't expect full featured cores, expect more cut down cores.
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
Knight's Landing is listed as having AVX3.1, Skylake as having AVX3.2- does this mean we will see SSE & AVX support in the next Phi...

Not likely. Xeon Phi is targeted exclusively at the HPC market, and runs software by and for that market. So it doesn't have to be binary compatible with legacy CPU extensions.

You may not even want that. Xeon Phi is an in-order execution architecture with hundreds of threads, while desktop CPUs are out-of-order execution architectures with a modest number of threads. This requires a somewhat different programming approach. Code meant for one isn't going to run well on the other without at least recompiling. And if you have to recompile anyway, it might as well be binary incompatible to keep the hardware lean. Xeon Phi doesn't support unaligned vector operands, for starters. Adding support for that just to support smaller vectors makes very little sense.

It might just be a marketing decision to name them similarly. It stresses that CPUs can be equally useful for high throughput computing. It's just not their only focus, like it is with Xeon Phi.
...and Phi instruction set support in Skylake

That's a little more likely. AVX 3.2 suggests backward compatibility with Phi's AVX 3.1.

That would mean that AVX 3.2 is a significant departure from AVX2 and not just a widening of it. Phi has mask registers, for instance. It's arguable whether that's relevant. AVX was also a departure from SSE but the new encoding format supports all the old operations.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I wonder if Broadwell will support AVX 3.1, or if Skylake simply jumps in with both 3.1 and 3.2.
 

NTMBK

Lifer
Nov 14, 2011
10,419
5,712
136
I wonder if Broadwell will support AVX 3.1, or if Skylake simply jumps in with both 3.1 and 3.2.

Probably not. Phi is a very, very long way away from AVX2 (as BenchPress rightly points out), so adding support to match Phi would be pretty far outside the usual scope of just shrinking Haswell.