Mainstream Intel Core Processors will not support AVX 512 from Skylake Only Xeon

csbin · Mar 1, 2015

http://wccftech.com/mainstream-intel-core-processors-support-avx-512-skylake-xeon/

oh...my god

Arachnotronic · Mar 1, 2015

csbin said:
http://wccftech.com/mainstream-intel-core-processors-support-avx-512-skylake-xeon/

oh...my god

Very old news

Also, here's the actual source: http://www.bitsandchips.it/hardware...olo-per-gli-xeon-skylake-e-non-i-core-skylake

ShintaiDK · Mar 1, 2015

Indeed very old. Its actually known since july 2013. But we seem to have an influx of those. Specially if they can add some drama.

Wccftech is nothing but a clickbait site.

witeken · Mar 1, 2015

Yeah, but I wonder if or what will be different about SKL compared to HSW to be excited about.

Ofcourse when I say mainstream processors wont have AVX 512, what I really mean is that it will be disabled. They will only be enabled on the Skylake SKUs on the Xeon platform. So it looks like the new iteration of Intels offerings will not have any significant new instruction set, mostly all the old stuff. Yet we have word that it is going to be one hell of an architecture to look out for, because for the first time in many years, Intel is simply refusing to divulge the slightest (information even under NDA). This above top secret attitude seems out of place since the process was already introduced with Broadwell and is supposed to be just a Haswell-equivalent for Broadwell. Something that definitely appears to not be the case.
The level of secrecy Intel is maintaining makes it very clear that they are bringing something brand new with the Skylake uarch.

Ken g6 · Mar 1, 2015

If AVX 512 will only be on Xeon, my next question is whether there will by Skylake Xeons that will fit in standard consumer mobos that will support it? Like the Xeon E3 series?

mikk · Mar 1, 2015

Yes this is known since many many months. Nothing new there. The only shock is that wccftech didn't know it, it just proves that they are noobish.

Ken g6 · Mar 1, 2015

Google answered my question, including [thread=2405871]in these forums[/thread] and in another wccftech article.

Xeon E3-1200 v5 will be based on the LGA 1151 socket and feature GT2 graphics with a RAM limit of 64GB (both DDR3 and DDR4 are supported). This is twice as much as its broadwell counterpart.

mikk · Mar 1, 2015

Highend Xeon only afaik. I don't expect AVX512 in Xeon E3 either.

ShintaiDK · Mar 1, 2015

With some luck it may be in the LGA20xx versions for consumers.

Enigmoid · Mar 1, 2015

Not surprising.

Zodiark1593 · Mar 1, 2015

Is there anything that even uses avx?

ShintaiDK · Mar 1, 2015

Zodiark1593 said:
Is there anything that even uses avx?

Yes, AVX and AVX2 is widely used.

At least 2 games even got dedicated AVX executables that differs featurewise from regular. Grid 2 and Dirt Showdown.

III-V · Mar 1, 2015

witeken said:
Yeah, but I wonder if or what will be different about SKL compared to HSW to be excited about.

The level of secrecy Intel is maintaining makes it very clear that they are bringing something brand new with the Skylake uarch.

I don't think Skylake will be all that great, from what I've pieced together. If Geekbench is accurate, and cache sizes are staying the same, Skylake won't be that big of a deal for performance, at least compared to what it could be. We may have to wait until Icelake to get "modernized" cache sizes.

IBM and Apple have already moved to 64KB L1 caches. AMD had them quite some time ago, and I can't help but think that ditching them hurt their ST performance.

On the other hand, preliminary scores from Geekbench look favorable, particularly for MT. This may suggest that MorphCore actually did end up in Skylake, but I do not believe this to be the case. If MorphCore were implemented, and correctly reported, Skylake would be a 4 core, 32 thread device, not a 4 core, 8 thread device as reported by Geekbench. Again, this may just be a reporting error, but I think MT scores would be much higher. Another possibility is that the SKU on Geekbench is not a fully-enabled variant -- perhaps MorphCore is only enabled on i7s or Xeons -- just food for thought.

More likely, the boost in MT is a result of moving to a "tiled" architecture, where cores in groups of twos share their L2, and a result of revamping inter-SoC communication (2D mesh instead of ring bus, as reported by Knight's Landing rumors, and confirmed as "plausible" by David Kanter). Silvermont already does this, as do Bulldozer-variants and Bobcat/Jaguar/Puma.

I averaged together all of the scores reported by Geekbench, sans memory scores, and managed to get an average of 14% improvement for integer (both ST and MT), 9% for ST (INT + FP), and 22% for MT, comparing the Skylake core @ 2.6 GHz vs. Haswell @ 4.0 GHz.

There are an enormous number of caveats that apply though. We don't know what is enabled and disabled on the sample, we don't know what boost clocks the sample has, we don't know the TDP, the OS is not constant, BIOS is not constant, different motherboards, AES scores abnormally low on Skylake, Skylake will likely go through another stepping before it releases, Geekbench is not exactly applicable to desktop workloads (much better for tablet/smartphones, though)... the list goes on.

But if I had to guess, it'll be a bigger increase than Haswell was by a fair margin -- Haswell was about 10% better per-clock, Skylake will probably be about 10-15% ST, 15-30% MT.

I am still worried about 14 nm's performance at the higher end of the frequency spectrum, though. Intel's 14 nm has better subthreshold slopes, but significantly higher DIBL than their 22 nm process. They have better saturation currents at a given Ioff, but only at the 0.7 V they report, and I suspect at higher voltages, 14 nm will fall behind 22 nm, just as 22 nm fell behind 32 nm. But, according to Intel's 14 nm paper, 14 nm's dielectric is more resilient than 22 nm's and they have less variation -- it seems 14 nm can be overvolted higher than 22 nm does, which would be interesting if my interpretation is correct -- however it would need this extra voltage anyway, since it is less sensitive to voltage scaling as I pointed out with the higher DIBL values. I should probably ask Idontcare for his interpretation... I don't fully understand everything I'm looking at.

Arachnotronic · Mar 1, 2015

III-V said:
I don't think Skylake will be all that great, from what I've pieced together. If Geekbench is accurate, and cache sizes are staying the same, Skylake won't be that big of a deal for performance, at least compared to what it could be. We may have to wait until Icelake to get "modernized" cache sizes.

IBM and Apple have already moved to 64KB L1 caches. AMD had them quite some time ago, and I can't help but think that ditching them hurt their ST performance.

With caches, everything's a trade-off. A larger cache implies a higher latency cache, so it's not an automatic "win" to double the size of the cache.

frozentundra123456 · Mar 1, 2015

III-V said:
I don't think Skylake will be all that great, from what I've pieced together. If Geekbench is accurate, and cache sizes are staying the same, Skylake won't be that big of a deal for performance, at least compared to what it could be. We may have to wait until Icelake to get "modernized" cache sizes.

IBM and Apple have already moved to 64KB L1 caches. AMD had them quite some time ago, and I can't help but think that ditching them hurt their ST performance.

On the other hand, preliminary scores from Geekbench look favorable, particularly for MT. This may suggest that MorphCore actually did end up in Skylake, but I do not believe this to be the case. If MorphCore were implemented, and correctly reported, Skylake would be a 4 core, 32 thread device, not a 4 core, 8 thread device as reported by Geekbench. Again, this may just be a reporting error, but I think MT scores would be much higher. Another possibility is that the SKU on Geekbench is not a fully-enabled variant -- perhaps MorphCore is only enabled on i7s or Xeons -- just food for thought.

More likely, the boost in MT is a result of moving to a "tiled" architecture, where cores in groups of twos share their L2, and a result of revamping inter-SoC communication (2D mesh instead of ring bus, as reported by Knight's Landing rumors, and confirmed as "plausible" by David Kanter). Silvermont already does this, as do Bulldozer-variants and Bobcat/Jaguar/Puma.

I averaged together all of the scores reported by Geekbench, sans memory scores, and managed to get an average of 14% improvement for integer (both ST and MT), 9% for ST (INT + FP), and 22% for MT, comparing the Skylake core @ 2.6 GHz vs. Haswell @ 4.0 GHz.

There are an enormous number of caveats that apply though. We don't know what is enabled and disabled on the sample, we don't know what boost clocks the sample has, we don't know the TDP, the OS is not constant, BIOS is not constant, different motherboards, AES scores abnormally low on Skylake, Skylake will likely go through another stepping before it releases, Geekbench is not exactly applicable to desktop workloads (much better for tablet/smartphones, though)... the list goes on.

But if I had to guess, it'll be a bigger increase than Haswell was by a fair margin -- Haswell was about 10% better per-clock, Skylake will probably be about 10-15% ST, 15-30% MT.

I am still worried about 14 nm's performance at the higher end of the frequency spectrum, though. Intel's 14 nm has better subthreshold slopes, but significantly higher DIBL than their 22 nm process. They have better saturation currents at a given Ioff, but only at the 0.7 V they report, and I suspect at higher voltages, 14 nm will fall behind 22 nm, just as 22 nm fell behind 32 nm. But, according to Intel's 14 nm paper, 14 nm's dielectric is more resilient than 22 nm's and they have less variation -- it seems 14 nm can be overvolted higher than 22 nm does, which would be interesting if my interpretation is correct -- however it would need this extra voltage anyway, since it is less sensitive to voltage scaling as I pointed out with the higher DIBL values. I should probably ask Idontcare for his interpretation... I don't fully understand everything I'm looking at.

Taking somewhat intermediate values of your ranges, 12% ST and 20% MT, I think would be a very good improvement, since all the easy gains have been made already. I am talking about bottom line performance, combination of clockspeed and IPC. Hopefully we wont have a case like Kaveri where IPC gains were pretty much negated by lower clockspeeds, but I fear this could be a possibility for Skylake. I know some are going to argue that it is not needed, but *someday* Intel is going to have to break down and make a mainstream hex core if they want to keep increasing performance of anything but the igpu. I think it depends on Zen actually. If it is the great equalizer AMD fans are touting, perhaps it will motivate Intel to make 6 cores mainstream, or at least make hyperthreading more available. It it is mediocre or just OK, and has to compete on price only like their current lineup, then intel can continue as they are.

Roland00Address · Mar 1, 2015

Well I guess I am noobish for I did not hear about this till now

And this frustrates me to no end. Why does this frustrate me? Because intel is crippling what could be useful software tools before it becomes mainstream and thus make it harder for software designers to justify the work on adding it in marginal cases.

Sure most consumers do not use floating point heavily in their current software but how do you expect them to use such software when you cripple it from the beginning before such software is made.

ShintaiDK · Mar 1, 2015

Roland00Address said:
Well I guess I am noobish for I did not hear about this till now

And this frustrates me to no end. Why does this frustrate me? Because intel is crippling what could be useful software tools before it becomes mainstream and thus make it harder for software designers to justify the work on adding it in marginal cases.

Sure most consumers do not use floating point heavily in their current software but how do you expect them to use such software when you cripple it from the beginning before such software is made.

There are 2 sides of it.

Cripple and TDP.

An example is Haswell when running AVX2. We already know Xeons run AVX2 at a lower speed to compensate for the higher powerdraw. Shifting to AVX512 without a node shrink to compensate may not be the direction the 99% crowd wants. Aka lower TDP. Its great that Haswell is ~80% faster than IB when running AVX2, but there is also a downside with the powerdraw. I would first be dissapointed if Cannonlake with its shrink didnt introduce AVX512 to the mainstream.

In terms of the Celerons and Pentiums however, its nothing but crippling.

mikk · Mar 1, 2015

ShintaiDK said:
Yes, AVX and AVX2 is widely used.

At least 2 games even got dedicated AVX executables that differs featurewise from regular. Grid 2 and Dirt Showdown.

Widely used not really. And even if it is used in consumer software like x264 or Grid 2 AVX.exe the gain can be very tiny. Less than 5% in x264 because it is mostly non-SIMD assembly code. Same for Grid 2 AVX.

ShintaiDK · Mar 1, 2015

mikk said:
Widely used not really. And even if it is used in consumer software like x264 or Grid 2 AVX.exe the gain can be very tiny. Less than 5% in x264 because it is mostly non-SIMD assembly code. Same for Grid 2 AVX.

It doesnt matter if it only gives 0.1% or not. If its used its used.

Less than 5% for example is still more than 0%.

mikk · Mar 1, 2015

ShintaiDK said:
It doesnt matter if it only gives 0.1% or not. If its used its used.

As a useless checkbox feature - sure. AVX isn't widely used nevertheless.

SAAA · Mar 1, 2015

AVX512 disabled on mainstrem could be the reason why Haifa team wasn't very happy with the finalization of this tock... working hard to get that done and then having it cut out on most chips just for marketing reasons sounds so silly...
At least the users who would benefit the most are going to have it in Xeons but this really sounds like a trick to sell higher margin chips.

ThatBuzzkiller · Mar 1, 2015

I bet Skylake-S WILL support AVX-512 ...

III-V · Mar 1, 2015

Arachnotronic said:
With caches, everything's a trade-off. A larger cache implies a higher latency cache, so it's not an automatic "win" to double the size of the cache.

Yeah, but I'd think with modern workloads, we'd benefit from larger cache sizes. Anand wrote an article on Nehalem that highlighted that there are some on Intel's arch team that wanted a larger L3 on Nehalem, and Anand himself thought 256 KB on the L2 was too small. Given that the cache sizes have stayed put, but software has not over 6-7 years... I'd think that expanding the cache sizes would be of good use at this point.

Of course, I don't have any actual data on the matter -- I could be completely wrong, and "general workloads" may not have shifted to benefit more from larger cache sizes.

You can make larger caches that have low latency however -- IBM's Power8 has a 64KB L1D with a 3-cycle latency, compared to Intel's 32 KB L1D with a 4-cycle latency -- the tradeoff being that Power8's L1D consumes an inordinate amount of power.

SAAA · Mar 1, 2015

ThatBuzzkiller said:
I bet Skylake-S WILL support AVX-512 ...

But not Skylake-K of course. Dang this sounds so Intelish that they might pull it out.

What else they didn't implement before, TSX? Beside it being borked the feature was disabled in -k chips.

I imagine that after all the fuss when that extension was disabled they prefer to wait and see if everything works a bit longer this time, rather than have it used in mainstream only to backfire when say someone's cpu burns when running an heavy load with it.
You know twice the theoretical performance... wouldn't that make it a bit hotter than the already infernal AVX2? XD

kimmel · Mar 1, 2015

ThatBuzzkiller said:
I bet Skylake-S WILL support AVX-512 ...

If by skylake-s you mean skylake server then yes. If by s you mean socketed then nope.

Mainstream Intel Core Processors will not support AVX 512 from Skylake  Only Xeon

Senior member

Lifer

Lifer

Diamond Member

Programming Moderator, Elite Member

Diamond Member

Programming Moderator, Elite Member

Diamond Member

Lifer

Platinum Member

Platinum Member

Lifer

Senior member

Lifer

Lifer

Platinum Member

Lifer

Diamond Member

Lifer

Diamond Member

Senior member

Golden Member

Senior member

Senior member

Senior member

Mainstream Intel Core Processors will not support AVX 512 from Skylake Only Xeon