News [Anand] Intel's Enterprise Extravaganza, Cascade Lake Launches


moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
(...) as to whether Rome would support the full AVX-512 suite (it probably doesn't).
Considering how fragmented the full AVX-512 suite is (Gideon linked to a part of it, but look at that whole monster), there is indeed no way Zen 2/Rome would support it in full. I just don't think AMD would leave the possibility of combining two 256-bit FMAs completely dormant in Zen 2, given the precedent they set in Zen 1.

Edit: Personally I'd hope AMD introduces an x64 counterpart to ARM's SVE instead of jumping into the deep hole that is AVX-512.
 

moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
AMD just needs to support whatever Icelake-SP has and it'll get everything except a few, which were really meant for Xeon Phi lines.
Skylake-SP had AVX512F, AVX512DQ, AVX512CD, AVX512BW and AVX512VL. According to the Wiki, Cannon Lake introduced IFMA and VBMI, and Icelake is introducing VPOPCNTDQ, VNNI, VBMI2, BITALG, GFNI, VPCLMULQDQ, VAES... Not having to support Phi and Knights Mill at least saves us from ER, PF, 4VNNIW, 4FMAPS...
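
To make the fragmentation concrete, here is a rough sketch (not an authoritative tool) that groups the subsets listed above by the generation that introduced them and reports which ones a CPU lacks. The flag spellings are the ones Linux uses in /proc/cpuinfo; the group names and the helpers `read_cpu_flags` and `missing_by_group` are made up for illustration.

```python
# Subsets from the post above, spelled as Linux reports them in /proc/cpuinfo.
# Group names are illustrative labels, not official terminology.
# Note: gfni, vpclmulqdq and vaes also exist outside AVX-512; they are
# listed here only because the post groups them with Icelake.
AVX512_GROUPS = {
    "Skylake-SP": ["avx512f", "avx512dq", "avx512cd", "avx512bw", "avx512vl"],
    "Cannon Lake": ["avx512ifma", "avx512vbmi"],
    "Icelake": ["avx512_vpopcntdq", "avx512_vnni", "avx512_vbmi2",
                "avx512_bitalg", "gfni", "vpclmulqdq", "vaes"],
    "Phi / Knights Mill": ["avx512er", "avx512pf",
                           "avx512_4vnniw", "avx512_4fmaps"],
}


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU listed (x86 Linux only).

    Returns an empty set if the file is missing or has no 'flags' line.
    """
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


def missing_by_group(flags):
    """Map each group to the subsets that are absent from `flags`."""
    return {group: [name for name in names if name not in flags]
            for group, names in AVX512_GROUPS.items()}


if __name__ == "__main__":
    for group, missing in missing_by_group(read_cpu_flags()).items():
        status = "complete" if not missing else "missing " + ", ".join(missing)
        print(f"{group}: {status}")
```

On a CPU without AVX-512 every group simply shows up as missing, which is what you would expect on Zen 1.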
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Not having to support Phi and Knights Mill at least saves us from ER, PF, 4VNNIW, 4FMAPS...

VNNI on Icelake is pretty much its version of 4VNNIW on Knights Mill. 4FMAPS is a single-precision FP extension, also meant to speed up deep learning.

The original Xeon Phi, which came as a PCI Express card, supported its own version of 512-bit vector extensions. There was no problem back then.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
I just don't think AMD would leave the possibility of combining two 256-bit FMAs completely dormant in Zen 2 with the precedent they set in Zen 1.

With AVX2 it was easy: the same instruction set on 128-bit or 256-bit vectors, and very few instructions that shuffle things between the 128-bit halves etc. Zen does a great job dealing with it; only a few mask moves are heavily microcoded.
With AVX-512 it is a massive undertaking: not only does the vector width double, the number of registers doubles as well, so the area for registers grows at least fourfold. There are mask registers as well. And the instructions are a real pain this time, IMHO with no easy teaming: permutes, shifts, masks, broadcasts and gathers mean it is probably easier to build a 512-bit unit that can be split into separate 256-bit halves than it is to team two 256-bit units and microcode a bunch of instructions with bad performance.
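
The "at least quadruple" figure can be checked with quick arithmetic on architectural state only. This is a back-of-the-envelope sketch: it counts the registers the ISA exposes in 64-bit mode (16 × 256-bit for AVX2, 32 × 512-bit plus 8 × 64-bit mask registers for AVX-512 with BW), not the physical register file, whose size depends on the implementation. The function name is made up for illustration.

```python
def arch_vector_state_bits(num_regs, width_bits, mask_regs=0, mask_width_bits=0):
    """Bits of architecturally visible vector (and mask) register state."""
    return num_regs * width_bits + mask_regs * mask_width_bits


# AVX2 in 64-bit mode: ymm0-ymm15, 256 bits each, no mask registers.
avx2_bits = arch_vector_state_bits(16, 256)

# AVX-512 in 64-bit mode: zmm0-zmm31, 512 bits each,
# plus k0-k7 (64 bits each when AVX512BW is present).
avx512_bits = arch_vector_state_bits(32, 512, mask_regs=8, mask_width_bits=64)

ratio = avx512_bits / avx2_bits

if __name__ == "__main__":
    print(f"AVX2:    {avx2_bits} bits ({avx2_bits // 8} bytes)")
    print(f"AVX-512: {avx512_bits} bits ({avx512_bits // 8} bytes)")
    print(f"Growth:  {ratio}x")
```

The vector registers alone account for exactly 4x; the mask registers push it slightly above that.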
 

NTMBK

Lifer
Nov 14, 2011
10,208
4,940
136
[Image: AP8.jpg]


Presented without comment
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Presented without comment

Someone on Twitter mentioned that the size makes it look unreal. Haha. It looks like someone with bad Photoshop skills made that picture.

Also, early indications are that the -AP series is not going to be used by many vendors. Part of the reason is that Intel is basically providing nearly everything, so they are a pseudo-vendor for -AP platforms. Other than putting your company's sticker on the boxes, there's little way to differentiate. Maybe the bigger reason is that, aside from the few who really need density for #1 Linpack competitions, other platforms (including AMD's) are more attractive.

It may eventually sell more than Xeon Phi as successors are introduced, but Cascade Lake-AP doesn't seem to be it. Perhaps Icelake-AP with HBM2 memory?
 

NTMBK

Lifer
Nov 14, 2011
10,208
4,940
136
Also, early indications are that the -AP series is not going to be used by many vendors. Part of the reason is that Intel is basically providing nearly everything, so they are a pseudo-vendor for -AP platforms. Other than putting your company's sticker on the boxes, there's little way to differentiate. Maybe the bigger reason is that, aside from the few who really need density for #1 Linpack competitions, other platforms (including AMD's) are more attractive.

It may eventually sell more than Xeon Phi as successors are introduced, but Cascade Lake-AP doesn't seem to be it. Perhaps Icelake-AP with HBM2 memory?

I think the customers who really need that kind of density are the kind of customers who would build their own servers anyway, instead of going out and buying something from Dell or SuperMicro. It's got to be some very specific niches who are willing to pay those sorts of prices just for a density premium.
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
It may eventually sell more than Xeon Phi as successors are introduced, but Cascade Lake-AP doesn't seem to be it.

I think the strangeness of the product kind of lends more credibility to the idea that the originally intended product was meant to be on 10 nm and was scrapped because of 10 nm.

Also have to figure the successor that was mentioned on that roadmap is Cooper Lake based, but I wonder if it might get cancelled if OEMs are that lukewarm to the AP line.
 

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
I think the strangeness of the product kind of lends more credibility to the idea that the originally intended product was meant to be on 10 nm and was scrapped because of 10 nm.
Do you mean Knights Hill?
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
Do you mean Knights Hill?

No, the rumor specifically was that it was going to be a custom (Icelake?) with 4 AVX-512 units. As it is, the raw FLOPS aren't much better than the Phi it replaces, it uses much more power, and it is far worse than the GPUs. Regardless of the legitimacy, you can see why Intel gave up and moved their efforts to using GPUs as well.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Regardless of the legitimacy, you can see why Intel gave up and moved their efforts to using GPUs as well.

A lot of things are useful with CPUs that have high FP performance. It's not like you have one type of code, with one type of user.
 

NTMBK

Lifer
Nov 14, 2011
10,208
4,940
136
A lot of things are useful with CPUs that have high FP performance. It's not like you have one type of code, with one type of user.

Shame that using that high FP performance tanks the clock speed, and hence performance of the rest of your code. Unless a really big portion of your time is spent crunching FP vectors, using AVX-512 can reduce overall performance.
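
A toy model of that trade-off, for what it's worth. It assumes a fraction `f` of the baseline runtime is vectorizable FP work, that AVX-512 speeds that part up by a factor `s` at equal clocks, and that the clock drops to a ratio `c` of normal for the whole program (a pessimistic simplification; real CPUs only downclock around the heavy instructions). All the numbers used are hypothetical, not measurements.

```python
def relative_performance(f, s, c):
    """Performance relative to the non-AVX-512 build (1.0 = break-even).

    f: fraction of baseline runtime spent in vectorizable FP code (0..1)
    s: speedup of that fraction from AVX-512 at equal clocks (hypothetical)
    c: AVX-512 clock divided by normal clock (hypothetical), applied
       pessimistically to the whole program
    """
    new_time = ((1.0 - f) + f / s) / c
    return 1.0 / new_time


def break_even_fraction(s, c):
    """Fraction f at which the AVX-512 build exactly matches the baseline.

    Derived from c = (1 - f) + f / s; requires s > 1.
    """
    return (1.0 - c) / (1.0 - 1.0 / s)


if __name__ == "__main__":
    s, c = 2.0, 0.85  # made-up example values
    print(f"break-even vector fraction: {break_even_fraction(s, c):.2f}")
    for f in (0.1, 0.3, 0.5, 0.9):
        print(f"f={f:.1f}: {relative_performance(f, s, c):.2f}x")
```

With these made-up values, anything below roughly 30% vector time comes out slower than not using AVX-512 at all, which is the point being made above.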
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
Shame that using that high FP performance tanks the clock speed, and hence performance of the rest of your code. Unless a really big portion of your time is spent crunching FP vectors, using AVX-512 can reduce overall performance.
Reminds me of the G5 Altivec problems.
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
https://www.semiaccurate.com/2019/04/23/a-long-look-at-the-intel-cascade-lake-9200-line/

Ah Charlie. He's got an article on Cascade Lake-AP. Claims Intel cancelled it, but brought it back because of Rome. And yes OEMs aren't all that interested.
Long live the Intel hate...
Performance-wise, I wouldn't be very concerned, and it is a nice upgrade: https://www.servethehome.com/2nd-gen-intel-xeon-scalable-launch-cascade-lake-details-and-analysis/. Sometimes Intel even has the better price/performance, which is really a signal that Intel must rethink their pricing strategy.
Opteron times have come again! Good news.

I am very interested to see the power consumption of 7 nm Ryzen/EPYC at different performance levels.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
https://www.semiaccurate.com/2019/04/23/a-long-look-at-the-intel-cascade-lake-9200-line/

Ah Charlie. He's got an article on Cascade Lake-AP. Claims Intel cancelled it, but brought it back because of Rome. And yes OEMs aren't all that interested.

Weird article. It's like Charlie didn't notice that the -AP line is meant to be a Phi replacement? And Intel forgot that too? Are they seriously benching this thing against Rome? Why???

If anyone wanted to do an extensive article as to why Cascade Lake-AP is a poor HPC product (versus other options) then that would make sense. There's a reason why CL-AP isn't suitable for 99%+ of the market targeted by Rome.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,482
14,434
136
Weird article. It's like Charlie didn't notice that the -AP line is meant to be a Phi replacement? And Intel forgot that too? Are they seriously benching this thing against Rome? Why???

If anyone wanted to do an extensive article as to why Cascade Lake-AP is a poor HPC product (versus other options) then that would make sense. There's a reason why CL-AP isn't suitable for 99%+ of the market targeted by Rome.
What that says to me is that Intel will use any method they can to look better than Rome. Benchmark Rome's weakest area against this CPU's strength, and it looks better for Intel, even though we all know (as you said) they are not targeting the same audience.
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
Weird article. It's like Charlie didn't notice that the -AP line is meant to be a Phi replacement?

I think what happened is that OEMs/HPC customers/etc didn't like Cascade Lake-AP either, and are focusing their efforts on the dGPUs... so Intel was going to cancel it. But because of Rome they revived it.
 

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
It's like Charlie didn't notice that the -AP line is meant to be a Phi replacement?
Phi died for a reason.
CLAP is a density play, a fairly weak one at that since the likes of Cray can sell you ultra-dense 8p@1U using commodity CPUs, too.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
Phi died for a reason.

Relative to their dates of release, the various Phi products were (in my opinion) better for their intended use than Cascade Lake-AP. Tianhe used Phi for a reason. That is all said and done, though.

I think what happened is that OEMs/HPC customers/etc didn't like Cascade Lake-AP either, and are focusing their efforts on the dGPUs... so Intel was going to cancel it. But because of Rome they revived it.

That seems possible, I guess? But it really is not a suitable replacement for Rome. It's an HPC product. Not a cloud server CPU, not a datacenter CPU.

Benchmark Rome's weakest area against this CPU's strength, and it looks better for Intel, even though we all know (as you said) they are not targeting the same audience.

Pretty much. Rome is a massive improvement in HPC over Naples, but it still has no AVX-512. So I could see Rome being suitable perhaps as a host for a massive dGPU deployment (as an alternative to, say, POWER + NV, such as what you see in Summit). But it is not suitable for HPC use on its own.
 

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
Relative to its date of release, the various Phi products were (in my opinion) better for their intended use than Cascade Lake-AP
Intel has some absolutely unnerving ability to shove products down people's throats, yes.
but it still has no AVX-512
AVX-512 is no silver bullet, not even close, unless everything you're doing is running HPL.
But it is not suitable for HPC use on its own.
Rome is very, very much suitable for that, and you'll see plenty of CPU clusters running it.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
AVX-512 is no silver bullet, not even close, unless everything you're doing is running HPL.

The clockspeed hit stinks when you're trying to run mixed AVX-512 and non-AVX-512 loads.

Rome is very, very much suitable for that, and you'll see plenty of CPU clusters running it.

Hmm. Sadly, I probably won't see those clusters since nobody will show them to me. But then, that's been the case with the vast majority of them so no worries!