Intel Announces 48-Core Cascade-AP Multi-Chip Package

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,330
4,917
136
Source:
https://www.anandtech.com/show/13535/intel-goes-for-48cores-cascade-ap

The good:
-Up to 48 cores per socket
-Aimed at 2S servers
-12 DDR4 channels
-Possible 5903 pin LGA socket
-Launch 1H2019?

The bad:
-14nm...
-UPI connection - no EMIB yet
-No mention of hyperthreading
-Two 28-core XCC dies, cut down to 24 cores each and "glued" together

The ugly:
-A current 24-core Xeon Platinum runs at 205W
-AMD launch of 7nm Epyc Rome imminent
-Intel disabled SMT on Epyc for its performance comparisons
 

jpiniero

Lifer
Oct 1, 2010
14,591
5,214
136
Pretty sure AT is wrong on it being an LGA socket and that it is BGA. TDP is I'm guessing 300 at least.

The earlier rumors had the AP being a different die with 4 AVX-512 units per core but that doesn't seem to have happened.
 

thecoolnessrune

Diamond Member
Jun 8, 2005
9,672
578
126
I also remember reading in the past that this is likely to be a BGA Socket to allow for easier complex routing in the limited amount of space available.

As long as these are being targeted at the ultra-dense and high-end markets, 350 or even 400 watt TDPs should not be that big of a deal, as I would imagine BGA lends itself to "direct-to-node" liquid cooling like Lenovo's SD650 or HPE's Apollo f8000.
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
Well, I wonder whether that 12CH DDR4 can be tested with CFD calcs and other numerics that require lots of bandwidth.
Does it work like a true 12CH setup, or not?
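
For a rough upper bound on the 12CH question, peak bandwidth is just channels x transfer rate x bus width. The sketch below assumes DDR4-2933 and an 8-byte (64-bit) data bus per channel. Neither figure comes from this thread, and the function name is made up for illustration. It also prints the 6-channel figure, on the assumption that each of the two dies owns half of the channels, so code that is not NUMA-aware may see closer to that number.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    channels             -- number of memory channels
    mega_transfers_per_s -- transfer rate in MT/s (2933 is an assumption here)
    bus_bytes            -- data bus width per channel in bytes (64-bit = 8)
    """
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


# Whole package: 12 channels (from the thread), DDR4-2933 (assumed).
package = peak_bandwidth_gbs(12, 2933)

# One die's share, assuming the channels are split 6 + 6 across the two dies.
per_die = peak_bandwidth_gbs(6, 2933)

print(f"12 channels: {package:.1f} GB/s peak")
print(f" 6 channels: {per_die:.1f} GB/s peak")
```

These are theoretical ceilings only. A sustained STREAM-style run would land below them, and how far below depends on how the two dies share traffic over UPI.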
 

DrMrLordX

Lifer
Apr 27, 2000
21,629
10,841
136
Why are they only 48-core and not 56-core chips? Is there some reason why 4 cores per die are disabled?
 
Mar 10, 2006
11,715
2,012
126
Pretty sure AT is wrong on it being an LGA socket and that it is BGA. TDP is I'm guessing 300 at least.

The earlier rumors had the AP being a different die with 4 AVX-512 units per core but that doesn't seem to have happened.

Why would AP be a different die?
 

rainy

Senior member
Jul 17, 2013
505
424
136
Why are they only 48-core and not 56-core chips? Is there some reason why 4 cores per die are disabled?

Most probably TDP - even with 48 cores it would be above 300W, with 56 cores that could be 350-400W.
 

jpiniero

Lifer
Oct 1, 2010
14,591
5,214
136
Why would AP be a different die?

Because it would have 3 additional AVX-512 units instead of just 1, since this was originally intended to be the Phi replacement. That was the rumor anyway; it may not have been true, or it may have been cancelled... and what we are getting instead is just two XCC dies glued together.
 
Mar 10, 2006
11,715
2,012
126
Because it would have 3 additional AVX-512 units instead of just 1, since this was originally intended to be the Phi replacement. That was the rumor anyway; it may not have been true, or it may have been cancelled... and what we are getting instead is just two XCC dies glued together.

I follow the rumors closely and never heard of such a thing.
 

jpiniero

Lifer
Oct 1, 2010
14,591
5,214
136
I follow the rumors closely and never heard of such a thing.

Think about it.. what exactly is the point of this versus a 4S XCC system? There's got to be more to it.

Should add that the rumor stated that the need to do something like this initially arose because Intel lost or scrapped a big HPC design win, having been told that the Phi wasn't all that useful for getting any kind of throughput.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Think about it.. what exactly is the point of this versus a 4S XCC system? There's got to be more to it.

You are making no sense. They are using an MCM design because it had to be done quickly. It's also named Cascade Lake. If it added 2 more AVX-512 units, they would need to make room for them. Do you think that comes for free?

A harder problem than finding the extra space is the substantially increased TDP on top of having 48 cores. Vector units take 3/4 or more of the total power used. Double the part that takes 75%+ and tell me what that does to the whole.

Another problem combines the two. What's the point of a 75% higher TDP (making it a 450-500W part rather than 300W, perhaps) if it doesn't perform well? Further doubling the AVX capabilities requires things like the cache system and load/store system to be substantially beefed up. That's a significant change in the floorplan.
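
The power argument above can be checked with a few lines of arithmetic. This is only a sketch under the post's own assumptions: a 300W baseline (a guess made in this thread, not a spec), vector units drawing 75% of package power, and that 75% share doubling while everything else stays flat. The function name is made up for illustration.

```python
def scaled_tdp(base_w, vector_share, vector_scale):
    """Package power after scaling only the vector-unit share.

    base_w       -- baseline package power in watts (300 is a thread guess)
    vector_share -- fraction of power drawn by vector units (0.75 per the post)
    vector_scale -- multiplier applied to the vector share (2.0 = doubled)
    """
    vector_w = base_w * vector_share
    rest_w = base_w - vector_w
    return rest_w + vector_w * vector_scale


base = 300.0
doubled = scaled_tdp(base, 0.75, 2.0)
increase_pct = (doubled / base - 1.0) * 100.0

print(f"{base:.0f} W -> {doubled:.0f} W (+{increase_pct:.0f}%)")
```

With exactly 75% and an exact doubling this comes out to 525W, a 75% increase, which is slightly above the 450-500W range mentioned in the post. A lower baseline or a smaller vector share would pull it back into that range.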

As long as these are being targeted at the ultra-dense and high-end markets, 350 or even 400 watt TDPs should not be that big of a deal, as I would imagine BGA lends itself to "direct-to-node" liquid cooling like Lenovo's SD650 or HPE's Apollo f8000.

That makes sense, as the -AP series becomes a replacement for Xeon Phi. I've heard of the 5900-pin BGA arrangement too. Most of the pin count increase will go toward accommodating the 6 extra memory channels, and the rest toward extra power.
 

Abwx

Lifer
Apr 2, 2011
10,947
3,457
136
In the comparison published by Intel the core/frequency product is 1.227x in favour of the Xeon.

They are implicitly admitting that for legacy FP perf up to AVX1 the Epyc setup is 17.6% faster per clock. It's all well and good to state 3.4x, but it's 1/4 of this value under the conditions I stated.
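
The 17.6% figure appears to follow directly from the two numbers in the post: Intel's claimed 3.4x and the poster's factor of 1/4 for code limited to AVX1. A minimal sketch of that arithmetic, with made-up function and parameter names, and with both inputs taken from the post rather than measured:

```python
def epyc_advantage_pct(claimed_speedup, legacy_divisor):
    """Epyc's lead in percent once the claimed speedup is divided down.

    claimed_speedup -- Intel's stated advantage (3.4x in the post)
    legacy_divisor  -- the poster's reduction factor for AVX1-era code (4)
    """
    xeon_vs_epyc = claimed_speedup / legacy_divisor  # Xeon relative to Epyc
    return (1.0 / xeon_vs_epyc - 1.0) * 100.0


xeon_ratio = 3.4 / 4.0
lead = epyc_advantage_pct(3.4, 4.0)

print(f"Xeon at {xeon_ratio:.2f}x of Epyc, so Epyc leads by {lead:.1f}%")
```

Note that this only reproduces the 17.6%. It does not separately apply the 1.227x core/frequency product mentioned earlier in the post, so whether the result is truly "per clock" depends on how that product is meant to be factored in.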

In the TRIAD test the perf/clock is close to even, assuming perfect scaling from 64 to 96T. Even if that's not the case, the comparison states that the software is optimised for Intel CPUs, so Epyc is not fully exploited here.

Edit: They state that for Triad they use AMD's numbers from June 2017, but how did they get the numbers with SMT disabled?

Or are the numbers with SMT enabled for both, because without HT the Xeon would be no match for an Epyc with SMT enabled...


https://www.computerbase.de/2018-11/intel-xeon-cascade-lake-ap-48-kerne-mcp/
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
Think about it.. what exactly is the point of this versus a 4S XCC system? There's got to be more to it.

Should add that the rumor stated that the need to do something like this initially arose because Intel lost or scrapped a big HPC design win, having been told that the Phi wasn't all that useful for getting any kind of throughput.

This is just a product for 2S niches where area and AVX-512 are what matter.

Simple as that.
 

jpiniero

Lifer
Oct 1, 2010
14,591
5,214
136
You are making no sense. They are using an MCM design because it had to be done quick

That Intel was doing a dual die Xeon to replace the Phi had been rumored for some time. Cascade Lake-AP isn't a good replacement for the Phi though if it doesn't have the extra units. The real product may have been cancelled for one reason or another though, and this is what they hacked together to release something.
 

DrMrLordX

Lifer
Apr 27, 2000
21,629
10,841
136
Most probably TDP - even with 48 cores it would be above 300W, with 56 cores that could be 350-400W.

The sad thing is, anyone using a 4P system with 28-core Xeons probably has a faster overall system than a Cascade Lake-AP 2P system.
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
I cannot even smell the reek of desperation because of the smell of burnt PSUs...
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
The real product may have been cancelled for one reason or another though, and this is what they hacked together to release something.

The real product would have been Cannonlake or Icelake.

The notion that they could stick on 2x AVX-512 units as easily as putting scotch tape over some damage is ludicrous. We know even Cascade Lake is a product that resulted from the company not preparing for 10nm fallout. Cascade Lake AP is a further, and much more rushed, result.

The sad thing is, anyone using a 4P system with 28-core Xeons, probably has a faster overall system than a Cascade-Lake AP 2P system.

It's likely for density. There's an article saying that volume has been shifting from 8-socket systems to 4-socket ones as core counts balloon. Cascade Lake AP isn't the best example, but future efforts from both companies will steer towards 1/2 socket systems with 2x cores.