ITT: We speculate on Skylake E5 and E7 core counts, memory channels, SATA ports, etc.

cbn

Lifer
Mar 27, 2009
12,968
221
106
Currently both the E5 and E7 Xeons share the same processor dies (the 18-core Haswell with 45MB cache being the largest) and the same chipset die. The main difference (that I am aware of) is that the E5 scales to only two sockets per system, while the E7 Xeon can scale out to four or eight sockets per system.

So based on that, what do you think will happen with the Skylake E5 and E7 processors? Will socket scalability stay the most prominent difference? Or will the Skylake E7 also get a different processor die with a greater number of cores, memory channels, and PCIe lanes than the Skylake E5?

Likewise, what about chipsets? Will the Skylake E7 get its own specific chipset with a greater number of SATA ports (etc.)? Or would that part still remain shared between the Skylake E5 and E7?

Please feel free to make your core count, PCIe lane, memory channel, SATA port (and so on) predictions in this thread. My guess is that we will see the E7 diverge from the E5 with respect to core count, memory channels, and PCIe lane count (however, the largest impact will happen at 10nm for the platform, with the Cannonlake E5 and E7 processors). Not sure about the chipset, but it probably makes sense to change that as well.
 

R0H1T

Platinum Member
Jan 12, 2013
2,583
164
106
Currently both the E5 and E7 Xeons share the same processor dies (the 18-core Haswell with 45MB cache being the largest) and the same chipset die. The main difference (that I am aware of) is that the E5 scales to only two sockets per system, while the E7 Xeon can scale out to four or eight sockets per system.

So based on that, what do you think will happen with the Skylake E5 and E7 processors? Will socket scalability stay the most prominent difference? Or will the Skylake E7 also get a different processor die with a greater number of cores, memory channels, and PCIe lanes than the Skylake E5?

Likewise, what about chipsets? Will the Skylake E7 get its own specific chipset with a greater number of SATA ports (etc.)? Or would that part still remain shared between the Skylake E5 and E7?

Please feel free to make your core count, PCIe lane, memory channel, SATA port (and so on) predictions in this thread. My guess is that we will see the E7 diverge from the E5 with respect to core count, memory channels, and PCIe lane count (however, the largest impact will happen at 10nm for the platform, with the Cannonlake E5 and E7 processors). Not sure about the chipset, but it probably makes sense to change that as well.
That'll only happen when POWER9 or Zen beats Intel in the absolute performance metric, i.e. only when Intel is forced to do so. At this point there is no need for it, so they're not going to be making any new or separate die for the E7.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Doesn't make sense to make a separate die, no matter what.

SATA channels don't really matter on the E7 either, and only somewhat on the E5. And that's assuming you even use them at all instead of a RAID card.

PCIe lane count is pretty much irrelevant on multi-socket systems. With 4 sockets you get 128 PCIe 3.0 lanes with the E7.
 
Last edited:

R0H1T

Platinum Member
Jan 12, 2013
2,583
164
106
Doesn't make sense to make a separate die, no matter what.

SATA channels don't really matter on the E7 either, and only somewhat on the E5.

PCIe lane count is pretty much irrelevant on multi-socket systems.
It does when the competition beats you in per-socket & multi-socket configs. Not that I'm predicting they will, but if they do, Intel might have to build a bigger die for the Xeon E7s.
 

R0H1T

Platinum Member
Jan 12, 2013
2,583
164
106
You can't make a much larger die than the 18-core.
20c or 24c on 10nm, & anything lower than 18c, just for argument's sake, going to the E5. I'm willing to bet Intel can stretch or challenge themselves into doing something like this if necessary, & they most definitely should if the (absolute) performance crown is taken away from them by POWER9 &/or Zen.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
You can't make a much larger die than the 18-core.

On 14nm I'll bet Intel could get up to 40 cores at the same die size as the 662mm2 Haswell.

However, if Intel kept the base clocks at 2.3 GHz (as on the Haswell Xeon E5-2699 v3, 145 watt TDP) or 2.5 GHz (as on the Haswell Xeon E7-8890 v3, 165 watt TDP), the TDP for such a chip would definitely rise.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Just to add some info to this thread, here is the history of Intel x86 core count progression (top die size)

65nm: dual core (quad cores were two dual cores on package)
45nm: quad core
32nm: octocore
22nm: eighteen cores

Amazingly, Intel has at least doubled core count* with every node, and I don't expect to see this change with 14nm.

*The 32nm to 22nm transition was actually a 125% increase in core count, rather than merely doubling (i.e., a 100% increase).
 
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Just to add some info to this thread, here is the history of Intel x86 core count progression (top die size)

65nm: dual core (quad cores were two dual cores on package)
45nm: quad core
32nm: octocore
22nm: eighteen cores

Amazingly, Intel has at least doubled core count* with every node, and I don't expect to see this change with 14nm.

*The 32nm to 22nm transition was actually a 125% increase in core count, rather than merely doubling (i.e., a 100% increase).

65nm: 2x2 cores, 2x143mm2.
45nm: 8 cores, 684mm2.
32nm: 10 cores, 513mm2.
22nm: 18 cores, 664mm2.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
For 14nm (if Atom is any indication), I think we will see a really good density increase coming from 22nm.

So I think 40+ Broadwell cores could be physically possible on a large 14nm die; however, power consumption would be high (even at the same clocks as the 18C Haswell E5/E7) because the power reduction will not match the increase in transistor count. (e.g., a 100% increase in density at the same die size requires a 50% reduction in power per transistor, otherwise the TDP of the processor will increase. This assumes all other factors (CPU uarch, clocks, cache amount, etc.) are the same.)
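
As a rough illustration of that trade-off, here is a minimal Python sketch (the 2x density gain and the 40% per-transistor power cut below are purely assumed example numbers, not real node figures):

Code:
# Back-of-the-envelope TDP scaling: same die size, same clocks/uarch assumed.
# All inputs are illustrative assumptions, not real node or SKU figures.
old_tdp_watts = 145            # roughly an 18C Haswell E5-class part
density_gain = 2.0             # assume 2x the transistors in the same area
power_per_xtor_cut = 0.40      # assume the node cuts per-transistor power by 40%

# Twice the transistors, each burning 60% of its old power:
new_tdp_watts = old_tdp_watts * density_gain * (1 - power_per_xtor_cut)
print(new_tdp_watts)           # 174.0 W -> TDP rises unless the cut reaches 50%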

With that mentioned, we need to factor in Skylake cores being larger than Broadwell cores (which will hurt power consumption and reduce the available die real estate a bit too). It also might be that Intel wants to add more than 2.5 MB of L3 cache per Skylake core; this additional L3 cache would be another factor reducing the number of cores possible for any given die size.

Furthermore, maybe we won't see an -EX/Xeon E7 version for Skylake, just like we didn't for Sandy Bridge?

All that existed for the Sandy Bridge multi-socket Xeons was an octocore 435mm2 die, which was used for the Sandy Bridge E5 Xeon. Though I don't expect this to happen for Skylake, if it did, I think it would be another factor reducing the core count potential.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
Here are my guesses for Skylake E5 Xeon:

12C/14C for the smallest die, 18C for the middle sized die, 26C for the largest die.
Quad channel memory
2.5 MB L3 cache per core
40 PCIe lanes
Lower clockspeeds
180 watt TDP


And my guess for Skylake E7 Xeon:

16C for the smallest die, 28C for the middle sized die, 40C for the largest die
Eight channel memory
2.5 MB L3 cache per core
60 PCIe lanes
Lower clockspeeds
250+ watt TDP
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Or, alternatively, Intel could take those Skylake E7 Xeon dies and just disable memory controllers and PCIe lanes for the E5 Xeon and HEDT platforms.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Here are my guesses for Skylake E5 Xeon:

12C/14C for the smallest die, 18C for the middle sized die, 26C for the largest die.
Quad channel memory
2.5 MB L3 cache per core
40 PCIe lanes
Lower clockspeeds
180 watt TDP


And my guess for Skylake E7 Xeon:

16C for the smallest die, 28C for the middle sized die, 40C for the largest die
Eight channel memory
2.5 MB L3 cache per core
60 PCIe lanes
Lower clockspeeds
250+ watt TDP

250W+ TDP?
A quad E7 system would then have 240 PCIe lanes?
8-channel memory, for a total of 32 memory channels?
40 cores with 100MB L3? A 1000mm2 die that can't be produced?

Really?
 
Last edited:

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Here are my guesses for Skylake E5 Xeon:

12C/14C for the smallest die, 18C for the middle sized die, 26C for the largest die.
Quad channel memory
2.5 MB L3 cache per core
40 PCIe lanes
Lower clockspeeds
180 watt TDP


And my guess for Skylake E7 Xeon:

16C for the smallest die, 28C for the middle sized die, 40C for the largest die
Eight channel memory
2.5 MB L3 cache per core
60 PCIe lanes
Lower clockspeeds
250+ watt TDP

Doesn't make sense at all.

12C is too large for the smallest die (6- and 8-core parts would be cut down too much). Quad-channel memory limits the 26C die.

40C is far, far too large for the largest die. >2x scaling won't happen on the 22nm -> 14nm transition. I would guess more along the lines of 32-36 cores on the largest die at best, but I honestly wouldn't be surprised by 28-30.

I doubt the need for that many PCI-e lanes.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
12C is too large for the smallest die (6- and 8-core parts would be cut down too much).

I expect Intel will use a higher-clocked Skylake Xeon-D SoC for many future 6C and 8C applications.

But a 12C die is still doable for 6C parts (Intel already uses Haswell 8C dies for 4C Xeon E5 processors).
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
So for a die estimate on 40 Broadwell cores, I decided to add up the specs of five Xeon-D dies:

Each Xeon-D die is 160mm2 on 14nm. This includes eight Broadwell cores, 12MB of L3 cache, 24 PCIe 3.0 lanes, a two-channel DDR4 memory controller, a two-channel DDR3 memory controller, a networking controller (two 10GbE), and four USB 3.0 ports.

So for 5 x 160mm2 (i.e., five Xeon-D dies) we would have the following features at 800mm2 total die size:

40 Broadwell cores
60 MB L3 cache
120 PCIe 3.0 lanes
10 channels worth of DDR4 memory controllers
10 channels worth of DDR3 memory controllers
20 USB 3.0 ports
ten 10GbE LAN ports

For my hypothetical 40-core die we are still short 40MB of L3 cache.

For 40MB of L3 cache on 14nm, my estimate is that it would add 75mm2 of die size. This is based on Broadwell GT2 having 9% of its 84mm2 die comprised of 4MB of L3 cache (9% of 84mm2 = 7.5mm2 for 4MB of L3, so 40MB of L3 = 75mm2).

So the total die size would be 875mm2, but there is a lot of other stuff that needs to be removed, which would shrink this hypothetical 40-core Broadwell down to something smaller than 875mm2. This includes 60 PCIe 3.0 lanes, two channels of DDR4 memory controller, 10 channels worth of DDR3 memory controller, ten 10GbE LAN ports, and 20 USB 3.0 ports.
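
To make that tally easier to follow, here is the same arithmetic as a small Python sketch (the per-MB L3 area is the GT2-derived figure from above; everything else is the quoted Xeon-D numbers):

Code:
# Tally of the five-Xeon-D estimate above.
xeon_d_die_mm2 = 160
dies = 5
base_area = xeon_d_die_mm2 * dies             # 800 mm2: 40 cores, 60 MB L3, plus extra IO

l3_mm2_per_mb = 7.5 / 4                       # 9% of the 84 mm2 GT2 die ~= 7.5 mm2 for 4 MB of L3
extra_l3_mb = 40                              # to reach 100 MB total (2.5 MB per core)
extra_l3_area = extra_l3_mb * l3_mm2_per_mb   # 75 mm2

total_mm2 = base_area + extra_l3_area
print(total_mm2)                              # 875.0 mm2, before stripping the duplicated
                                              # PCIe/DDR3/LAN/USB blocks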

With that mentioned, how much larger each Skylake core is compared to Broadwell is another thing to consider if we want to arrive at a final estimate for a hypothetical 40C Skylake die.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Alternatively, looking at the isolated die sizes of the Broadwell core and its L3 cache:

40 Broadwell cores without L3 cache work out to 278mm2 (a pair of Broadwell cores without L3 cache works out to 17% of an 82mm2 GT2 die = 13.9mm2, or 6.95mm2 per core).

100MB of L3 cache on 14nm = 187mm2 (see the above post for the math on Broadwell's L3 cache).

So 40 Broadwell cores (@ 6.95mm2 per core) + 100MB of L3 cache = 465mm2.
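
The same estimate as a quick Python sketch, using the per-core and per-MB figures above (note that no uncore or IO is included, which is the big caveat):

Code:
# Core + L3 area only; uncore, memory controllers and PCIe are not included.
core_mm2 = 6.95              # half of 17% of the 82 mm2 GT2 die
cores = 40
l3_mm2_per_mb = 7.5 / 4      # the GT2-based figure from the previous post
l3_mb = 100                  # 2.5 MB per core

die_estimate = cores * core_mm2 + l3_mb * l3_mm2_per_mb
print(die_estimate)          # 465.5 mm2 of cores + cache alone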

With that mentioned, I do wonder how the Skylake cores will compare size-wise on 14nm, in addition to any difference in cache amount or density.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
>2x scaling won't happen on the 22nm -> 14nm transition.

According to my calculations (comparing die layouts of Haswell and Broadwell consumer dies), the Haswell core (without L3 cache) is 2.23 times larger than the Broadwell core (without L3 cache).
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
According to my calculations (comparing die layouts of Haswell and Broadwell consumer dies), the Haswell core (without L3 cache) is 2.23 times larger than the Broadwell core (without L3 cache).

You are right, but it doesn't scale like that for power usage.

My assumption:

Based on E5 models:

Sandy Bridge - 8 cores
Ivy Bridge - 12 cores
Haswell - 18 cores (I think this is an outlier and should have been 16 cores; they probably flubbed on the perf/clock part and had to add 2 more cores)
Broadwell - 24 cores
Skylake - 32 cores? (At this point I would assume they have to bring something more revolutionary. An integrated FPGA? Specialized units? 1/2 Atom cores?)

So 40 Broadwell cores (@ 6.95mm2 per core) + 100MB of L3 cache = 465mm2.
You have to add uncore space for that. If you base it on this:


45nm: 8 cores, 684mm2.
32nm: 10 cores, 513mm2.
22nm: 18 cores, 664mm2.
684mm2/2 x 1.25 (core ratio) = ~427mm2, but it ended up at 513mm2
513mm2/2 x 1.8 (core ratio) = ~462mm2, but it ended up at 664mm2
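
A quick Python sketch of that comparison (a naive "halve the area per node, scale by core count" estimate versus the actual top-die sizes), just to show how much area the uncore keeps taking back:

Code:
# Naive per-node scaling estimate vs. the actual top-die sizes quoted above.
transitions = [
    # (old_die_mm2, new_cores / old_cores, actual_new_die_mm2)
    (684, 10 / 8,  513),   # 45nm 8C  -> 32nm 10C
    (513, 18 / 10, 664),   # 32nm 10C -> 22nm 18C
]
for old_mm2, core_ratio, actual_mm2 in transitions:
    ideal_mm2 = old_mm2 / 2 * core_ratio   # assume a perfect 2x area shrink per node
    print(f"ideal ~{ideal_mm2:.0f} mm2 vs actual {actual_mm2} mm2")
# ideal ~428 mm2 vs actual 513 mm2
# ideal ~462 mm2 vs actual 664 mm2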

Take a look at this picture: http://www.anandtech.com/show/9193/the-xeon-e78800-v3-review
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
It is really not much power for 40C.



According to the Anandtech Xeon E7 article, POWER8 has twice the memory bandwidth:

http://www.anandtech.com/show/9193/the-xeon-e78800-v3-review/6

So eight-channel memory for Skylake E7.



Why do you think it would be that big?

But it's too much for what the customers want.

POWER8 is close to irrelevant. Look at their declining market share. The OpenPower concept wasn't born out of a sudden willingness to try and share.

You can't produce a 1000mm2 die because it's not possible with the current process and tools. The upper limit is around 800mm2 or something.