ITT: We speculate on Skylake E5 and E7 Core counts, memory channels, SATA ports, etc


IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
With 14nm being a 1.5-node jump from 22nm (16nm would have been the classic 1-node jump from 22nm) and Broadwell's core size a mere 6.9 mm2, I really don't think a 40C Skylake would be tough for Intel to do.

Buddy, they won't do it, simple as that. They've stayed where they are (see http://forums.anandtech.com/showpost.php?p=37403073&postcount=31) for a reason.

With Haswell-EX, a core + 2.5MB L3 slice takes up 23.3mm2, so 18 of them should be 419mm2, but the actual Haswell-EX die is 58% larger at 664mm2.

If you take your assumption in post #22 and Skylake cores stay EXACTLY the size of Broadwell cores, and the actual die size is 1.5x the combined core + L3, we end up at 698mm2. With 1.58x like Haswell-EX it would be 735mm2. If Skylake cores are merely 10% larger than Broadwell, we end up at 760mm2! The fact that Westmere/Ivy Bridge/Haswell cores + L3 all ended up in the 20mm2 range while Broadwell is only 11.6mm2 with 2.5MB L3 suggests Skylake cores are going to get quite a bit bigger, like 25-30%. And 30% larger cores means an 818mm2 die.

Don't forget the Omni-Path interconnect, 2 extra memory channels, perhaps even another memory controller type to support the "persistent memory", and more PCI Express links, and we may be in the 850mm2+ range! Actually, considering that the 28-core Skylake-EX on that slide is likely going to end up in the ~650mm2 range, it's quite likely a 40-core one would end up not too far from the 1000mm2 die someone else predicted.
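
For anyone who wants to check the arithmetic, here it is as a quick Python sketch (all inputs are this thread's estimates, not official Intel figures):

Code:
# Uncore scaling factor implied by Haswell-EX (thread's figures).
haswell_core_l3 = 23.3              # mm^2 per core + 2.5MB L3 slice
haswell_die = 664.0                 # mm^2, actual 18C Haswell-EX die
uncore_factor = haswell_die / (18 * haswell_core_l3)
print(f"implied uncore factor: {uncore_factor:.2f}x")    # ~1.58x

# Project a 40C Skylake die, assuming cores stay at Broadwell size.
broadwell_core_l3 = 11.6            # mm^2 per core + 2.5MB L3 slice
for factor in (1.5, uncore_factor):
    die = 40 * broadwell_core_l3 * factor
    print(f"40C at {factor:.2f}x uncore: ~{die:.0f} mm^2")
# -> ~696 mm^2 and ~735 mm^2, within rounding of the 698/735 figures above.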

Has anyone made a CPU that big? Nope. Has Intel made anything in the 700mm2 range that wasn't a niche part like Itanium or Xeon Phi? Nope. Actually, Intel's die size record is Itanium "Tukwila" at 698.75mm2.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
Buddy, they won't do it, simple as that.

With Haswell-EX, a core + 2.5MB L3 slice takes up 23.3mm2, so 18 of them should be 419mm2, but the actual Haswell-EX die is 58% larger at 664mm2.

If you take your assumption in post #22 and Skylake cores stay EXACTLY the size of Broadwell cores, and the actual die size is 1.5x the combined core + L3, we end up at 698mm2. With 1.58x like Haswell-EX it would be 735mm2. If Skylake cores are merely 10% larger than Broadwell, we end up at 760mm2!

I don't think the uncore (minus L3 cache) will scale by the same 1.58x factor over core + L3 on a 40C Skylake. (Cores are more than doubling, but the uncore minus cache is not going to more than double.)

For example, on the 40C Skylake my prediction was 60 PCIe lanes, not 90 PCIe lanes (scaling Haswell-EX's 40 lanes linearly with cores gives 40/18 * 40 ≈ 89 lanes, rounded up to 90).
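
That linear scaling is trivial to check (a quick sketch of the arithmetic above; 40 lanes on 18 cores is the Haswell-EX baseline):

Code:
import math

haswell_lanes, haswell_cores = 40, 18
skylake_cores = 40

# Scale lane count linearly with core count.
scaled = haswell_lanes / haswell_cores * skylake_cores
print(scaled)             # ~88.9
print(math.ceil(scaled))  # 89, which the post above rounds up to 90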

Don't forget the Omni-Path interconnect, 2 extra memory channels, perhaps even another memory controller type to support the "persistent memory", and more PCI Express links, and we may be in the 850mm2+ range!

The extra memory channels (beyond four) and extra PCIe links should already be part of the transistor budget when you multiply cores and cache by a factor of 1.5 or 1.58; that multiplier is supposed to cover the uncore minus cache.
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
I don't think the uncore (minus L3 cache) will scale by the same 1.58x factor over core + L3 on a 40C Skylake. (Cores are more than doubling, but the uncore minus cache is not going to more than double.)

For example, on the 40C Skylake my prediction was 60 PCIe lanes, not 90 PCIe lanes (scaling Haswell-EX's 40 lanes linearly with cores gives 40/18 * 40 ≈ 89 lanes, rounded up to 90).

It's a good thing we know how many cores actual Skylake server parts will really have. Otherwise we'd be talking about this forever.

My post about the four generations of EX chips should be a good clue as to why they won't do that. Also, maybe you'd like to tell me why a server chip, which has far more routing and uncore circuitry than the PC chips AT Forums peeps talk about, would have a smaller proportion of it when that same generation is going to add more of those non-core components than ever.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
Actually, considering that the 28-core Skylake-EX on that slide is likely going to end up in the ~650mm2 range, it's quite likely a 40-core one would end up not too far from the 1000mm2 die someone else predicted.

650 mm2 for 28C or 1000 mm2 for 40C?

I think to get that big, Intel would have to increase L3 cache to more than 2.5 MB per core.

...And who knows? Maybe Intel feels that cache is more important than extra cores at the moment.

It will be interesting to see where this 14nm 28C Skylake chip ends up as far as die size, and how much cache is used per core.
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
According to RWT, the Intel patents regarding memory suggest Purley is using a form of expanded caching system: main memory acts as a DRAM cache in front of NAND flash (SSD).

It probably needs a very robust, hardware-based implementation, but it's not as exotic as we thought. Still, if it works it might be a really good feature.
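
To see why the caching has to be robust, here's a toy effective-latency model (entirely my own illustration with assumed latencies, not anything from the RWT article or the patents):

Code:
# Effective access time when DRAM caches NAND flash (assumed latencies).
dram_ns = 80        # assumed DRAM access latency
nand_ns = 50_000    # assumed NAND read latency (~50 us)

for hit_rate in (0.90, 0.99, 0.999):
    eff = hit_rate * dram_ns + (1 - hit_rate) * nand_ns
    print(f"hit rate {hit_rate:.1%}: ~{eff:,.0f} ns effective")
# Even a 1% miss rate makes memory look ~7x slower than raw DRAM.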
 
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
With 14nm being a 1.5-node jump from 22nm (16nm would have been the classic 1-node jump from 22nm) and Broadwell's core size a mere 6.9 mm2, I really don't think a 40C Skylake would be tough for Intel to do.

Some caveats (that I can remember) were:

1.) Skylake core size was substantially larger than the ~6.9 mm2 Broadwell core.

2.) Intel felt the need to substantially increase L3 cache on Skylake (for whatever reason). This would have meant less room for cores (for any given die size).

3.) Intel didn't want to lower clocks or raise TDP (but with IBM POWER8 already at 250 watts I don't see this as a true obstacle).

4.) Memory bandwidth (my guess was that eight channels could have handled 40C, with headroom for additional cores on the 10nm Cannonlake E7)

The only caveat is you keep forgetting the supporting infrastructure, as well as the fact that 250W CPUs don't sell. Just because company X does something doesn't mean it's a success or should be done.

It was obvious that 40C and 250W were far from reality, and now we have the numbers. And everyone told you so.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
The only caveat is you keep forgetting the supporting infrastructure, as well as the fact that 250W CPUs don't sell. Just because company X does something doesn't mean it's a success or should be done.

It was obvious that 40C and 250W were far from reality, and now we have the numbers. And everyone told you so.

1.) You don't know how large this Skylake 28C chip will be. (Remember, the largest server chip on 32nm, Westmere-EX, was only 513 mm2.)

2.) Just because Intel could make some 250W variants out of a 40C die doesn't mean all the chips would have to be 200+ watts. In fact, at the same per-core power envelope as Core M, a 40C could drop to as low as 90 watts (I'm sure 145 to 165 watts would be more practical for such a chip, though). Then factor in that, due to functional and parametric yields, not all 40C dice will yield 40C. Some of these will ship as 30C, 32C, 34C, 36C, or 38C parts at various clock speeds and TDPs.
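
The 90-watt figure is just per-core power scaling (a quick sketch; using Core M's 4.5 W TDP across 2 cores as the baseline is my assumption):

Code:
# Core M: ~4.5 W TDP across 2 cores -> 2.25 W per core (assumed baseline).
core_m_w_per_core = 4.5 / 2
print(f"40C at Core M power density: {40 * core_m_w_per_core:.0f} W")  # 90 W

for tdp in (90, 145, 165):
    print(f"{tdp} W / 40 cores = {tdp / 40:.2f} W per core")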
 
Last edited:

tenks

Senior member
Apr 26, 2007
287
0
0
dude, let the 40C Skylake pipe dream die. it's officially confirmed to be 28C. why are you still talking about this?
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
dude, let the 40C Skylake pipe dream die. it's officially confirmed to be 28C. why are you still talking about this?

I think it is just as important to understand what Intel is capable of doing vs. what Intel actually ships.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
I think it is just as important to understand what Intel is capable of doing vs. what Intel actually ships.

I don't think your 40C dream is possible on 14nm. The important part is to be realistic.
 

tenks

Senior member
Apr 26, 2007
287
0
0
I think it is just as important to understand what Intel is capable of doing vs. what Intel actually ships.

Are you serious? That would require an entire forum dedicated to that topic alone. AMD hasn't been competitive on the CPU front for years. There are many, many times where Intel could have given us more than what they actually released. That is Intel in a nutshell. This is nothing new. In fact, that's technology in a nutshell. What we get vs what said companies are actually capable of.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Are you serious? That would require an entire forum dedicated to that topic alone. AMD hasn't been competitive on the CPU front for years. There are many, many times where Intel could have given us more than what they actually released. That is Intel in a nutshell. This is nothing new. In fact, that's technology in a nutshell. What we get vs what said companies are actually capable of.

I think this is a bit different. The 40-core CPU he's proposing (a) makes zero financial sense, (b) won't be performant, and (c) won't scale well over many sockets. Why would they make a reticle-limited CPU die just because?

Die size growth reaches an end, just like TDP increases stopped a few years ago. Who says 500W desktop CPUs are impossible? All you need is bundled liquid nitrogen cooling, with stores selling liquid nitrogen refill bottles. :p

There are many, many times where Intel could have given us more than what they actually released. That is Intel in a nutshell.
The fact that the exotic-cooling overclocking record sits at 7-8GHz and the air-cooling limit has been about 5GHz for the fastest CPUs across the last few process generations tells me they may indeed have genuine issues. Even though what you are saying is likely true, they are only increasing performance with the headroom between "what they could have released" and "what they have released" getting smaller and smaller.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
I think this is a bit different. The 40-core CPU he's proposing (a) makes zero financial sense, (b) won't be performant

40 cores @ 160 watts is pretty low power, only 4 watts per core. Low-power-per-core Xeons fetch less money than higher-power-per-core Xeons (compare the three 12-core Xeons linked below as an example; the per-core math is sketched after the links):

http://ark.intel.com/products/81713/Intel-Xeon-Processor-E5-2690-v3-30M-Cache-2_60-GHz (11.25 watts per core)

http://ark.intel.com/products/81709/Intel-Xeon-Processor-E5-2670-v3-30M-Cache-2_30-GHz (10 watts per core)

http://ark.intel.com/products/81903/Intel-Xeon-Processor-E5-2650L-v3-30M-Cache-1_80-GHz (5.4 watts per core)
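
For reference, the per-core math on those three parts (TDPs and core counts as listed on ARK):

Code:
# TDP / core count for the three E5 v3 parts linked above.
xeons = {"E5-2690 v3": (135, 12),
         "E5-2670 v3": (120, 12),
         "E5-2650L v3": (65, 12)}

for name, (tdp, cores) in xeons.items():
    print(f"{name}: {tdp} W / {cores} cores = {tdp / cores:.2f} W per core")
# 11.25, 10.00, and 5.42 W per core respectively.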

Lower money per core = lower incentive for Intel to produce.

However, this assumes Intel has no other competition.

If some competitor were to arise, Intel incentives to produce such chips would increase.

With that mentioned, a separate issue beyond power per core is density, and a 40C @ 160 watts would have that as an advantage.

Another thing to think about is IBM POWER8 at 250 watts versus an Intel 40C at the same TDP (although this is more of a niche case).

(c) won't scale well over many sockets.

Why do you think 40C would not scale well over many sockets?
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
Regarding high core counts on a 1P system, I thought the following was enlightening:

http://www.anandtech.com/print/8584...nd-e5-2650-v3-review-haswell-ep-with-10-cores

Intel Xeon E5-26xx v3 10 Cores Conclusion

Intel’s product stack for 2P capable CPUs is somewhat frustrating. The lower cost models always offer the best value for money, but getting a more expensive and faster CPU means that you end up with a faster unit. So if a user is buying purely on bang-for-buck, they might end up with a quad core.

The essence of the workstation is always centered on compute-limited throughput. I have mentioned this in a previous review – almost all computer usage can be split into idea-limited throughput or compute-limited. For the former, the user needs a faster brain, but for the latter a super-fast CPU is needed. Being able to get through a compute task even faster means the user is able to complete contracts quicker enabling more work and more money. Ultimately this means that if it can be justified in getting a higher core count processor, even at the expense of 100-200 MHz per thread, it might be worth investing in another $500.

In my previous existence requiring workstation CPUs, I was naïve and assumed that a 2P rig was the way to go – I even convinced my boss to invest in three for our simulation team. Our basic C++ simulations used threads, but no-one in the team understood about thread and cache management, let alone NUMA programming, because we were more chemists than computer scientists. I always encourage users to test their software on 1P and 2P workstations before convincing the people with the money to buy a machine – depending on the software, a big 1P system might have fewer cores but the cache management might increase throughput even more.

So besides not needing software that supports NUMA, a big 1P has advantages in cache management that can make up for having fewer total cores than two smaller processors in a 2P configuration.

Where the line gets drawn in various scenarios is, I'm sure, very interesting, especially when you consider how well clock speed vs. power consumption scales on these cores.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Regarding IBM POWER, future versions will have NVLink, which will allow Nvidia GPUs high-bandwidth access to the CPU's memory.

At the same time, Skylake Xeons will have AVX-512, so that helps.

Still, I would think Intel would want something that breaks more with tradition to help it compete with IBM (specifically when combined with Nvidia) as well as with IBM's standalone Blue Gene processors (high core count at low power consumption).

In any event, it will be interesting to see where die sizes end up.

Is this going to be a generation of chips with large dies (e.g., the 45nm or 22nm Xeons)? Or are we going to see Intel go with smaller dies like they did with the 32nm Xeons?

P.S. I noticed that of the top 10 supercomputers, three use Nvidia accelerators ---> http://www.top500.org/lists/2014/11/ (one pairs AMD with Nvidia and the other two pair Xeon with Nvidia). Four of the systems use IBM Blue Gene (16C @ 1.6 GHz).
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
Here is what the number 2 and number 3 supercomputers on that Top 10 list are moving on to:

http://www.nvidia.com/object/exascale-supercomputing.html

#2 (Oak Ridge) moves from AMD Opteron + Nvidia GPU to IBM POWER (with NVLink) + Nvidia GPU. (The IBM POWER uarch takes over from x86 here.)

#3 (Lawrence Livermore) moves from IBM Blue Gene (a standalone high-core-count, low-power CPU built on 45nm) to IBM POWER (with NVLink) + Nvidia GPU.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Regarding the use of Xeon E5 with Xeon Phi, I like that idea a lot (of course). However, I think Intel's CPUs should also be good enough to be considered the best host option for Nvidia (for customers who want to use Nvidia Tesla for whatever particular reason).
 
Last edited: