Discussion Intel current and future Lakes & Rapids thread


mikk

Diamond Member
May 15, 2012
4,133
2,136
136
About XE_LPD:

Upcoming platforms will be using an updated display architecture called "XE_LPD." Despite the new name, XE_LPD is a pretty natural evolution from the current design we've been using on TGL, RKL, DG1, and ADL-S.

The arrival of this new display architecture coincides with a general disaggregation of Intel GPUs' architecture version numbering for the different component IP blocks. Going forward it isn't accurate to talk about a platform using INTEL_GEN() anymore since the various IP blocks (graphics, media, display) are moving to independent internal numbering schemes that may have different granularity and move at different cadences; the hardware teams have asked us to start tracking these values separately for "graphics," "media," and "display" such that anywhere that we need to do a numerical comparison on the architecture version, we should need to use an IP-specific version number instead of INTEL_GEN().
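To make the idea of per-IP version checks concrete, here is a minimal sketch. It is not the actual driver code: the `IPVersions` record, the `needs_new_display_path` helper, and the version numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPVersions:
    """Hypothetical per-IP version record (names are illustrative only)."""
    graphics: int
    media: int
    display: int


def needs_new_display_path(ver: IPVersions, min_display: int = 13) -> bool:
    # Compare only the display IP version instead of a single
    # platform-wide generation number.
    return ver.display >= min_display


# Made-up values: the display block is newer than the graphics and media blocks.
platform = IPVersions(graphics=12, media=12, display=13)
print(needs_new_display_path(platform))
```

The point of the sketch is that a single generation number could not describe this platform, while three independent fields can.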


This is a good decision: they are much more flexible if they can upgrade the various IP blocks independently of each other in the future.
 

jpiniero

Lifer
Oct 1, 2010
14,584
5,206
136
In summary, your argument has moved me I will admit that but I'm still not convinced that Intel has remained at 14nm this long solely due to technical issues,

I do think that Intel cancelled Ice Lake desktop when they did because of clock speeds. But in general it's pretty much been yields.
 

Asterox

Golden Member
May 15, 2012
1,026
1,775
136
If we look at Cinebench R20, the "i5 Rocket overclocked to 4.8ghz" result is still below the stock R5 5600X's Cinebench R20 score of 4500.


 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,025
136
I like your analysis but I think there are a couple of factors not considered in your argument.

Regarding your point about going to denser nodes, yes that is correct the primary reason to move to a denser node was to add to the transistor budget while keeping die size the same or reducing it. Clocks were not a major concern.

But I think we have to consider two additional facts. First, until the transition to 22nm the progression was relatively smooth. 14nm was the first time that Intel encountered technical process "resistance" that actually caused significant production delays. In non-technical terms the "tick" used to be easy.

Also, during those ticks Intel could have afforded a regression in fmax because AMD was so far behind, Apple had no or limited CPU design, and ARM wasn't a threat.

The tremendous issues Intel has been having with process for the last 5 years, coupled with the fact that AMD has caught up and passed them, have I would argue created a completely new dynamic, forcing Intel to remain at 14nm for that extra 10% clock speed increase. While Tiger Lake clocks do seem quite high, we don't know whether, due to density issues (hotspots), they have only just now worked out how to get 8 of these 10SF Willow Cove cores operating at high clocks in nT mode.

In summary, your argument has moved me, I will admit that, but I'm still not convinced that Intel has remained at 14nm this long solely due to technical issues. I think part of it is due to AMD (and others) breathing down their necks, forcing the need for the highest clocks possible to remain competitive. I also think sufficient yield for the number of parts they need to ship is (was) part of it.

I will continue to argue that yield was an issue prior to the introduction of 10SF, but capacity is their current issue.

Intel has to supply many times the amount of chips as AMD.

If yield were an issue, Intel wouldn’t be shipping Ice Lake SP.

That is the same reason we don’t yet have 8 core Tiger Lake. Why halve your already limited production? A 4 core Tiger Lake performs competitively with 6 core Renoir.

I actually can’t wait to see how well the top 8 core part performs and more importantly, how much power it uses.

Outside of that, Intel was far too aggressive with node shrinks IMO and that is what got them in trouble (among other things).

TSMC has had little issue moving to smaller nodes.

IMO Intel should shoot for smaller, more iterative shrinks and move back to a tick tock cadence.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,025
136
If we look at Cinebench R20, the "i5 Rocket overclocked to 4.8ghz" result is still below the stock R5 5600X's Cinebench R20 score of 4500.



I get the feeling that Intel underestimated AMD.
 

coercitiv

Diamond Member
Jan 24, 2014
6,187
11,854
136
But I think we have to consider two additional facts. First, until the transition to 22nm the progression was relatively smooth. 14nm was the first time that Intel encountered technical process "resistance" that actually caused significant production delays. In non-technical terms the "tick" used to be easy.

Also, during those ticks Intel could have afforded a regression in fmax because AMD was so far behind, Apple had no or limited CPU design, and ARM wasn't a threat.
We look at AMD and think "Oh, Intel needs higher clocks to win", but that's completely reactionary. They need better architectures and better sustained execution to win. Clocks are just part of the equation, the final tune if you will.

AMD won market share despite having lower clocked parts. Apple is winning in benchmarks despite having lower clocked parts.

The tremendous issues Intel has been having with process for the last 5 years coupled with the fact that AMD has caught up and passed them I would argue has created a completely new dynamic forcing the requirement to remain at 14nm for that extra 10% clock speed increase.
In order for this to be true, in order for Intel to stay on 14nm due to performance reasons the following should be happening:
  • core count should not be limited mainly due to die size, yet here we are seeing core count regression and MT performance regression
  • power should be under control, yet here we are seeing Intel inflating power budget with every gen to stay competitive
  • clocks should be 10%+ higher, yet we're seeing less than 5% instead... and mind you this comes at the cost of lower sustained clocks when TDP limited (a 10nm 65W TDP 8-core part will easily beat the 14nm 65W TDP parts)
On top of this RKL-S is still looking strange in the inter-core latency department, if this turns out to be in any way related to 14nm then we can add another major point to the list above.

Clocks are not the problem, it has to do with a mix of yields (initially) and volumes (currently). They're not going to make 10nm desktop parts and risk losing mobile market share due to lower 10nm SoC supply.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,025
136
We look at AMD and think "Oh, Intel needs higher clocks to win", but that's completely reactionary. They need better architectures and better sustained execution to win. Clocks are just part of the equation, the final tune if you will.

AMD won market share despite having lower clocked parts. Apple is winning in benchmarks despite having lower clocked parts.


In order for this to be true, in order for Intel to stay on 14nm due to performance reasons the following should be happening:
  • core count should not be limited mainly due to die size, yet here we are seeing core count regression and MT performance regression
  • power should be under control, yet here we are seeing Intel inflating power budget with every gen to stay competitive
  • clocks should be 10%+ higher, yet we're seeing less than 5% instead... and mind you this comes at the cost of lower sustained clocks when TDP limited (a 10nm 65W TDP 8-core part will easily beat the 14nm 65W TDP parts)
On top of this RKL-S is still looking strange in the inter-core latency department, if this turns out to be in any way related to 14nm then we can add another major point to the list above.

Clocks are not the problem, it has to do with a mix of yields (initially) and volumes (currently). They're not going to make 10nm desktop parts and risk losing mobile market share due to lower 10nm SoC supply.

If they had gone the chiplet route they probably could have released chips with more cores, but they are moving to 10nm so...
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
That last sentence about not making 10nm desktop parts while risking their mobile market is key.

You have to look at what Intel was seeing WHEN THEY GREEN LIT ROCKET LAKE. That was over two years ago. 10sf hadn't proven itself yet. 10nm Ice Lake was proving to be problematic for clocks and yields. OG 10nm was worse. They had an expected timeline for their competition. Why would they NOT choose to find a way to make a more competitive part, power notwithstanding, on their current volume node when it is certain that their leading edge node is significantly at risk?
 

CakeMonster

Golden Member
Nov 22, 2012
1,389
496
136
Intel tweaked the architecture every year until Skylake. There were always changes and I loved reading about them even if they were small. Had they planned to continue doing that, and was it only the 10nm failure that kept them from doing that? Or were they happy with Skylake and planned on slowing down with design changes?
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
If they had gone the chiplet route they probably could have released chips with more cores, but they are moving to 10nm so...

I'm not convinced this would be the case. Intel had only just recently been able to come up with a sellable 8 core 10sf part. Judging by Ice Lake's performance, I don't think they had the ability to make a competitive chiplet-based design comparable to Ryzen 3k and 5k desktop.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
Intel tweaked the architecture every year until Skylake. There were always changes and I loved reading about them even if they were small. Had they planned to continue doing that, and was it only the 10nm failure that kept them from doing that? Or were they happy with Skylake and planned on slowing down with design changes?
That's when their competition hit a roadblock. There was no more competitive pressure, and profit taking was to be had. On top of that, they were anticipating 10nm going more quickly as well. They had new core designs, but those were matched to 10nm.
 

RTX

Member
Nov 5, 2020
90
40
61
Is TSMC unable to place the chips closer together compared to Sapphire Rapids? Looks to be like 2mm vs 0.01mm difference between the die spacing. Will this cause a large enough performance difference between the two? Ryzen 5950X still has large gaps between the dies compared to those 4 dies.


 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
Has it even been needed up until now? That level of precision placement hasn't been warranted for their products up to this point. Keep in mind, Intel isn't using a separate IO die on their package. This is still a cluster of monolithic processor dies. If AMD was using that, they would probably use the tight groupings between the CCDs instead of what they have. The extra gaps for AMD also help them reduce the thermal density of the whole package, which is important with ever increasing density nodes.

I suspect that, once AMD moves to active interposer technology, they'll have tighter chip spacing on their server packages.
 

btarlinian

Junior Member
Jun 23, 2020
8
15
51
Is TSMC unable to place the chips closer together compared to Sapphire Rapids? Looks to be like 2mm vs 0.01mm difference between the die spacing. Will this cause a large enough performance difference between the two? Ryzen 5950X still has large gaps between the dies compared to those 4 dies.



Sapphire Rapids is almost definitely using EMIB to connect the CPU chiplets. (These are small pieces of silicon with high density interconnects placed underneath the edges of the top dies.) This inherently requires the top dies to be relatively close to each other in order to avoid having this bridge silicon be larger than necessary. AMD's chiplet approach is using standard in-package routing to connect the chiplets, which likely has larger power costs. It's possible that the higher density EMIB interconnect may allow Intel to span the standard within-die mesh across all four dies, allowing them to avoid the need for in-socket NUMA even without a central die.
 

Hulk

Diamond Member
Oct 9, 1999
4,214
2,006
136
We look at AMD and think "Oh, Intel needs higher clocks to win", but that's completely reactionary. They need better architectures and better sustained execution to win. Clocks are just part of the equation, the final tune if you will.

AMD won market share despite having lower clocked parts. Apple is winning in benchmarks despite having lower clocked parts.


In order for this to be true, in order for Intel to stay on 14nm due to performance reasons the following should be happening:
  • core count should not be limited mainly due to die size, yet here we are seeing core count regression and MT performance regression
  • power should be under control, yet here we are seeing Intel inflating power budget with every gen to stay competitive
  • clocks should be 10%+ higher, yet we're seeing less than 5% instead... and mind you this comes at the cost of lower sustained clocks when TDP limited (a 10nm 65W TDP 8-core part will easily beat the 14nm 65W TDP parts)
On top of this RKL-S is still looking strange in the inter-core latency department, if this turns out to be in any way related to 14nm then we can add another major point to the list above.

Clocks are not the problem, it has to do with a mix of yields (initially) and volumes (currently). They're not going to make 10nm desktop parts and risk losing mobile market share due to lower 10nm SoC supply.

I would like to discuss this but I also want to say that I respect your knowledge and opinion and I am not trying to "win" a discussion only exchange thoughts.

"Intel needs higher clocks to win."
With the release of Zen 3 Intel had fallen behind with IPC. Really far behind. Skylake, a very good architecture, was 5 years old and showing its age. Preliminary reports from Sunny Cove (Rocket Lake) also seem to indicate that Intel is still behind Zen 3 clock-for-clock. What to do? Transition to 10SF, most likely lose a small amount of overall performance and fall even further behind in overall performance, or try to "hold as much ground" as possible by staying at 14nm? Seems like the smart thing to do would be to hold at 14nm. In addition, it is very possible that Rocket Lake on 14nm will maintain the gaming crown for Intel for the moment. Yes, it's a paper trophy but it's something to hold them over until the next architecture, which as you stated is what is really needed.

As for core count, moving to 10SF wasn't going to allow them to compete with Zen 3. Maybe they could squeeze 12 cores into a monolithic die, maybe, but not 16. Meaning again they would lose the gaming crown (fmax) and overall performance metric to the 5950X.

I am under the assumption that Intel isn't worried about power on the desktop as much as being competitive with AMD. Again, they want that paper champion gaming title.

14nm is topped out with the Core architecture I think. Without a major core redesign, there is no more frequency left at 14+++++++++++++ without extreme cooling measures.

I think your last point is the strongest. 10SF yields, and the fact that they may need all current 10SF capacity for mobile parts, may have been the overriding factor that forced Rocket Lake onto 14nm. I admit that is probably what happened, and holding the gaming title by virtue of the extra 500MHz was probably an added benefit despite the power issues.
 

cortexa99

Senior member
Jul 2, 2018
319
505
136
11600kf & 11400f ES tested

edit: likely not ES but QS...... 'ES' reported by CPU-Z might be a wrong recognition
 

mikk

Diamond Member
May 15, 2012
4,133
2,136
136
So Raptor Lake-P will support LPDDR5x according to the slide. I was searching for more info:
JEDEC has extended the clock frequencies supported by its latest low-power memory offering, LPDDR5, to include 937MHz and 1066MHz, which translate to max data rates of 7500 MT/s and 8533 MT/s.

In summary the latest LPDDR max data rates:

LPDDR4-3200
LPDDR4x-4266
LPDDR5-6400
LPDDR5x-8533
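As a rough sanity check on the quoted JEDEC figures, the sketch below assumes 8 transfers per clock, a ratio inferred from the quoted numbers themselves (7500 / 937 is roughly 8). It also assumes the quoted 937MHz and 1066MHz are rounded from 937.5 and 1066.67 MHz; both assumptions are mine, not from the slide.

```python
# Assumption inferred from the quoted figures: 8 transfers per clock cycle.
TRANSFERS_PER_CLOCK = 8


def data_rate_mts(clock_mhz: float) -> float:
    """Max data rate in MT/s for a given clock in MHz."""
    return clock_mhz * TRANSFERS_PER_CLOCK


# Clocks assumed to be the unrounded values behind "937MHz" and "1066MHz";
# 800 MHz is included as the clock that would give LPDDR5-6400.
for clock in (800.0, 937.5, 1066.67):
    print(f"{clock:.2f} MHz -> {data_rate_mts(clock):.0f} MT/s")
```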
 

coercitiv

Diamond Member
Jan 24, 2014
6,187
11,854
136
With the release of Zen 3 Intel had fallen behind with IPC. Really far behind. Skylake, a very good architecture, was 5 years old and showing its age. Preliminary reports from Sunny Cove (Rocket Lake) also seem to indicate that Intel is still behind Zen 3 clock-for-clock. What to do? Transition to 10SF, most likely lose a small amount of overall performance and fall even further behind in overall performance, or try to "hold as much ground" as possible by staying at 14nm? Seems like the smart thing to do would be to hold at 14nm.
Let's take a look at what the rumor mill has on Tiger Lake H vs. Rocket Lake S, see if RKL-S is indeed the better option:

Core i9-11980HK
3.1Ghz base @ 65W, 4.4 all-core turbo, 5.0 max turbo

Core i9-11900
2.5Ghz base @ 65W, 4.6 all-core turbo, 5.1 max turbo

We're seeing ~25% higher all-core clocks at ISO power, ~5% lower all-core turbo, and ~2% lower max turbo. And that's comparing a die that was specifically developed for mobile vs. a die specifically developed for desktop. If you want ST performance then 10SF with a desktop oriented design would most likely manage to stay within 2-5% of what 14nm does. Meanwhile, in the productivity department the 10SF implementation could easily offer 15-25% increased throughput.
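For anyone who wants to check the percentages, here is a small sketch that recomputes them from the rumored clocks listed above. The `pct_diff` helper is just an illustrative name; the clock values are copied from the two spec lines and are rumors, not measured data.

```python
def pct_diff(value: float, reference: float) -> float:
    """Percentage difference of value relative to reference."""
    return (value - reference) / reference * 100.0


# Rumored clocks in GHz, copied from the spec lines above.
tgl_h = {"base @ 65W": 3.1, "all-core turbo": 4.4, "max turbo": 5.0}
rkl_s = {"base @ 65W": 2.5, "all-core turbo": 4.6, "max turbo": 5.1}

# Tiger Lake H relative to Rocket Lake S for each clock figure.
for key in tgl_h:
    print(f"{key}: {pct_diff(tgl_h[key], rkl_s[key]):+.1f}%")
```

This gives roughly +24% for the 65W base clock, about -4.3% for all-core turbo, and about -2% for max turbo, in line with the rounded figures in the post.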

Do you honestly believe Intel gave up 15-25% MT performance for 2-3 FPS in games?

I am under the assumption that Intel isn't worried about power on the desktop as much as being competitive with AMD. Again, they want that paper champion gaming title.
Power is performance. Intel is worried about performance. Intel is therefore worried about power as well, only it does not show once they are forced to use an inferior node for their desktop products.

Let's wait for Intel to launch both RKL-S and TGL-H and revisit this subject with real silicon tests, see how fmax ends up with 10SF, see how TGL-H measures up in MT performance @ 65W.
 

yuri69

Senior member
Jul 16, 2013
387
616
136
Is TSMC unable to place the chips closer together compared to Sapphire Rapids? Looks to be like 2mm vs 0.01mm difference between the die spacing. Will this cause a large enough performance difference between the two? Ryzen 5950X still has large gaps between the dies compared to those 4 dies.


As was already said, Intel used an advanced packaging technology like EMIB or a further derivative of it. They get a pin density and power advantage here. It will be interesting to see if SPR (and ADL?) can use this power advantage.

AMD is still stuck on the good ol' MCM technology. The same approach was used for AMD Magny-Cours in 2010. There are rumors about a TSMC technology being used for Trento and/or Genoa.
 

RTX

Member
Nov 5, 2020
90
40
61
Can this thing go to 190W at 4ghz to equalize with the 280W 64core at 3ghz? 28761 / 32845 * 4.6ghz = 4.02ghz with a 20% less voltage and 4% less from the heat?
290W / 1.20 / 1.20 / 1.04 ~ 190W or 290W * ( 4 / 4.6 )^3 ~ 190W
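The two estimates above can be reproduced with a short sketch. It assumes the common rule of thumb that dynamic power scales with frequency times voltage squared, and that voltage scales roughly linearly with frequency, which gives a cubic relation. That model is an assumption for illustration, not a measured result, and the function names are made up.

```python
def scaled_clock(target_score: float, ref_score: float, ref_clock_ghz: float) -> float:
    """Clock needed to hit target_score, assuming score scales linearly with clock."""
    return target_score / ref_score * ref_clock_ghz


def cubic_power(ref_power_w: float, new_clock_ghz: float, ref_clock_ghz: float) -> float:
    """Rule-of-thumb estimate: power ~ f * V^2 with V ~ f, so power ~ f^3."""
    return ref_power_w * (new_clock_ghz / ref_clock_ghz) ** 3


def stepwise_power(ref_power_w: float) -> float:
    """The post's stepwise estimate: two 20% factors and a 4% heat factor."""
    return ref_power_w / 1.20 / 1.20 / 1.04


print(f"clock: {scaled_clock(28761, 32845, 4.6):.2f} GHz")
print(f"cubic estimate: {cubic_power(290, 4.0, 4.6):.1f} W")
print(f"stepwise estimate: {stepwise_power(290):.1f} W")
```

Both routes land within a few watts of 190W, so the two back-of-envelope estimates are at least consistent with each other.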

Let's take a look at what the rumor mill has on Tiger Lake H vs. Rocket Lake S, see if RKL-S is indeed the better option:

Core i9-11980HK
3.1Ghz base @ 65W, 4.4 all-core turbo, 5.0 max turbo

Core i9-11900
2.5Ghz base @ 65W, 4.6 all-core turbo, 5.1 max turbo

We're seeing ~25% higher all-core clocks at ISO power, ~5% lower all-core turbo, and ~2% lower max turbo. And that's comparing a die that was specifically developed for mobile vs. a die specifically developed for desktop. If you want ST performance then 10SF with a desktop oriented design would most likely manage to stay within 2-5% of what 14nm does. Meanwhile, in the productivity department the 10SF implementation could easily offer 15-25% increased throughput.

Do you honestly believe Intel gave up 15-25% MT performance for 2-3 FPS in games?

Power is performance. Intel is worried about performance. Intel is therefore worried about power as well, only it does not show once they are forced to use an inferior node for their desktop products.

Let's wait for Intel to launch both RKL-S and TGL-H and revisit this subject with real silicon tests, see how fmax ends up with 10SF, see how TGL-H measures up in MT performance @ 65W.
Isn't that for non AVX base clock for the i9-11900 @45W?
This one says 3.3Ghz base AVX512 @65W for the i9-11980HK cpu with a 2.6Ghz base AVX512 @45W

Don't we need to isolate the igpu from the 28W 1185G7 / 11375H chip and the i9-11980HK ( 32EU )?
i7-1185G7
3.0Ghz base @28W

i7-11375H
3.0Ghz base @28W with a 3.3Ghz base @35W

The 25W DG1 only has 80 EUs at 1.65ghz vs the 96 EUs at 1.35ghz. It should take roughly half the power of the i7-1185G7
If the 25W DG1 was scaled down to 1.35Ghz and increased the EUs to 96 from 80 with a 20% less voltage with 10% less from heat, it should roughly be ~15W from the integrated graphics, right? And 32 EUs ~5W?

Doubling up the quad core and reducing the EUs to 1/3 would be roughly 45W at 3.3ghz in non AVX?
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,025
136
Let's take a look at what the rumor mill has on Tiger Lake H vs. Rocket Lake S, see if RKL-S is indeed the better option:

Core i9-11980HK
3.1Ghz base @ 65W, 4.4 all-core turbo, 5.0 max turbo

Core i9-11900
2.5Ghz base @ 65W, 4.6 all-core turbo, 5.1 max turbo

We're seeing ~25% higher all-core clocks at ISO power, ~5% lower all-core turbo, and ~2% lower max turbo. And that's comparing a die that was specifically developed for mobile vs. a die specifically developed for desktop. If you want ST performance then 10SF with a desktop oriented design would most likely manage to stay within 2-5% of what 14nm does. Meanwhile, in the productivity department the 10SF implementation could easily offer 15-25% increased throughput.

Do you honestly believe Intel gave up 15-25% MT performance for 2-3 FPS in games?


Power is performance. Intel is worried about performance. Intel is therefore worried about power as well, only it does not show once they are forced to use an inferior node for their desktop products.

Let's wait for Intel to launch both RKL-S and TGL-H and revisit this subject with real silicon tests, see how fmax ends up with 10SF, see how TGL-H measures up in MT performance @ 65W.

Don’t forget that RKL is “65w” while TGL will likely be quite a bit closer to 65w because it is a mobile variant. That is why I want to get my hands on the 8-core part. It will be interesting to compare power usage vs. performance between the two chips.

EDIT: TGL looks like it scales quite nicely with higher power limits and TDPs: https://browser.geekbench.com/v5/cpu/6744745
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,025
136
Is this comment serious?

What is not serious about it? Ice Lake SP contains up to 32 cores. They’ve shipped more than 100,000 units according to AT. If they were having yield issues they wouldn’t be shipping large dies at all.
 

Hulk

Diamond Member
Oct 9, 1999
4,214
2,006
136
Let's take a look at what the rumor mill has on Tiger Lake H vs. Rocket Lake S, see if RKL-S is indeed the better option:

Core i9-11980HK
3.1Ghz base @ 65W, 4.4 all-core turbo, 5.0 max turbo

Core i9-11900
2.5Ghz base @ 65W, 4.6 all-core turbo, 5.1 max turbo

We're seeing ~25% higher all-core clocks at ISO power, ~5% lower all-core turbo, and ~2% lower max turbo. And that's comparing a die that was specifically developed for mobile vs. a die specifically developed for desktop. If you want ST performance then 10SF with a desktop oriented design would most likely manage to stay within 2-5% of what 14nm does. Meanwhile, in the productivity department the 10SF implementation could easily offer 15-25% increased throughput.

Do you honestly believe Intel gave up 15-25% MT performance for 2-3 FPS in games?


Power is performance. Intel is worried about performance. Intel is therefore worried about power as well, only it does not show once they are forced to use an inferior node for their desktop products.

Let's wait for Intel to launch both RKL-S and TGL-H and revisit this subject with real silicon tests, see how fmax ends up with 10SF, see how TGL-H measures up in MT performance @ 65W.

My initial claim/opinion was that if AMD was currently at Zen (original) level IPC and topping out at 14nm then Intel could have transitioned to 10nm for the desktop with 10SF, perhaps even the 10nm Ice Lake process. The reason is because they would have had a comfortable lead in both fmax and IPC.

But since AMD is breathing down their necks they held the desktop at 14nm for that last 5 or 10% in fmax, which directly translates into overall performance.

Based on your posts I will admit that most likely the largest factor keeping Intel at 14nm was yields at 10nm AND pressure from AMD. I think we can both agree that 10SF must first fulfill mobile orders and then desktop should the capacity and performance be there.