Discussion Intel current and future Lakes & Rapids thread


OriAr

Member
Feb 1, 2019
63
35
91
I mean I have said it before. Intel obviously didn't stop CPU uarch development just because of their process issues. A huge jump was to be expected after what will probably end up being half a decade of Skylake cores on the desktop.

Obviously I expected a significant jump in IPC when ICL gets released because of that, but I did not expect TGL (or should I say Willow Cove) to have an almost equally big increase in IPC! I thought Intel was gonna go back to the usual ~10% IPC increases they are used to.
 

mikk

Diamond Member
May 15, 2012
4,113
2,109
136
How can you know there is a big IPC increase? We don't know the exact clock speeds and also there are no final Icelake scores on this site to compare with.
 

Racan

Golden Member
Sep 22, 2012
1,101
1,969
136
I don't see Intel advertising IPC for Willow Cove:


CPU-Core-Roadmap.jpg
 

OriAr

Member
Feb 1, 2019
63
35
91
How can you know there is a big IPC increase? We don't know the exact clock speeds and also there are no final Icelake scores on this site to compare with.

There have been a couple of benchmarks spotted on Userbenchmark that suggest TGL has as much performance @3.6GHz as CFL @5GHz (https://www.userbenchmark.com/UserRun/19310543 and https://cpu.userbenchmark.com/Intel-Core-i9-9900K/Rating/4028), pointing to an IPC increase of around 39% from SKL/CFL to TGL.
There is an average increase of 18% in IPC from CFL to ICL, meaning there is an increase of just below 18% in IPC from ICL to TGL.

Now, obviously the results need to be taken very carefully (hell, the clock reading might not even be accurate), so it's really just a very early estimate.
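
For anyone who wants to check the arithmetic, here is a rough sketch of the estimate. It assumes the two Userbenchmark scores really are equal and that the reported clocks (3.6 GHz and 5 GHz) are accurate, neither of which is confirmed. The function names are made up for illustration.

```python
def ipc_gain_at_equal_score(old_clock_ghz: float, new_clock_ghz: float) -> float:
    """IPC gain implied when a chip at new_clock_ghz matches the score
    of an older chip at old_clock_ghz (score ~ IPC * clock)."""
    return old_clock_ghz / new_clock_ghz - 1.0


def remaining_gain(total_gain: float, first_step_gain: float) -> float:
    """Split a compound gain: (1 + total) = (1 + first) * (1 + remaining)."""
    return (1.0 + total_gain) / (1.0 + first_step_gain) - 1.0


# Assumed inputs taken from the post: CFL @ 5 GHz vs TGL @ 3.6 GHz, ICL = +18% over CFL.
skl_to_tgl = ipc_gain_at_equal_score(5.0, 3.6)
icl_to_tgl = remaining_gain(skl_to_tgl, 0.18)

print(f"SKL/CFL -> TGL: {skl_to_tgl:.1%}")  # about 39%
print(f"ICL -> TGL:     {icl_to_tgl:.1%}")  # just below 18%
```

The gains compound rather than add, which is why the second step comes out slightly under 18% instead of 39 - 18 = 21.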
 

mikk

Diamond Member
May 15, 2012
4,113
2,109
136
I don't see Intel advertising IPC for Willow Cove:


It doesn't mean there won't be any IPC gains though. Imho it just says there won't be major IPC gains, instead more like a tick improvement. But the fact that there is a cache redesign implies it isn't a simple refresh.

There have been a couple of benchmarks spotted on Userbenchmark that suggest TGL has as much performance @3.6GHz as CFL @5GHz (https://www.userbenchmark.com/UserRun/19310543 and https://cpu.userbenchmark.com/Intel-Core-i9-9900K/Rating/4028), pointing to an IPC increase of around 39% from SKL/CFL to TGL.

There is an average increase of 18% in IPC from CFL to ICL, meaning there is an increase of just below 18% in IPC from ICL to TGL.


Now, obviously the results need to be taken very carefully (hell, the clock reading might not even be accurate), so it's really just a very early estimate.


The average increase might be 18%, but in some benchmarks it can be higher, so we have no clue. Possibly the IPC increase from Icelake in userbenchmark is well above average; the unknown 1C/2C Turbo (if there is one) doesn't help either. Of course the scores are insane for this early stage of TGL.
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,686
136
I usually enjoy Gordon's chill approach, but this time I disagree with his POV. Normally a drop in clocks should be expected, in fact a drop in clocks may actually be good news as long as it comes as a result of a successful IPC <-> Power trade. However, such a big drop in clocks that it offsets almost all performance gains combined with such low yields that the product itself barely makes it to the market in significant quantities is very hard to swallow with a straight face.

If we take a look back at Broadwell and its relatively short life, it did come with a small frequency drop, but overall MT performance saw a healthy increase because sustained clocks were higher. Notebookcheck reported a median CB15 MT score of 297 for i7 5600U while the previous gen 4600U scored near 250. That's a decent 15%+ performance increase in multithreaded workloads. Even so Intel could barely wait to replace Broadwell with Skylake parts as it brought 14nm to maturity.
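
As a quick sanity check on that "15%+" figure, a minimal sketch using the two scores quoted above. The 250 is only the approximate "near 250" value, so the result is a ballpark, not a measurement.

```python
# CB15 MT medians as quoted from Notebookcheck in the post above.
score_5600u = 297   # i7 5600U (Broadwell)
score_4600u = 250   # previous-gen 4600U, approximate ("near 250")

uplift = score_5600u / score_4600u - 1.0
print(f"MT uplift: {uplift:.1%}")  # comfortably above 15%
```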

Lower clocks aren't a disaster. Stagnating performance isn't a disaster. Low yields aren't a disaster. Equal efficiency even after a full node jump isn't a disaster. However, all of the above happening *at once* is definitely a disaster, unless some think a "disaster" can only mean performance regression on a product that sees the light of day in token quantities. (Cannon Lake says hi)

The only good news about ICL is the solid uarch improvements and the GPU performance gain. It shows Intel has a solid base to work with while they figure out a way to catch those wrench throwing monkeys in their foundries. Hope we get more (accurate) TGL leaks soon.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
Lower clocks aren't a disaster. Stagnating performance isn't a disaster. Low yields aren't a disaster. Equal efficiency even after a full node jump isn't a disaster. However, all of the above happening *at once* is definitely a disaster, unless some think a "disaster" can only mean performance regression on a product that sees the light of day in token quantities. (Cannon Lake says hi)

I do disagree. I am totally fine with equal performance, but if this does not pay off as a significant efficiency gain after a full node jump I would call it a disaster. Apparently the per-clock performance uplift is eating into power (and most likely area as well) more than linearly - extrapolating iso-process here.
 

Asterox

Golden Member
May 15, 2012
1,026
1,775
136
Yes but the scores on this site are really good so early in development, it beats the best i7-8565U devices on userbenchmark. The ES samples from Icelake 1 year before launch were much slower clocked, look at this from 11 months ago: https://www.userbenchmark.com/UserRun/10940550

10nm is clearly in much better shape now. I still wonder if Tigerlake is made on 10nm++ or really on 10nm+ as Intel claimed.

Not for desktop: 10nm yields are too low / very expensive, plus there's a fat CPU clock regression.

 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,686
136
I do disagree. I am totally fine with equal performance, but if this does not pay off as a significant efficiency gain after a full node jump I would call it a disaster. Apparently the per-clock performance uplift is eating into power (and most likely area as well) more than linearly - extrapolating iso-process here.
How can you disagree and agree at the same time? I specifically mentioned lower clocks and equal efficiency after a node jump not being a problem as long as they are not true at the same time.

PS: please, please no more AdoredTV.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
How can you disagree and agree at the same time?

Maybe I was not clear enough: I am drawing my conclusion from efficiency, independent of the highest achievable frequency. So even a higher achievable top frequency (and thus a performance increase) would not have changed my assertion.
 

jpiniero

Lifer
Oct 1, 2010
14,513
5,160
136
Even so Intel could barely wait to replace Broadwell with Skylake parts as it brought 14nm to maturity

Broadwell is exactly the blueprint for what I think Intel is doing wrt Tigerlake, except it's being driven by the ability to use chiplets to partially mitigate the horrible yield.
 

DrMrLordX

Lifer
Apr 27, 2000
21,583
10,785
136
Not a chance. Even single digit % improvement will be the exception rather than the norm.

Tigerlake is Willow Cove, right? The Intel roadmap featured in the video linked above by @Asterox shows Willow Cove having a cache redesign and transistor optimizations but no marked IPC uplift over Sunny Cove.
 

dmens

Platinum Member
Mar 18, 2005
2,271
917
136
Tigerlake is Willow Cove, right? The Intel roadmap featured in the video linked above by @Asterox shows Willow Cove having a cache redesign and transistor optimizations but no marked IPC uplift over Sunny Cove.

I don't work at that wretched clown show any more so the alphabet soup doesn't mean anything to me now, but I have a pretty good guess what "cache redesign" means and it is not performance related LOL.

But yeah afaik the upcoming coves are all practically the same core IP so whatever improvement will come from the memory/IO subsystem as opposed to the core itself. Peeling off codenames does not mean any work was actually done.
 

jpiniero

Lifer
Oct 1, 2010
14,513
5,160
136
Looking at the Tigerlake ES again, I would say that the IPC difference between it and Skylake (not Icelake) is roughly 25%, and that's assuming the ACT was 3.6 with a SCT of maybe 3.8-3.9.
 

repoman27

Senior member
Dec 17, 2018
342
488
136
I should add that if Tigerlake is using chiplets, they would have an incentive to not hold it back if it's ready because of the better yield they would get.
Tiger Lake, at least as far as client parts are concerned, is the follow-on to Ice Lake and is U/Y only. It's a monolithic, 10nm 4+2 die with 4 Willow Cove cores, GT2 Gen12/Xe graphics with up to 96 EUs, and integrated Thunderbolt 3/USB4. It's a ~150 mm^2 die all-in, and trying to break it into smaller pieces would pose way more challenges than could possibly be warranted.

If Tigerlake is not monolithic, then cutting the iGPU in half won't help them. I strongly suspect that, in light of Intel's yield problems on 10nm, Tigerlake will use EMIB to connect CPU and SoC die/iGPU in order to reduce die sizes and improve yields. It may just be that they can't reliably produce dice larger than 4c on 10nm in any significant quantity (see delays for Icelake-SP).

Rocketlake is a different issue altogether. Willow Cove implemented on 14nm++(+) as Rocket Lake would be an interesting product. Perhaps too little/too late given that it'll be a 2020 product, but still a step up from Coffeelake/Comet Lake.
There is no possible universe where TGL-U/Y are not monolithic, it simply makes zero sense financially and technically for Intel to do otherwise. There are already engineering samples of 38C 10nm ICL-SP XCC dies back from the fabs, no? (Not that they'll actually be able to yield well enough to make a profitable or even competitive product, but at least Intel wants everyone to know that they're attempting this.)

The only client chips we've seen on leaked roadmaps that even hint at a multi-chip module strategy are the Rocket Lake-U parts, the follow-on to Comet Lake-U. RKL-U is a monolithic, 14nm 6+1 die with 6 Willow Cove cores and GT1 Gen12/Xe graphics with up to 32 EUs. It will be available in 4 and 6-core flavors with GT0 (no integrated graphics), GT0.5 (16 EUs), or GT1 (32 EUs). Implementing the full 96 EU GT2 Gen12 configuration from TGL isn't practical at 14nm, so RKL is GT1 only, but for customers who are willing to pay more for graphics, it looks like Intel is planning MCMs with a discrete Xe GPU called "DG1". This is a Gen12 LP part and probably a 10nm die with a very similar layout to TGL GT2. The TGL-U CPU die is rumored to have a direct PCIe 4.0 x4 connection, sort of a U-series version of PEG lanes. I reckon that RKL-U will have the same, and the DG1 die will be connected to the CPU via a PCIe 4.0 x4 link using standard substrate traces, but it will also include an HBM2 stack connected to the DG1 die via EMIB (Kaby Lake-G style). Mind you, Intel has no problem yielding massive parts on 14nm at this point, but if they make them too big, nobody is going to be willing to pay for them and Intel will take a beating on margin, and/or they will continue to run into 14nm capacity issues.

Rocket Lake will also have H and S series parts, and it looks like 6+1 and 10+1 monolithic 14nm dies are planned. This will yield SKUs with 2, 4, 6, 8, and 10 cores with GT0 (no integrated graphics), GT0.5 (16 EUs), or GT1 (32 EUs) UHD Graphics. I'm guessing the "DG2" discrete GPU which is Gen12 HP with 128, 256, or 512 EUs is an option for pairing with an H die to create a Kaby Lake-G style MCM.

Yes but the scores on this site are really good so early in development, it beats the best i7-8565U devices on userbenchmark. The ES samples from Icelake 1 year before launch were much slower clocked, look at this from 11 months ago: https://www.userbenchmark.com/UserRun/10940550

10nm is clearly in much better shape now. I still wonder if Tigerlake is made on 10nm++ or really on 10nm+ as Intel claimed.
I think Intel is attempting to take a mulligan for their Cannon Lake / Palm Cove / Gen10 misadventure with 10nm and start over again. That makes Ice Lake 10nm and Tiger Lake 10nm+. They also attempted to shift journos to the less specific "14nm class" and "10nm class" terminology, because simply adding a plus every time they tweaked a process was clearly not forward thinking enough. But it would be interesting to know how they refer to the process internally, i.e. whether TGL is 1274.7, 1274.12, or some other version entirely.
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
Intel has been working on getting to 10nm for long enough that I think their architecture can be designed toward the process to a better degree than you might typically get. If the new design is built to be far more efficient and run at lower clocks, I suspect some of that is a result of the new process necessitating such a design at this point. If the architecture is something that can scale well with frequency increases as the process matures (10nm+/10nm++), or if their future 7nm node isn't as fraught with problems, then all the better for Intel.

This just shows that they've still got a capable design team. They still might have some issues with having to compete with large monolithic dies, but Intel has plenty of margins to eat into while figuring out how to best transition away from that.
 

Hitman928

Diamond Member
Apr 15, 2012
5,183
7,634
136
I don't work at that wretched clown show any more so the alphabet soup doesn't mean anything to me now, but I have a pretty good guess what "cache redesign" means and it is not performance related LOL.

But yeah afaik the upcoming coves are all practically the same core IP so whatever improvement will come from the memory/IO subsystem as opposed to the core itself. Peeling off codenames does not mean any work was actually done.

Density or power improvements for the cache redesign? I mean, is there another reason to go through the trouble?