Discussion Intel current and future Lakes & Rapids thread


eek2121

Platinum Member
Aug 2, 2005
2,930
4,025
136
I posted this a while ago in this very thread, maybe it's time to look at the numbers again.

If we assume GC = 1.5x Skylake IPC, Gracemont = 1x Skylake IPC, and SMT yields 20%, let's compare throughput potential for 8+8 big.little, 10 big and 12 big:
Code:
8 big + 8 small (1x area)
8 x 1.5 x 1.2 = 14.4
8 x 1 = 8
Throughput @ 24T = 22.4
Throughput @ 16T = 20
Throughput @ 12T = 16

10 big (1x area)
10 x 1.5 x 1.2 = 18
Throughput @ 24T ~ 18
Throughput @ 16T = 16.8
Throughput @ 12T = 15.6

12 big (1.2X area)
12 x 1.5 x 1.2 = 21.6
Throughput @ 24T = 21.6
Throughput @ 16T = 19.2
Throughput @ 12T = 18

Based on the numbers above, these were my conclusions, with some highlights added this time:
  • 12T workloads would work just as well on 10 big as on 8+8
  • 8+8 will likely use only the big cores in gaming, pure 8 big core chips will be smaller and just as fast
  • 12 big can match 8+8 in throughput, incidentally this may look a lot like Alder Lake vs. Zen 4
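The table above can be reproduced with a few lines of Python. This is only a sketch of the napkin math, not a performance model: the constants are the post's own assumptions (1.5x and 1x Skylake IPC, 20% SMT yield), and the fill order (one thread per big core first, then small cores, then SMT siblings) is an assumption inferred from the 16T and 12T rows. The function name `throughput` is made up for illustration.

```python
# Napkin-math model: throughput in "Skylake IPC" units, clocks ignored.
BIG = 1.5        # assumed Golden Cove core running one thread
SMALL = 1.0      # assumed Gracemont core
SMT_GAIN = 0.20  # assumed uplift from the second thread on a big core


def throughput(big_cores, small_cores, threads):
    """Greedy fill: big cores first, then small cores, then SMT siblings."""
    slots = (
        [BIG] * big_cores                 # first thread on each big core
        + [SMALL] * small_cores           # one thread per small core
        + [BIG * SMT_GAIN] * big_cores    # second thread on each big core
    )
    # Threads beyond the number of hardware threads add nothing.
    return round(sum(slots[:threads]), 2)


if __name__ == "__main__":
    for name, big, small in [("8+8", 8, 8), ("10 big", 10, 0), ("12 big", 12, 0)]:
        row = ", ".join(f"{t}T={throughput(big, small, t)}" for t in (24, 16, 12))
        print(f"{name}: {row}")
```

Changing `SMALL` to something below 1.0 is a quick way to see how sensitive the 8+8 result is to the small cores clocking lower than the big ones.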
On the topic of power savings and doing more within 125W:
  • Intel is currently pushing 150-200W through MCE enabled 14nm CPUs, why do we suddenly care about stringently adhering to a 125W TDP?
  • we currently don't know how small cores scale past 3GHz, both in terms of fmax and power. If they can't efficiently clock past 4GHz for example, that takes a lot of pressure off the pure big core chip.

Regarding TDP, we care because Intel is running in place while AMD is moving forward. The laws of physics dictate that IPC increases on 14nm will come at the cost of higher thermals as well as power consumption. Right now my PC with a Ryzen 3900X is sitting on my desk with a game open. Not a single fan in the system is running. I have a 280mm AIO, but the fans will only turn on at 50C. The CPU is averaging 44C and the GPU is in power saver mode (2D game). The PSU fan only turns on when the PSU gets warm, but it’s rated for 1200W. The fan never turns on.

I have been looking forward to Intel’s 10nm for a long time. Unlike many people here and elsewhere, I believe that they have the capability to deliver intriguing new products, but they are currently being murdered from within. I have a wide variety of workloads so I have to buy the best CPU for those workloads. At the moment that CPU is an AMD one. The fact my machine is silent is just a bonus.
 

coercitiv

Diamond Member
Jan 24, 2014
6,186
11,852
136
Regarding TDP, we care because Intel is running in place while AMD is moving forward.
I know why we care in the larger sense; I was one of the few people who consistently criticized Intel for what they've done starting with the 8th gen platform on 14nm. My question was narrowed to the current situation Intel faces: on the desktop they're focused on maintaining a performance lead on some workload types, while also trading power for MT performance to limit the margin of loss in other workloads. This is a bad strategy in mobile but can work in high performance desktops as long as performance is there. Hence, I really doubt Intel will limit Alder Lake S to 125W TDP on stock settings when they'll need every 5% of extra performance to claim a tie in performance or (maybe) even leadership in some workloads.

On desktops you fix performance first, power second - especially if thermals can be kept under control. (and they showed they can with the 10th gen)

Look at it the other way around - if the Golden Cove cores don't deliver an excellent power/performance curve, there's little Gracemont cores can do to save Alder Lake, especially if little cores can't clock high. Hence, what some people here are suggesting is Intel would have a better shot at the performance crown with 12 big cores while continuing to ignore OEMs messing up stock power settings.

The only reason I can think of the 8+8 arrangement making sense is 12 cores may not work on a ringbus, meaning they would need a 6+6 config (or mesh, or other interconnect), meaning they would need to catch up to what AMD has been doing since 2017 and will likely tune with Zen 3 & 4 to a point where latency sensitive workloads will not be a second-class citizen on their platform anymore. It seems to me this is more a problem of planning and design than a problem of node performance, power and hybrid efficiency in the desktop consumer space.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
i7-1165G7 goes up to 4.7 GHz. How high can the 1185 go?

I don't think the 1185G7 is relevant here. It's something I believe is akin to 3617U/4600U/5600U, etc. It's an absolutely cream of the crop part that costs an extra $100 on top of what the already costly 3517U/4500U/5500U is at. That's putting aside the fact that a mere fraction of systems offer them. I've tried. Maybe about a quarter of the systems offer it with the x6xx rather than x5xx.

All that for a max 100MHz gain which is easily dwarfed by suboptimal thermal and power management settings.

@coercitiv Your throughput numbers may be optimistic because they assume Gracemont can clock the same as Golden Cove. Considering Core cores clock nearly 2x the Atom cores at this point, it's a lot to catch up.

One can hope for Golden Cove to cut a few pipeline stages and reduce frequency to less insane levels (<4.8GHz). The higher resulting PPC would cancel out the reduced clocks.

Gracemont might clock significantly higher as well. Grand Ridge, which succeeds Snow Ridge for base stations, clocks at 2.6GHz, which is nearly 20% higher.
 

coercitiv

Diamond Member
Jan 24, 2014
6,186
11,852
136
@coercitiv Your throughput numbers may be optimistic because they assume Gracemont can clock the same as Golden Cove. Considering Core cores clock nearly 2x the Atom cores at this point, it's a lot to catch up.
I know, I mentioned this after the estimates. However, considering I was leaning towards making a case for the big core layout, assuming "small cores can jump high" helps sanitize otherwise clunky napkin math.

There are other big question marks too, like SMT support on the big cores and how that would interact with the OS scheduler.

One can hope for Golden Cove to cut a few pipeline stages and reduce frequency to less insane levels (<4.8GHz). The higher resulting PPC would cancel out the reduced clocks.
Again, while this may hopefully be true, it only adds to the case of the pure pedigree chips on the desktop side.

The only thing that stops me from outright criticizing hybrid desktop chips is small cores finally getting a chance to play in the big leagues. I'm fully convinced Intel would have some excellent small core chips by now had they resisted the temptation to sideline small core development in favor of big core margins. If this is what it takes to get a decent low power architecture from Intel, then so be it: go desktop hybrids!
 

DrMrLordX

Lifer
Apr 27, 2000
21,617
10,826
136
Even if you assume Golden Cove being 50% faster per core, Gracemont should be faster with twice the amount of cores.

1). It doesn't really matter. Intel is currently pushing 10c CPUs that lose in total throughput to AMD chips from last year. Going from 10c Comet Lake (let's ignore Rocket for a moment here) to 12c Alder Lake (pure Golden Cove) would be a significant boost to throughput and ST performance, which would be a win-win for Intel. It would also be more fierce a competitor to AMD's 8c and 12c products than 8c Golden Cove + 8c Gracemont.

2). See below

  • 12 big can match 8+8 in throughput, incidentally this may look a lot like Alder Lake vs. Zen 4

I was basically thinking the same thing, though you put it better than I. Also if you remove Gracemont from the mix, now you can enable AVX512 assuming Golden Cove supports it.
 

jpiniero

Lifer
Oct 1, 2010
14,584
5,206
136
The only reason I can think of the 8+8 arrangement making sense is 12 cores may not work on a ringbus, meaning they would need a 6+6 config (or mesh, or other interconnect), meaning they would need to catch up to what AMD has been doing since 2017 and will likely tune with Zen 3 & 4 to a point where latency sensitive workloads will not be a second-class citizen on their platform anymore. It seems to me this is more a problem of planning and design than a problem of node performance, power and hybrid efficiency in the desktop consumer space.

It could easily be explained by Alder Lake only supporting one CPU chiplet, and they are obviously not going to do a 12+0 if they won't even do 8 big on mobile.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I was basically thinking the same thing, though you put it better than I. Also if you remove Gracemont from the mix, now you can enable AVX512 assuming Golden Cove supports it.

AVX512 seems like a trivial matter when client chips won't add another unit for it. Sure, AVX-512 adds scenarios where things can be vectorized when previous AVX versions couldn't. But look at how long an instruction set takes to get adopted, and see how fragmented the AVX-512 landscape is.

The only client core that supports AVX-512 in any fashion is Icelake, a core that not only is available only in laptops but in mainstream to premium laptops and in limited quantities at that! Both AVX and AVX2 were introduced to products that would be used in desktops, laptops, and servers!

You can see in some scenarios, when the ISA is equalized as with Lakefield, that Tremont is on the level of Skylake in terms of performance per clock. With Gracemont we'll get whatever architectural gains it has plus the AVX2 support.

I'm fully convinced Intel would have some excellent small core chips by now had they resisted the temptation to sideline small core development in favor of big core margins.

Most of the gains big cores had above small cores back in the Skylake era were cancelled out by innumerable delays to 10nm and the decision not to update the Skylake core by even one iota, while Atoms went from Airmont to Goldmont, then Goldmont Plus, all in the same process generation and not even using the +!

An ideal hybrid chip is something that exaggerates the advantages. Rather than

8x Big, 1.5x SKL + 8x small, 1x SKL, 1x die

It should be
6x Big, 1.5x SKL + 16x small, 1x SKL, 1x die

But Ideally,
4x Big+, 2x SKL + 24x small, 1x SKL, 1.2x die

The Big+ cores would be literally 2x as large and power hungry as the Big cores. This requires developing a third core, as such a big core wouldn't work well in servers, which are throughput oriented.
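To put rough numbers on the three layouts, here is a minimal sketch that only sums the per-core figures given above. It assumes every core runs one thread at its quoted relative performance, ignoring SMT, clocks and power, so the totals are an upper bound on napkin math and nothing more. The names `CONFIGS`, `peak_throughput` and `summarize` are made up for illustration.

```python
# (label, big count, big perf, small count, small perf, relative die area)
# Performance is in "x SKL" units, taken directly from the post.
CONFIGS = [
    ("8 Big + 8 small", 8, 1.5, 8, 1.0, 1.0),
    ("6 Big + 16 small", 6, 1.5, 16, 1.0, 1.0),
    ("4 Big+ + 24 small", 4, 2.0, 24, 1.0, 1.2),
]


def peak_throughput(big_count, big_perf, small_count, small_perf):
    """All cores busy, one thread each (assumption: no SMT, equal clocks)."""
    return big_count * big_perf + small_count * small_perf


def summarize(configs=CONFIGS):
    """Return (label, single-thread peak, total, total per unit of die area)."""
    rows = []
    for label, nb, pb, ns, ps, area in configs:
        total = peak_throughput(nb, pb, ns, ps)
        rows.append((label, pb, total, round(total / area, 2)))
    return rows


if __name__ == "__main__":
    for label, st, total, per_area in summarize():
        print(f"{label}: ST={st}x SKL, MT={total}x SKL, MT/area={per_area}")
```

Under these assumptions the third layout is the only one that raises both the single-thread peak and the all-core total, which is the "exaggerated advantages" idea in numbers.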
 

NTMBK

Lifer
Nov 14, 2011
10,232
5,012
136
Yeah, Intel totally screwed up AVX adoption by limiting it to i3 and up. You can't rely on its presence, meaning you can't just flip the "compile with AVX2" flag on the compiler. You need to compile two different versions of the DLL, write manual DLL loading code, correctly detect AVX2 support, and load the correct version of the DLL.

The Intel compiler has a mode that will automate that for you (compiling multiple versions of the code, and selecting the right one at runtime) but you then need to pay for the Intel compiler. And of course it doesn't work with AMD processors.

It's a big headache that devs just don't need. Easier to just compile for SSE4 and not bother with the whole mess.
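For anyone who hasn't had to do this, the detect-then-load pattern looks roughly like the sketch below. It is written in Python for brevity and is Linux-only, since it reads the `flags` line of `/proc/cpuinfo` rather than calling CPUID; a Windows DLL loader would do the equivalent in native code. The library file names and the function names `cpu_flags` and `pick_build` are hypothetical.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no flags line found: assume nothing is supported


def pick_build(flags):
    """Choose which of two prebuilt libraries to load (hypothetical names)."""
    return "kernels_avx2.so" if "avx2" in flags else "kernels_sse4.so"


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # Fall back to the baseline build when the file is missing (non-Linux).
    text = path.read_text() if path.exists() else ""
    print(pick_build(cpu_flags(text)))
```

The detection is the easy part; the cost is that both builds have to be compiled, shipped and tested, which is the headache being described.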
 

repoman27

Senior member
Dec 17, 2018
342
488
136
Chart says both are 3.0.
That's referring to the I/O die capabilities. Thunderbolt 4 and PCIe 4.0 x4 are integrated into the CPU die, not the I/O die.

Intel already started moving away from U/Y terminology with Ice Lake, instead referring to them as U package Type 3 and U package Type 4, hence UP3 and UP4. Y was a 4.5 W TDP platform targeting tablets and fanless designs. ICL-Y was supposed to come in slightly higher at 5.2 W with the move to 4 cores and integrated Thunderbolt 3. Instead, the only fully enabled ICL-Y SKUs to actually ship are 10 W. That's not really a Y part anymore, and Intel is probably trying to obfuscate the fact that they essentially missed their low power target by 100% and consequently have no suitable replacement for designs based on their traditional Y platform.

So UP3 is the traditional U available in 15 and 28 W TDPs with a range of cTDP values. UP4 looks like Y physically and comes with similarly cut down I/O, but is 9 W TDP with cTDP values that now overlap with regular U parts.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Yeah, Intel totally screwed up AVX adoption by limiting it to i3 and up. You can't rely on its presence, meaning you can't just flip the "compile with AVX2" flag on the compiler. You need to compile two different versions of the DLL, write manual DLL loading code, correctly detect AVX2 support, and load the correct version of the DLL.

Maybe in the Gracemont generation we'll see Celeron/Pentium having AVX2. Unless they want to have it enabled for only Alderlake, which is another level of silliness.

On the surface there's nothing wrong with excess segmentation. But it invites competitors to attack it as it's a weak point. So it's another form of sacrificing long term gains for short term ones.
 

jpiniero

Lifer
Oct 1, 2010
14,584
5,206
136
Yeah, Intel totally screwed up AVX adoption by limiting it to i3 and up. You can't rely on its presence, meaning you can't just flip the "compile with AVX2" flag on the compiler. You need to compile two different versions of the DLL, write manual DLL loading code, correctly detect AVX2 support, and load the correct version of the DLL.

Core Celeron and Pentium only exist to dump bottom tier quality dies on. Disabling AVX only helps with getting every last die still functional to qualify. I could see Intel keeping AVX disabled on Alder Lake Celerons and Pentiums for the same reason.
 

NTMBK

Lifer
Nov 14, 2011
10,232
5,012
136
Core Celeron and Pentium only exist to dump bottom tier quality dies on. Disabling AVX only helps with getting every last die still functional to qualify. I could see Intel keeping AVX disabled on Alder Lake Celerons and Pentiums for the same reason.

I don't think that there are many dies with a fault that is precise enough to take out AVX, while leaving the rest of the FPU functional. I'm sure it happens... But not in the volumes that Intel sells Pentiums and Celerons.
 

NTMBK

Lifer
Nov 14, 2011
10,232
5,012
136
Maybe in the Gracemont generation we'll see Celeron/Pentium having AVX2. Unless they want to have it enabled for only Alderlake, which is another level of silliness.

On the surface there's nothing wrong with excess segmentation. But it invites competitors to attack it as it's a weak point. So it's another form of sacrificing long term gains for short term ones.

Even if they had no viable competitors, it's still incredibly dumb. You deter developers from making use of the features, which makes your new chips look much less impressive. Why bother upgrading from that old Sandy Bridge, if none of the games you play make use of AVX/AVX2? It could have been a good way for Intel to make their new chips clearly superior, but their marketing department screwed it up.
 

jpiniero

Lifer
Oct 1, 2010
14,584
5,206
136
I don't think that there are many dies with a fault that is precise enough to take out AVX, while leaving the rest of the FPU functional. I'm sure it happens... But not in the volumes that Intel sells Pentiums and Celerons.

It's not a defect in the AVX unit, it's the additional power that AVX draws. On an extremely leaky die, it could make a difference in hitting the target TDP. It also acts to discourage buying it over the i3. You could say that it would only encourage people to buy AMD instead but as long as Intel takes out the trash, they might be OK with that.

It's the kind of thing where if Intel weren't getting these kinds of dies, Core Celeron and Pentium wouldn't exist.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Why bother upgrading from that old Sandy Bridge, if none of the games you play make use of AVX/AVX2? It could have been a good way for Intel to make their new chips clearly superior, but their marketing department screwed it up.

Sigh. I know.

Revenue-wise they'd been stagnant for years until 2009 with Nehalem, when the Core ix branding was used. It's difficult to understand for a technically apt person (most of us here), but in the mind of a person who isn't, it's very different. You can overhear people talking about it, and for them a Core i7 is a total beast compared to an i5 or an i3. The change to the marketing and brand proved to be extremely effective and they went on to increase revenue by 20% that year.

Clearly that's influenced by Paul Otellini's leadership. While a single person's impact, even a CEO's, isn't immediately felt, the momentum builds up over time, and sometimes they make critical decisions that have long-lasting impacts. Although he could be credited with getting them out of the slump with Core 2, Otellini is also the reason why they didn't get into iPhones and iPads, and thus lost any chance at the mobile market.

Apparently the reason is he didn't think it would attain the volume necessary to justify the price per SoC Apple wanted. Did he not see the vision because of a line of thought instilled by MBA and finance training? I could even say their loss in process leadership came as a result of his ill-fated decision. Up until then, no company in the world came even close to their manufacturing prowess, and I mean by volume. The connection between leadership and volume is loose, but it has to do with top people gravitating towards the leader.

Some say that as companies get larger they get more bureaucratic and become slow and bloated, similar to governments.

It's the kind of thing where if Intel weren't getting these kinds of dies, Core Celeron and Pentium wouldn't exist.

I don't think that factors into the decision as much as you think it does. Some small fraction may not have working AVX units but Intel was quite famous for getting high yields. You have to be, if you are selling 200 million plus devices a year.

The segmentation is to create further space between each lineup. It doesn't help that they sell 20+ different SKUs. It's the same reason why years ago they said they'll keep a certain gap between Atom and Core. The gap between the two is far, far smaller now, since they have no choice with ARM catching up.
 

jpiniero

Lifer
Oct 1, 2010
14,584
5,206
136
I don't think that factors into the decision as much as you think it does. Some small fraction may not have working AVX units but Intel was quite famous for getting high yields. You have to be, if you are selling 200 million plus devices a year.

We're only talking about a couple percent of Core sales at best. At the volumes that Intel does, it's worth it to not just throw it in the trash can.
 

DrMrLordX

Lifer
Apr 27, 2000
21,617
10,826
136
The only client core that supports AVX-512 in any fashion is Icelake

Cannonlake, IceLake, and TigerLake support it. Cannonlake was largely irrelevant; nevertheless, Intel has made an effort to support AVX512 on client cores at the design level. Sadly, they're struggling to produce and sell any of those designs. Golden Cove represents the fourth generation of Intel cores to support AVX512 across their entire product lineup.