Question Raptor Lake - Official Thread

Hulk

Diamond Member
Oct 9, 1999
4,187
1,970
136
Since we already have the first Raptor Lake leak I'm thinking it should have its own thread.
What do we know so far?
From Anandtech's Intel Process Roadmap articles from July:

Built on Intel 7 with upgraded FinFET
10-15% better PPW (performance per watt)
Last non-tiled consumer CPU as Meteor Lake will be tiled

I'm guessing this will be a minor update to ADL with just a few microarchitecture changes to the cores. The larger change will be the new process refinement allowing 8+16 at the top of the stack.

Will it work with current Z690 motherboards? If yes, then that could be a major selling point for people to move to ADL rather than wait.
 

CakeMonster

Golden Member
Nov 22, 2012
1,382
475
136
Pretty much guaranteed that it will work with current motherboards.

Not sure how much of a selling point that is though, unless you are for some reason convinced that it's rational to get a low-end AL and then a high-end RL, 'justifying' the upgrade. Switching CPUs from one year to the next with only incremental hardware changes never made sense to me. I can't remember the last time I didn't upgrade MB+CPU+RAM simultaneously.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Should just be a minor refresh. Higher peak clocks and more cores. Would be nice if the mainstream die were to get 8 GRT cores as well.
 

eek2121

Platinum Member
Aug 2, 2005
2,883
3,860
136
Should just be a minor refresh. Higher peak clocks and more cores. Would be nice if the mainstream die were to get 8 GRT cores as well.

I doubt we will see higher clocks. Higher IPC for the “P” cores, double the “E” cores. I’m anticipating ~10-15% single core and ~30-40% multi core uplift (depending on the workload) personally.
 

Khato

Golden Member
Jul 15, 2001
1,199
232
106
I believe the other three changes of note for RPL are increased L2 cache (may be desktop only), better-tuned DDR5 memory controller, and DLVR (may only be enabled on mobile due to lack of support on socket 1700.)
 

Hulk

Diamond Member
Oct 9, 1999
4,187
1,970
136
I doubt we will see higher clocks. Higher IPC for the “P” cores, double the “E” cores. I’m anticipating ~10-15% single core and ~30-40% multi core uplift (depending on the workload) personally.

I agree about the higher clocks not being practically possible. Maybe efficiency could increase at current clocks. Or put more succinctly the linear part of the V vs. F graph could be extended a bit so as clocks approach 5GHz power/thermals don't go nuts.

As for IPC I would love to see 10-15% but we're already 6 wide on decode and 12 on execution. I doubt they are going to make major architectural changes so I see breaking 10% a pretty high bar. Generally fine tuning the memory subsystem can get 5 or 6% so I can see that happening. But who knows? Maybe they're getting aggressive and I'm thinking the lazy Intel we've had for the past 10 years.

MT increase will be easy to come by if they can squeeze 2 more Gracemont clusters in there. But then again it's going to start really stressing memory bandwidth as the core count goes up.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I believe the other three changes of note for RPL are increased L2 cache (may be desktop only), better-tuned DDR5 memory controller, and DLVR (may only be enabled on mobile due to lack of support on socket 1700.)

I expect DLVR for mobile as well, just like with the leak. Lots of the fine-grained power management advances are lost when put in a desktop chip clocking in the 4+GHz levels. Oftentimes they turn out to be a hindrance.

There's the site that did comprehensive analysis of the Golden Cove core and it concluded while it is a strong core, there are weaknesses that can be easily fixed. So the gain won't be big but maybe it'll "fix" those and get 5-10% out of it.

I don't expect much more than that as Golden Cove is already about 10% faster than Zen 3 so another 10% or so will get it more or less on par even with Zen 4's rumored 20-25% gain. In MT the extra clusters will help it perform better and with better perf/watt.

I see Raptor Lake not as a refresh but as something that fully fleshes out Alder Lake.

Desktop only different L2 cache doesn't make sense because L2 caches are private and require core layout changes. Core 2 was the last generation where L2 caches weren't private. Since they moved to a 3-tier cache system, L2 caches are private per core.

Switching CPUs from one year to the next with only incremental hardware changes never made sense to me. I can't remember the last time I didn't upgrade MB+CPU+RAM simultaneously.

For most people even Pentium D to Core 2 Duo shouldn't make sense. Unless they are socket compatible.

From Sandy Bridge I'd upgrade to 12600K, because the rest are much more expensive. The chips in between aren't worth it since they are all Skylake. I guess if you need it early 3900X would work.

If you have Alder Lake I'd wait until Nova Lake, not even Lunar Lake. You better have the systems paying you money for the extra performance, because otherwise it's all about wanting, not needing.
 

Khato

Golden Member
Jul 15, 2001
1,199
232
106
I expect DLVR for mobile as well just like with the leak. Lots of the fine-grained power management advances are lost when put in a desktop chip clocking in the 4+GHz levels. Oftentimes they turn to be a hindrance.
Whereas my guess is that we'll see DLVR on MTL desktop and beyond. DLVR seems to be a refinement of the FIVR approach which retains the ability to maintain voltage on quick transients while avoiding the many problems inherent to FIVR. It still might not be sized such that it can be effective in extreme overclocking, but it should be trivial to bypass for those cases. For everything else I don't see a downside to effectively moving LLC on-die.

Desktop only different L2 cache doesn't make sense because L2 caches are private and require core layout changes. Core 2 was the last generation where L2 caches weren't private. Since they moved to a 3-tier cache system, L2 caches are private per core.
Well, Intel clearly saw adequate benefit to increase L2 cache for the SPR variant of golden cove. No question that it's additional layout effort, but maybe it's worth that effort?
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
I don't expect much more than that as Golden Cove is already about 10% faster than Zen 3 so another 10% or so will get it more or less on par even with Zen 4's rumored 20-25% gain. In MT the extra clusters will help it perform better and with better perf/watt.

Isn't it more than 10%? Willow Cove was said to have the same IPC as Zen 3, and Golden Cove is definitely more than 10% faster than Willow Cove.

I can't recall any hardware reviewers doing in depth tests for IPC, other than using Cinebench at identical clock speeds. IPC seems to be really difficult to measure properly, because it varies so much across workloads.

One workload where Alder Lake really surprised me with its performance was encoding; specifically AV1, the next standard. From what I can gather, encoding is pure integer, am I correct?

Here the 12600K is nearly 50% faster than the 5600x, and the 12900K manages to edge out 5950x. This to me solidified how potent the Golden Cove core is, because this is a hard workload for any CPU.

This graph was using Windows 10, as it had much higher FPS than the Windows 11 one for both AMD and Intel.

G9bvMVM3eaDqdy9vYAdR4j-2560-80.png


Golden Cove is also very potent in code compilation, which from what I gather is a great test of IPC as well:

embed.php
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
I doubt we will see higher clocks. Higher IPC for the “P” cores, double the “E” cores. I’m anticipating ~10-15% single core and ~30-40% multi core uplift (depending on the workload) personally.

I think that single core prediction is extremely optimistic, assuming little change to frequency. I'm expecting ≤5% IPC, and a similar increase in peak clocks. Maybe 5.4-5.5GHz TVB on the P core.

Remember that this is a stopgap generation that only exists because of Meteor Lake delays. Basically the Comet/Coffee Lake of this gen, but with slightly more effort put in. Expecting anything close to a proper gen/gen scaling is, I believe, unrealistic.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,698
3,547
136
Isn't it more than 10%? Willow Cove was said to have the same IPC as Zen 3, and Golden Cove is definitely more than 10% faster than Willow Cove.
Sunny, Willow and Cypress cove are, barring cache and process differences, the same core design, and are slightly slower than Zen 3 cores in Desktop (Vermeer).
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Sunny, Willow and Cypress cove are, barring cache and process differences, the same core design, and are slightly slower than Zen 3 cores in Desktop (Vermeer).

Sunny and Willow Cove are the same mobile oriented core design with different cache configurations, but Cypress Cove is a scaled back version of Sunny Cove due to being backported to 14nm so it never achieved its full performance range.

When you compare Tiger Lake H models against their Zen 3 mobile counterparts, they trade blows and Zen 3 isn't decisively faster. If Intel had been able to make a desktop version of the full Willow Cove core, it would have competed well against Vermeer but obviously lost out on performance per watt.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,698
3,547
136
Sunny and Willow Cove are the same mobile oriented core design with different cache configurations, but Cypress Cove is a scaled back version of Sunny Cove due to being backported to 14nm so it never achieved its full performance range.
There is nothing scaled-back in Cypress Cove, in terms of the core structures, compared to Sunny Cove. Only the physical layout, routing, wire lengths, etc. are different because it's fabbed on 14nm. It does achieve +19% IPC over Comet Lake, same as what Sunny Cove achieves over Whiskey Lake, in benchmarks like SPEC and Geekbench. The reason why it doesn't do as well in other areas, like games, is because of the increased latency that was a negative consequence of the backport and the memory controller not capable of running >3733 MT/s memory in gear 1 mode.
When you compare Tiger Lake H models against their Zen 3 mobile counterparts, they trade blows and Zen 3 isn't decisively faster.
Zen 3 on mobile has halved L3 cache and its memory controller likes to downclock in CPU-only workloads, causing reduced performance compared to Vermeer.
 

Hulk

Diamond Member
Oct 9, 1999
4,187
1,970
136
There's the site that did comprehensive analysis of the Golden Cove core and it concluded while it is a strong core, there are weaknesses that can be easily fixed. So the gain won't be big but maybe it'll "fix" those and get 5-10% out of it.

We have gone from 1 decode with the 486 and earlier, 2 decode "superscalar" with the Pentium, 3 decode with Core, 4 wide with Haswell, 5 with Skylake, and now 6 with Golden Cove. The back end has more or less kept up to keep things moving. As I understand it, the out-of-order scheduler and other supporting structures allow instructions that would normally be executed one at a time to be executed in parallel, to the extent that instruction-level parallelism exists in the code and the "smarts" built into the processor can extract it. I read the article you mentioned above, and one area where Zen 3 is ahead of Golden Cove is branch prediction.

What I'm wondering is the following. I'm directing this at IntelUser2000, but of course I'd like to hear from anyone with the experience/knowledge to take a crack at these!

1. Am I correct in writing that the original reason for HT or SMT is to utilize all of the unused resources in the CPU? With GC able, in the best case, to execute 6 instructions at a time, unless the code has a lot of parallelism there will be plenty of resources left to run more than one thread.

2. If the same application is tested on a 4-wide Haswell, 5-wide Skylake, and 6-wide Golden Cove and that software is highly multithreaded, like Cinebench R23, would you expect the additional performance due to HT to increase with the width of the CPU?

3. Do we know on average how many instructions per cycle a CPU is executing? Obviously the bounds for Golden Cove are 1 and 6, but when running Cinebench R23 ST on average how many instructions do you think are being executed per cycle? 3? 4? 4.5?

4. Does Amdahl's law apply to the width of a CPU in the same way it applies to multicore CPUs? Are we reaching a point of very small payback as we increase the width beyond 6 decode?

5. Gracemont is as wide as Golden Cove on the front end and wider on the back end as compared to Golden Cove yet has much lower throughput. Is this primarily because the Out-of-Order intelligence isn't as sophisticated as GC? If not then why?

6. I realize I have asked you this before but it's still bugging me. Seeing that Gracemont is so wide and its primary purpose is MT compute, wouldn't the addition of HT provide a large benefit for the amount of additional structures required to implement HT?

7. I'm having a hard time wrapping my head around why Gracemont is so wide? 6 x 17, seems crazy?
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
There is nothing scaled-back in Cypress Cove, in terms of the core structures, compared to Sunny Cove. Only the physical layout, routing, wire lengths, etc. are different because it's fabbed on 14nm. It does achieve +19% IPC over Comet Lake, same as what Sunny Cove achieves over Whiskey Lake, in benchmarks like SPEC and Geekbench. The reason why it doesn't do as well in other areas, like games, is because of the increased latency that was a negative consequence of the backport and the memory controller not capable of running >3733 MT/s memory in gear 1 mode.

OK, I agree that you were right that no core structures were scaled back for Cypress Cove, but from your statement it seems you also agree that Cypress Cove took a performance hit from the backport (due to increased latency in the design), which is really what I was getting at. I assumed Cypress Cove was actually a slightly scaled-down version of Sunny Cove to explain its mediocre performance.

This is illustrated in this chart, where Tiger Lake H had edged out the 11900K in SpecINT despite running at a much lower wattage and with mobile platform TDP. If Intel had released a desktop version of Tiger Lake, it would have likely outperformed Vermeer in single and lightly threaded workloads.

117493.png


Zen 3 on mobile has halved L3 cache and its memory controller likes to downclock in CPU-only workloads, causing reduced performance compared to Vermeer.

See the above chart. The 11980HK isn't that far behind the 5950x, despite being limited on a mobile platform.

That said, Golden Cove is in another league. I don't think Spec is a good indication of real-world CPU performance, judging by the comparatively minor differences between CPUs versus the absolutely massive ones in heavy workloads like the compilation and encoding benchmarks I posted above. The 12600K just destroys the 5600x and 11600K in both of those benchmarks by large margins.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,698
3,547
136
This is illustrated in this chart, where Tiger Lake H had edged out the 11900K in SpecINT despite running at a much lower wattage and with mobile platform TDP. If Intel had released a desktop version of Tiger Lake, it would have likely outperformed Vermeer in single and lightly threaded workloads.
It wouldn't have. The 11980HK has max ST turbo of 5.0 GHz. If it was on a desktop with RKL-like ST clock speeds - i.e. 5.3 GHz, and if performance scaled linearly, then 6.91*5.3/5 = 7.32 in SPECint_2017 ST while the 5950X in that chart achieves 7.65 which is still ~4% ahead of a theoretical TGL pushed to desktop like specs.
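The arithmetic above can be checked with a short sketch. It is only an illustration using the figures quoted in this post (6.91 at 5.0 GHz, 7.65 for the 5950X); the function name `scale_score` is made up for the example, and linear scaling with clock speed is an optimistic assumption, since real scores usually scale less than linearly.

```python
# Linear clock-scaling estimate using the SPECint_2017 ST figures quoted in the post.
# Assumption: performance scales 1:1 with frequency, which is a best case.

def scale_score(score, base_ghz, target_ghz):
    """Scale a benchmark score linearly from base_ghz to target_ghz."""
    return score * target_ghz / base_ghz

tgl_score = 6.91       # 11980HK score from the chart, as cited in the post
tgl_clock_ghz = 5.0    # 11980HK max ST turbo
rkl_clock_ghz = 5.3    # RKL-like desktop ST clock
vermeer_score = 7.65   # 5950X score from the same chart

projected = scale_score(tgl_score, tgl_clock_ghz, rkl_clock_ghz)
lead_pct = (vermeer_score / projected - 1) * 100

print(f"Projected TGL score at {rkl_clock_ghz} GHz: {projected:.2f}")
print(f"5950X lead over projected TGL: {lead_pct:.1f}%")
```

Under these assumptions the projection lands at about 7.32, and the 5950X keeps a lead of a little over 4%.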
I don't think Spec is a good indication of real-world CPU performance, judging by the comparatively minor differences between CPUs versus the absolutely massive ones in heavy workloads like the compilation and encoding benchmarks I posted above.
You're comparing the differences in a ST benchmark with a multi-threaded benchmark like compiling. Golden Cove is impressive, mostly because the philosophy behind it seems to be "make everything bigger", though there are still things that Zen 3 does better, like branch prediction for example.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
It wouldn't have. The 11980HK has max ST turbo of 5.0 GHz. If it was on a desktop with RKL-like ST clock speeds - i.e. 5.3 GHz, and if performance scaled linearly, then 6.91*5.3/5 = 7.32 in SPECint_2017 ST while the 5950X in that chart achieves 7.65 which is still ~4% ahead of a theoretical TGL pushed to desktop like specs.

Well perhaps you're correct about Spec in this particular instance, but I still disagree when it comes to real world workloads. In real world workloads, they would trade blows.

One thing I've come to realize is that IPC is a very nebulous term and rather than being fixed like most people would probably assume, it's actually quite dynamic and very app dependent.

It makes quoting IPC figures practically useless when I think about it, especially when you compare across completely different architectures. I think IPC as a metric would be more useful for similar architectures, i.e. Golden Cove and Raptor Cove.

You're comparing the differences in a ST benchmark with a multi-threaded benchmark like compiling. Golden Cove is impressive, mostly because the philosophy behind it seems to be "make everything bigger", though there are still things that Zen 3 does better, like branch prediction for example.

If you look at the SPEC_rate benchmarks on the next page, Golden Cove and Vermeer are still very close to each other. Spec to me is just garbage, and I don't see why Anandtech places such high importance on it. Real world workloads almost invariably show much greater differentiation between the two cores.

117496.png
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
OK, I agree that you were right that no core structures were scaled back for Cypress Cove, but from your statement it seems you also agree that Cypress Cove took a performance hit from the backport (due to increased latency in the design), which is really what I was getting at. I assumed Cypress Cove was actually a slightly scaled-down version of Sunny Cove to explain its mediocre performance.

There is no change in latency or other behavior within the core. It's the fabric that's worse, primarily the memory controller latency.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,698
3,547
136
If you look at the SPEC_rate benchmarks on the next page, Golden Cove and Vermeer are still very close to each other. Spec to me is just garbage, and I don't see why Anandtech places such high importance on it. Real world workloads almost invariably show much greater differentiation between the two cores.
You can't use SPEC ST in one instance and SPEC MT (which is just rate-N) in another instance to conclude that SPEC is garbage. SPEC is similar to Geekbench - both of them test real-world workloads.

Golden Cove is around 15% higher IPC than Zen 3 in Geekbench, when you equip both platforms with fast RAM. The reason why the difference is much lower on SPEC in the Anandtech graphs is because they test with bog-standard JEDEC-spec memory.
 

dullard

Elite Member
May 21, 2001
24,964
3,301
126
I believe the other three changes of note for RPL are increased L2 cache (may be desktop only), better-tuned DDR5 memory controller, and DLVR (may only be enabled on mobile due to lack of support on socket 1700.)
I think you nailed it. The key is this old rumor where DLVR is on the mobile section, and not on the desktop section. But in the other generations a new feature was listed in both mobile and desktop. DLVR is supposed to help mostly in low power situations, so while it helps desktop's power usage, it really helps mobile the most.
1638809474286.png

If Meteor Lake does turn out to be mobile only, then Raptor Lake is what Intel has on desktop for ~12 months until Arrow Lake comes around. Compared to Alder Lake:
  1. Higher top speed (5.5 GHz rumored),
  2. Lower PL2 (228 W vs 241 W),
  3. Slightly more efficient due to the combination of the two above,
  4. More E cores,
  5. More cache,
  6. Better memory support (DDR5-5600 vs DDR5-4800).
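To put item 6 in context with the earlier worry about memory bandwidth as core counts grow, here is a rough sketch of peak theoretical bandwidth. The assumptions are mine, not from the rumor: a dual-channel desktop platform with a 128-bit total data bus, 8+8 cores for Alder Lake, and the speculated 8+16 for Raptor Lake. The function name `peak_bandwidth_gbs` is hypothetical, and sustained bandwidth in practice will be lower.

```python
# Peak theoretical DRAM bandwidth: transfers per second x bytes per transfer.
# Assumptions: 128-bit total data bus (dual-channel desktop), rumored DDR5 speeds.

def peak_bandwidth_gbs(mega_transfers_per_s, bus_width_bits=128):
    """Return peak theoretical bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9

adl_bw = peak_bandwidth_gbs(4800)   # DDR5-4800
rpl_bw = peak_bandwidth_gbs(5600)   # DDR5-5600 (rumored)

adl_cores = 8 + 8    # Alder Lake top SKU
rpl_cores = 8 + 16   # speculated Raptor Lake top SKU

print(f"DDR5-4800: {adl_bw:.1f} GB/s total, {adl_bw / adl_cores:.2f} GB/s per core")
print(f"DDR5-5600: {rpl_bw:.1f} GB/s total, {rpl_bw / rpl_cores:.2f} GB/s per core")
```

With these assumptions the total goes up by about 17%, but the share per core goes down because the core count would rise by 50%.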
 

Khato

Golden Member
Jul 15, 2001
1,199
232
106
I think you nailed it. The key is this old rumor where DLVR is on the mobile section, and not on the desktop section. But in the other generations a new feature was listed in both mobile and desktop. DLVR is supposed to help mostly in low power situations, so while it helps desktop's power usage, it really helps mobile the most.
I guess it depends on the definition of 'low power situations'. I believe the idea of DLVR helping most in low power is due to the graphs in the patent which show the most voltage/power reduction in the 0-40A range followed by a linear decrease up to 70A. That 70A figure is representative of absolute worst-case power virus which no real workload is likely to achieve. Heavy compute workloads are probably still going to be somewhere in the linear decrease region, but moderate workloads may well be pretty close to the peak savings region.

We just know it's not going to be enabled on desktop RPL because it would break socket compatibility... Well, maybe... technically Intel could reassign some Vcc pins to the DLVR input voltage and only have the motherboard enable the new voltage regulator if instructed. If the motherboard doesn't have the extra regulator, then everything would just operate in legacy mode as if the DLVR wasn't even there.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Well, Intel clearly saw adequate benefit to increase L2 cache for the SPR variant of golden cove. No question that it's additional layout effort, but maybe it's worth that effort?

You see the additional years it takes them to release the server parts? Well that's just with AVX512 and extra L2 "bolted" on top. You can clearly see from the Skylake-SP shot that it doesn't even change the layout of the 256KB portion - they just add the 1MB on that side.

Also their server division grew enough to be a substantial portion of revenue. Sure their laptop division is large, but desktop is much smaller. And you are talking about enthusiast K market and they'll need to design pretty much just for them. We've argued about whether they need a third core to separate client into two. Perhaps they will if they get their chips so good that they can get back into mobile again.

We have gone from 1 decode with the 486 and earlier, 2 decode "superscalar" with the Pentium, 3 decode with Core, 4 wide with Haswell, 5 with Skylake, and now 6 with Golden Cove.

Intel chips have been 3-wide since the Pentium Pro/II. The first 4-wide Intel chip was the Core 2. Haswell extends some things but didn't change anything big, hence the relatively small improvement.

Skylake claimed 5-issue but I think that's with fusion. Golden Cove slide says they went from 4 to 6, and Agner Fog says despite what the Intel manual says he couldn't get above 4.

1. Am I correct in writing that the original reason for HT or SMT is to utilize all of the unused resources in the CPU? With GC able, in the best case, to execute 6 instructions at a time, unless the code has a lot of parallelism there will be plenty of resources left to run more than one thread.

2. ....would you expect the additional performance due to HT to increase with the width of the CPU?

Yes, but some of the extra gains will be mitigated because of other parts that improve ILP such as improved branch prediction and larger OoOE resources.

3. Do we know on average how many instructions per cycle a CPU is executing? Obviously the bounds for Golden Cove are 1 and 6, but when running Cinebench R23 ST on average how many instructions do you think are being executed per cycle? 3? 4? 4.5?

You can go a lot under 1. Transactional benchmarks benefit a lot from SMT for the same reason.

4. Does Amdahl's law apply to the width of a CPU in the same way it applies to multicore CPUs? Are we reaching a point of very small payback as we increase the width beyond 6 decode?

It's different. The wider issue works out because there are multiple instruction streams. Out-of-order execution is what allowed superscalar to be effective, since it speculatively allows a second stream to go before the first one is done. So if that's easy to do, then width will scale pretty much indefinitely. But there is code that is fundamentally limited. Rather than hitting an Amdahl's limit, you will simply run into more and more scenarios where it won't scale because you can't break down the code to take advantage of the increased width. But as long as the other parts get wider, smarter, and better, performance will increase.
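For reference, the multicore form of Amdahl's law that the question refers to can be sketched as below. This is a minimal illustration and not something from the thread: `amdahl_speedup` and the 90% parallel fraction are hypothetical choices. Applying the formula to pipeline width is at best a loose analogy, since width is limited by the instruction-level parallelism available in the code rather than by a fixed serial fraction.

```python
# Amdahl's law for N parallel units: speedup = 1 / ((1 - p) + p / N),
# where p is the fraction of the work that can run in parallel.

def amdahl_speedup(parallel_fraction, n_units):
    """Return the ideal speedup on n_units for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)

p = 0.9  # hypothetical: 90% of the work parallelizes
for n in (2, 4, 8, 16, 32):
    print(f"{n:>2} units: {amdahl_speedup(p, n):.2f}x")

# The speedup can never exceed 1 / (1 - p), here 10x, however many units are added.
```

The output shows the diminishing returns the question is asking about: each doubling of units buys less, and the curve flattens toward the 1 / (1 - p) ceiling.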

Code sizes also continue to grow as well.

5. Gracemont is as wide as Golden Cove on the front end and wider on the back end as compared to Golden Cove yet has much lower throughput. Is this primarily because the Out-of-Order intelligence isn't as sophisticated as GC? If not then why?

Deeper BTB buffers, faster execution units, uop cache in addition to the traditional pipeline (although the lower number of stages on Gracemont makes up somewhat), better Load/Store capabilities, larger buffers both for OoO and execution units.

So a lot of them are details that are not/can't be shown in PowerPoint. Gracemont may have more dedicated ports but they are simpler and dedicated to the task. When it comes to instruction latency Golden Cove likely has lower latency and higher throughput. Like how Pentium 4 had double clocked ALUs but for simple instructions.
 

eek2121

Platinum Member
Aug 2, 2005
2,883
3,860
136
I think you nailed it. The key is this old rumor where DLVR is on the mobile section, and not on the desktop section. But in the other generations a new feature was listed in both mobile and desktop. DLVR is supposed to help mostly in low power situations, so while it helps desktop's power usage, it really helps mobile the most.

If Meteor Lake does turn out to be mobile only, then Raptor Lake is what Intel has on desktop for ~12 months until Arrow Lake comes around. Compared to Alder Lake:
  1. Higher top speed (5.5 GHz rumored),
  2. Lower PL2 (228 W vs 241 W),
  3. Slightly more efficient due to the combination of the two above,
  4. More E cores,
  5. More cache,
  6. Better memory support (DDR5-5600 vs DDR5-4800).

I keep seeing people post that they think clocks will go higher in the future. Not gonna happen.

Thermal density is a huge issue. Clocks will likely regress slightly.
 

Hulk

Diamond Member
Oct 9, 1999
4,187
1,970
136
AMD may be trying to laugh off those E's right now but they are going to be a thorn in their side I predict. 8+16 Raptor Lake assuming no IPC improvements is going to score around 35000 in CB R23. That would require a 33.5% boost in performance for Zen 3 for the 6950 (or whatever) just to match it. With the tiny amount of die area a Gracemont cluster requires for the MT compute it provides it's going to be relatively easy for Intel to add clusters should they need MT performance.

In addition, Intel can increase Golden Cove IPC by adding more transistors, which can be recovered by the E's in a sense. The hybrid approach gives them flexibility to compete moving forward.
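As a sanity check on the numbers above, here is a small sketch of the uplift arithmetic. The 35000 target and the 33.5% figure come from this post; the baseline score is back-calculated from those two numbers rather than measured, and the function names are made up for the example.

```python
# Uplift arithmetic for the CB R23 estimate in the post.
# The baseline is implied by the post's own figures, not an independent measurement.

def implied_baseline(target_score, uplift_pct):
    """Return the baseline score that needs uplift_pct to reach target_score."""
    return target_score / (1 + uplift_pct / 100)

def required_uplift_pct(target_score, baseline_score):
    """Return the percentage gain needed to go from baseline_score to target_score."""
    return (target_score / baseline_score - 1) * 100

target = 35_000                       # projected 8+16 Raptor Lake score from the post
baseline = implied_baseline(target, 33.5)

print(f"Implied current baseline: {baseline:.0f}")
print(f"Uplift needed to match:   {required_uplift_pct(target, baseline):.1f}%")
```

In other words, the post's figures imply a current baseline of roughly 26,200 points for the part being compared.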
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,439
14,409
136
AMD may be trying to laugh off those E's right now but they are going to be a thorn in their side I predict. 8+16 Raptor Lake assuming no IPC improvements is going to score around 35000 in CB R23. That would require a 33.5% boost in performance for Zen 3 for the 6950 (or whatever) just to match it. With the tiny amount of die area a Gracemont cluster requires for the MT compute it provides it's going to be relatively easy for Intel to add clusters should they need MT performance.

In addition, Intel can increase Golden Cove IPC by adding more transistors, which can be recovered by the E's in a sense. The hybrid approach gives them flexibility to compete moving forward.
Not sure here... While it's nice to get more MT performance, it all depends on the task. Case in point (and I am sure there are other real-world examples): when I do DC work, it creates 32 threads (32 individual tasks working on 32 different problems) for a 5950x. If 16 of those were Gracemont, it would not do as well. Also, there is scheduling, which I am sure will not get totally worked out for years.

I think that there will be a lot of changes in both camps, and Alder Lake is certainly a nice step forward; I am just not convinced it's the future. CB23 is one application; you need to compare a lot more applications to evaluate.
 