Discussion Intel’s Unified Core: There is hope

Geddagod

Golden Member
Dec 28, 2021
1,378
1,461
106
They have HX for that
But if the SOC die is different like it is rn, they lose a bunch of battery life, so Intel would have to figure something out for that.
But my main point there was that the 8+16 N2 die looks like it is going to get a lot of use lol.
Where they can simply swallow the cost
Exactly, but other segments can't swallow the cost. The reason Intel is using 18A-P there is purely a financial reason, not because of Intel 18A-P being close to N2.
Who is going to sponsor them? They don't have much money left. If they want to ramp quickly, they need money; they are just giving them the maximum flexibility in terms of operation and cost. Ramping a fab costs $$$, and they don't have much money.
If this is the case, the IO die, iGPU die, and Wildcat Lake can all be shifted over to different nodes or have their plans changed to accommodate 8+16 dies on 18A-P.
Intel is wasting a bunch of money, and is also getting a bunch of bad investor press, by going external. So they have the money to do large payments to TSMC to use their N2 node, but not enough to expand capacity for 18A? You aren't even building whole new fabs, all you are doing is expanding capacity.
Capacity reason makes 0 sense. Even Intel isn't claiming this is the case- it's performance and timing apparently.
And the timing reason is BS too...
[Attached image: 1753042969690.png]
Wildcat Lake is like 70mm2 at max lol with a 2+4 config.
It's low end, high volume product.
Also an 8+16 tile isn't all that much larger than that estimate (no idea how accurate it is).
 
  • Like
Reactions: Joe NYC

Philste

Senior member
Oct 13, 2023
296
474
96
No? It's a slide for Nova Lake S.
Oh come on, you really gonna claim that Computerbase is straightup lying to their readers and that Intel puts out kinda polished looking slides with footnotes for internal use?

NVL is barely taped out yet and they are doing performance claims? Claim "leadership gaming performance" on a product that's far from final clockrates? Also they are 35% back in gaming lol. Also "new low power island" as one of the most important claims for the new Desktop CPU?

Don't get me wrong, I don't want to start any hype train or whatever by saying this isn't for NVL. NVL might end up at the same 1.1 ST. But this slide has 0 indication of being based on NVL. Instead it fits PTL kinda well. They will probably still do a tiny PTL paper launch at the end of this year, so it would make sense to have such polished-looking slides out by now.
 
  • Like
Reactions: Io Magnesso

Kepler_L2

Senior member
Sep 6, 2020
890
3,629
136
Oh come on, you really gonna claim that Computerbase is straightup lying to their readers and that Intel puts out kinda polished looking slides with footnotes for internal use?

NVL is barely taped out yet and they are doing performance claims? Claim "leadership gaming performance" on a product that's far from final clockrates? Also they are 35% back in gaming lol. Also "new low power island" as one of the most important claims for the new Desktop CPU?

Don't get me wrong, I don't want to start any hype train or whatever by saying this isn't for NVL. NVL might end up at the same 1.1 ST. But this slide has 0 indication of being based on NVL. Instead it fits PTL kinda well. They will probably still do a tiny PTL paper launch at the end of this year, so it would make sense to have such polished-looking slides out by now.
It's literally part of a presentation sent to AIBs which included a picture of Nova Lake-S.
 
  • Wow
Reactions: Joe NYC

Doug S

Diamond Member
Feb 8, 2020
3,309
5,753
136
Who is going to sponsor them? They don't have much money left. If they want to ramp quickly, they need money; they are just giving them the maximum flexibility in terms of operation and cost. Ramping a fab costs $$$, and they don't have much money.
Wildcat Lake is like 70mm2 at max lol with a 2+4 config.

Who needs to "sponsor" them if they were adding more capacity in response to more demand? More demand means they'd be generating more profit.

One can assume if they are talking about being able to ramp quickly they aren't talking about breaking ground on a new fab or even expanding an existing fab, but using fab space they already have fully finished and have mostly equipped.
 

Geddagod

Golden Member
Dec 28, 2021
1,378
1,461
106
Oh come on, you really gonna claim that Computerbase is straightup lying to their readers
They weren't nearly as confident about the claim that it's PTL as you are lol
and that Intel puts out kinda polished looking slides with footnotes for internal use?
Intel communicates with external partners about perf projections for future products long before just ~1 year from launch.
NVL is barely taped out yet and they are doing performance claims?
Completely normal. We got even better looking slides and perf claims from the ARL-S igor leak a year away from launch, remember?
Instead it fits kinda good to PTL.
How exactly is PTL going to get a >10% ST gain over ARL-H, or tbf even LNL too?
And why would they compare PTL-H vs LNL?

Besides, it's not just Kepler who thinks this. Uzzi repeated similar claims on reddit.
 
  • Like
Reactions: Joe NYC

Io Magnesso

Senior member
Jun 12, 2025
560
148
71
They weren't nearly as confident about the claim that it's PTL as you are lol

Intel communicates with external partners about perf projections for future products long before just ~1 year from launch.

Completely normal. We got even better looking slides and perf claims from the ARL-S igor leak a year away from launch, remember?

How exactly is PTL going to get a >10% ST gain over ARL-H, or tbf even LNL too?
And why would they compare PTL-H vs LNL?

Besides, it's not just Kepler who thinks this. Uzzi repeated similar claims on reddit.
Well, even assuming 10% of Panther Lake's performance, the performance is quite high.
Performance is actually degraded
 

Geddagod

Golden Member
Dec 28, 2021
1,378
1,461
106
The e-core by 2028 should gain IPC by at least 40%.

Arctic Wolf - 20% IPC gain 2026
Golden Eagle - 20% IPC gain 2027
I hope Arctic Wolf gets that much of an IPC uplift (would be hilarious too since atp the E-cores and P-cores would be like what, within 5-10% of each other in IPC), but I doubt they do.
If one of Arctic Wolf's major goals is being able to implement AVX-512, I think a lot of the transistor and engineering budget would go into making that happen.
We have seen how much of an area impact buffing the FPU like this could make. AMD's Zen 2 FPU changes, and AMD's Zen 5 desktop vs Zen 5 mobile FPU changes, both double the area of the FPU- and those aren't even increasing vector width...
Ik Raichu has spitballed that number, but I personally find it hard to believe.
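For reference, the two per-generation figures quoted above compound multiplicatively rather than add. A minimal sketch of that arithmetic, using only the rumored 20% + 20% numbers from the quote (the function name `compound_gain` is made up for illustration):

```python
def compound_gain(gains):
    """Return the cumulative fractional gain from successive per-generation gains."""
    total = 1.0
    for g in gains:
        total *= 1.0 + g  # each generation multiplies the previous result
    return total - 1.0

# Per-generation IPC gains as quoted in the thread (rumors, not confirmed figures).
rumored = {"Arctic Wolf": 0.20, "Golden Eagle": 0.20}

cumulative = compound_gain(rumored.values())
print(f"Cumulative IPC gain: {cumulative:.0%}")
```

Two 20% steps land at 1.2 x 1.2 = 1.44, which is where the "at least 40%" in the quote comes from. Whether either step is realistic is the open question.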
 

Kepler_L2

Senior member
Sep 6, 2020
890
3,629
136
I hope Arctic Wolf gets that much of an IPC uplift (would be hilarious too since atp the E-cores and P-cores would be like what, within 5-10% of each other in IPC), but I doubt they do.
If one of Arctic Wolf's major goals is being able to implement AVX-512, I think a lot of the transistor and engineering budget would go into making that happen.
We have seen how much of an area impact buffing the FPU like this could make. AMD's Zen 2 FPU changes, and AMD's Zen 5 desktop vs Zen 5 mobile FPU changes, both double the area of the FPU- and those aren't even increasing vector width...
Ik Raichu has spitballed that number, but I personally find it hard to believe.
Especially with how small the cores are.
 

DavidC1

Golden Member
Dec 29, 2023
1,650
2,704
96
I hope Arctic Wolf gets that much of an IPC uplift (would be hilarious too since atp the E-cores and P-cores would be like what, within 5-10% of each other in IPC), but I doubt they do.
It would be greatly off the trend, since they've maintained an even greater 30% gain every generation while keeping the area/power increase linear.

Even though they haven't talked about it, it's pretty clear to me what their goals are:
-Linear xtor increase with respect to performance
-Switching between big expansions and new innovations

It makes no sense they would add a fourth cluster to get ~10% or something in the gutter, because Darkmont is changing trivially to get 3-4% improvements. Also because their cores have been loosely following ARM cores and are clocked in a similar range, and ARM cores are massively faster, meaning the gap can be closed.
Especially with how small the cores are.
Knights Landing implemented full-width AVX-512 quite area efficiently. It was way smaller than the one in Skylake. It was small enough that I speculated area-wise they could make a GPU out of such and the Silvermont+AVX cores could replace HD Graphics of that generation.
 

Geddagod

Golden Member
Dec 28, 2021
1,378
1,461
106
It would be greatly off the trend, since they've maintained an even greater 30% gain every generation while keeping the area/power increase linear.
Haven't they remained at 128 bit since forever too though?
And it's also much easier to scale up when you are starting at a much lower point.
I also am not confident in the whole area/power at a linear gain point either. I wouldn't be surprised if Crestmont ported over to N3 would be outright better PPA and power at the lower end of the curve.
Also because their cores have been loosely following ARM cores and are clocked in a similar range, and ARM cores are massively faster, meaning the gap can be closed.
ARM cores aren't massively faster. David Huang has a ~15% difference between a P-core in the SD8E and a 265k E-core.
Maybe Arctic Wolf is that large of an IPC uplift though, I don't think it's impossible. I just find it hard to believe.
I do think Intel's E-cores area advantage is going to take a large hit though.
Knights Landing implemented full-width AVX-512 quite area efficiently. It was way smaller than the one in Skylake. It was small enough that I speculated area-wise they could make a GPU out of such and the Silvermont+AVX cores could replace HD Graphics of that generation.
AVX-512 was 40% of the core area in Knights Landing.
 

DavidC1

Golden Member
Dec 29, 2023
1,650
2,704
96
I wouldn't be surprised if Crestmont ported over to N3 would be outright better PPA and power at the lower end of the curve.
Of course it is. All cores function like that. But at the high end of the curve it's linear with respect to power/area.
ARM cores aren't massively faster. David Huang has a ~15% difference between a P-core in the SD8E and a 265k E-core.
Maybe Arctic Wolf is that large of an IPC uplift though, I don't think it's impossible. I just find it hard to believe.
I do think Intel's E-cores area advantage is going to take a large hit though.
It's established that Skymont is around X2/X3. X925 is quite a bit faster while still fitting into phones. Of course we can't ignore Apple either.
AVX-512 was 40% of the core area in Knights Landing.
Silvermont was also a teeny tiny core. The 14nm version proved that they could be ARM equivalent in terms of area and performance.

Here's my analysis for 14nm Airmont: https://forums.anandtech.com/thread...a15-jaguar-atom-haswell.2294334/post-37319343

Also from Hiroshige Goto:

0.85mm2 area in Intel's 14nm process. I found my analysis a while ago on Goldmont and it was 1.1mm2. In Goldmont they made FP out of order and also fully pipelined so it can sustain 2x DP FP instructions.
Tremont cores are 0.85mm2 while only the AVX2/FP units are 0.62mm2 in Sunny Cove.

The AVX-512/FP block on the 14nm Knights Landing chip is 1.2mm2. Meaning if it only shrinks by half on 10nm, we get to 0.6mm2, which is the same size as the one in Sunny Cove, but with twice the width.
In Grace/Sky it grew above 30%, but it also gained massive FP capabilities, first with FMA and second with a literal doubling of FP units, which brought massive 60%+ FP gains on Skymont. If you consider that the FP block takes about 25% of the area, then the area increase is basically 1.3x uarch + a doubled FP block.
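The area comparison above can be sanity-checked with a few lines. This is only a sketch of the arithmetic using the figures quoted in this post; the 0.5x shrink from 14nm to 10nm is the post's own assumption, not a measured number, and the variable names are made up for illustration.

```python
# Area figures quoted in the post, in mm^2.
knl_fp_14nm = 1.2     # AVX-512/FP block, 14nm Knights Landing
sunny_cove_fp = 0.62  # AVX2/FP units, Sunny Cove
tremont_core = 0.85   # whole Tremont core

# Assumption taken from the post: the block "only shrinks by half" on 10nm.
shrink_14_to_10 = 0.5
knl_fp_10nm = knl_fp_14nm * shrink_14_to_10

print(f"KNL AVX-512/FP block scaled to 10nm: {knl_fp_10nm:.2f} mm^2")
print(f"Sunny Cove AVX2/FP units:            {sunny_cove_fp:.2f} mm^2")
print(f"Sunny Cove FP area vs. whole Tremont core: {sunny_cove_fp / tremont_core:.0%}")
```

Under that shrink assumption the 512-bit Knights Landing block lands at roughly the same area as Sunny Cove's 256-bit FP units, which is the point being made.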
 
Last edited:

Doug S

Diamond Member
Feb 8, 2020
3,309
5,753
136
I hope Arctic Wolf gets that much of an IPC uplift (would be hilarious too since atp the E-cores and P-cores would be like what, within 5-10% of each other in IPC), but I doubt they do.

Hilarious why? Am I missing something or isn't the "unified core" thing just an admission from Intel that the E cores will catch up to the P cores in a few years so they're dropping the P core?

If that's the case don't we all have to wonder why? Is the P core team just a bad team? Is the E core team just a great team? Or is the P core team not allowed to start with a "clean sheet" but just iterating the previous designs, so it is basically the silicon equivalent of unmaintainable spaghetti code at this point while the E core is the silicon equivalent of a relatively recent fresh from scratch code rewrite?
 

Kepler_L2

Senior member
Sep 6, 2020
890
3,629
136
If that's the case don't we all have to wonder why? Is the P core team just a bad team? Is the E core team just a great team? Or is the P core team not allowed to start with a "clean sheet" but just iterating the previous designs, so it is basically the silicon equivalent of unmaintainable spaghetti code at this point while the E core is the silicon equivalent of a relatively recent fresh from scratch code rewrite?
AFAIK the P-core team is mostly old engineers, who seem afraid of trying new things and just stick with "tried and true" methods, while the E-core is much younger on average and are willing to take risks and do things differently.
 
  • Like
Reactions: Joe NYC

DavidC1

Golden Member
Dec 29, 2023
1,650
2,704
96
And it's also much easier to scale up when you are starting at a much lower point.
This is too easy of an explanation.

The P cores never had xtor efficient increases. Not since Pentium M. It was a 50% increase for 15-20% perf. In Nehalem we didn't even get that in single thread.
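To put a number on "xtor efficient": a rough sketch comparing performance per transistor after one generation, using the 50% / 15-20% figures above and the roughly linear 30% E-core trend mentioned earlier in the thread. The helper name is hypothetical and the inputs are forum estimates, not measured data.

```python
def perf_per_transistor(perf_gain, xtor_gain):
    """Relative perf-per-transistor after one generation (1.0 means unchanged)."""
    return (1.0 + perf_gain) / (1.0 + xtor_gain)

# P-core pattern as described above: +50% transistors for +15-20% performance.
p_low = perf_per_transistor(0.15, 0.50)
p_high = perf_per_transistor(0.20, 0.50)

# E-core pattern as claimed in the thread: ~30% performance for ~30% transistors.
e_linear = perf_per_transistor(0.30, 0.30)

print(f"P-core perf/xtor per generation: {p_low:.2f}x to {p_high:.2f}x")
print(f"E-core perf/xtor per generation: {e_linear:.2f}x")
```

On those inputs the P-core pattern loses about a fifth of its perf per transistor each generation, while a linear pattern holds it constant.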

In contrast, every single E core has been a targeted increase, carefully considering all aspects to improve efficiently. Note what C&C interview said about Skymont:
-"Why go with the 3 by 3 decode cluster?” To which Stephen said, “It was a statistical bet. And while three 3-wide decoders is a little bit more expensive in terms of the number of transistors than a two by 4-wide decode setup, it better fits the x86 ISA."
-"Skymont duplicates microcode for the most common complex instructions across all three clusters, letting them handle those instructions without blocking each other."
Technically they could have just done FastPath for such instructions like they did for P cores.

They have not. Why? I could speculate on this. Because if previously your instructions were 0.25x the speed of normal instructions, then suddenly you don't need them to be 10x as fast. FastPath is power hungry.

So rather than that you speed them up to be on par with regular instructions. Targeted, with little/no waste, taking advantage of what's already there.
-"Skymont widens the retirement stage from 8 to 16 micro-ops per cycle, which feels unbalanced because micro-ops can leave the backend twice as fast as they can enter it. But Intel found they could make various buffers, queues, and register files a bit smaller if they could free up entries in those structures faster. Overall, overbuilding the retirement stage was cheaper than adding more reordering capacity."
Different way of thinking to improve efficiently.
-"Intel explains that by saying that dedicated functionality on each port is better for energy efficiency."
In Gracemont, they removed a feature that they've been using for 3 generations called L2 predecode cache. A potentially risky move, but if it works out then you increase efficiency. Gracemont's end result shows they have achieved it with a new feature called OD-ILD.

So in the predecessor just two years ago they introduced a novel new feature that solves x86's decode problems, but in the successor it introduces another big feature while removing one that worked for three gens.
 
Last edited:
  • Like
Reactions: 511

Joe NYC

Diamond Member
Jun 26, 2021
3,260
4,746
136
No one invited me to this thread nooooo

This is nonsense. Xeon 7 is not using Lion Cove; it's using Panther Cove without HT, and DMR is a solid upgrade over GNR. Not to mention we have Rouge River Forest based on Arctic Wolf with at least 288 cores and APX/AVX10.

I am a little confused here: Isn't Xeon 7 P Core called Diamond Rapids (DMR)? Are you saying Diamond Rapids will not have HT?
 

DavidC1

Golden Member
Dec 29, 2023
1,650
2,704
96
AFAIK the P-core team is mostly old engineers, who seem afraid of trying new things and just stick with "tried and true" methods, while the E-core is much younger on average and are willing to take risks and do things differently.
That wasn't true when they started Pentium M.

I liked Exist's explanation: The E core team had to execute because they were under constant threat of extinction if they did not. Also, they were constrained, which often brings out innovation among us humans. You have to be power efficient, you have to be area efficient. Basically, the constraints given are the goals that force the engineers to do better than they could otherwise.

The P core team aside from politics had egos reaching into the stratosphere. If you believe you are at the top, then you aren't so motivated to advance.
The P core is just sad at this point... Skymont is great because the area is so small. I have high hopes for the E core in Nova Lake.
Such tests also favor the higher power cores because the overhead is relatively less.
 
  • Like
Reactions: Joe NYC

Joe NYC

Diamond Member
Jun 26, 2021
3,260
4,746
136
Hilarious why? Am I missing something or isn't the "unified core" thing just an admission from Intel that the E cores will catch up to the P cores in a few years so they're dropping the P core?

That's what my read would be, that E-Core team will catch up in IPC, features, but at lower power and smaller die size.
 

Geddagod

Golden Member
Dec 28, 2021
1,378
1,461
106
This is too easy of an explanation.
It's a way too easy of an explanation as to why I don't think Arctic Wolf is going to be a 30% uplift over Skymont?
Hilarious why? Am I missing something or isn't the "unified core" thing just an admission from Intel that the E cores will catch up to the P cores in a few years so they're dropping the P core?
While the E-core team is supposed to be the team in charge of unified core, it would be surprising for the E-cores to essentially already be caught up in IPC even before that.
I am a little confused here: Isn't Xeon 7 P Core called Diamond Rapids (DMR)?
That's the chip name, not the core name
Are you saying Diamond Rapids will not have HT?
Would be surprised if it doesn't have SMT, though that seems to be the case.