Discussion Intel current and future Lakes & Rapids thread


moinmoin

Diamond Member
Jun 1, 2017
4,944
7,656
136
Notice the lack of mention of Raptor Lake mobile there btw. It's an important detail.
Does that even exist? I thought it was all ADL rebadged as 13th gen, except for the high-end DTR HX series using the one RPL desktop K die.
 

uzzi38

Platinum Member
Oct 16, 2019
2,624
5,894
146
Does that even exist? I thought it was all ADL rebadged as 13th gen, except for the high-end DTR HX series using the one RPL desktop K die.
It is pretty much rebadged ADL, yes.

But it also runs higher peak clocks and I believe is also more efficient due to process improvements.
 

Timmah!

Golden Member
Jul 24, 2010
1,418
630
136

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Branch prediction improves nearly every generation, doesn't it? And branch prediction changes usually don't add much IPC anyway, right?

On the contrary. Branch prediction is pretty much a necessity: it addresses one of the biggest bottlenecks to instruction-level parallelism, and it's one of the only ways to consistently increase single-thread performance. Without the constant work on branch prediction over the past 20 years, the performance we get from CPUs nowadays wouldn't be possible, and most of the other features would be rendered null. It's what allows modern 6+ wide decoder setups to be usable.

Remember, the reason much-criticized architectures like Netburst (Pentium 4) and AMD's Bulldozer performed badly was that their pipeline length increased by more than branch predictor improvements could counter.
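The Netburst point can be sketched with a toy throughput model. All the numbers below are illustrative assumptions (issue width, branch frequency, miss rate, flush penalties), not measurements of any real core; the point is just that a deeper pipeline pays a larger flush penalty per mispredicted branch.

```python
# Toy model (illustrative numbers only, not measurements of any real core):
# effective IPC = 1 / (issue cost per instruction + amortized flush cost).
def effective_ipc(width, branch_freq, mispredict_rate, flush_penalty):
    cpi = 1.0 / width + branch_freq * mispredict_rate * flush_penalty
    return 1.0 / cpi

# Assume ~1 branch per 5 instructions and a 2% miss rate; the deeper
# Netburst-style pipeline pays a much larger flush penalty per miss.
short_pipe = effective_ipc(width=4, branch_freq=0.2, mispredict_rate=0.02, flush_penalty=12)
deep_pipe  = effective_ipc(width=4, branch_freq=0.2, mispredict_rate=0.02, flush_penalty=30)
# The deeper pipeline loses IPC purely to the larger misprediction penalty,
# which is why it needs a better predictor just to stay even.
```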

Increasing cache, by contrast, is very easy to do, even though inner caches like the L1 are harder to grow than the L3.

Also, branch predictor enhancements require significant work, because a lot of design thought is needed rather than just increasing the size of the buffers, even though sizing is part of it. Its importance to performance is why any significant uarch change is accompanied by branch prediction improvements.

The biggest reason for load/store changes lies in the continual work on vector performance. It benefits general-purpose code, but at some point it's mostly for wider AVX. You'll notice that every time the SIMD width doubles, so does the load/store width.
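As a rough sketch of that last point (the unit counts below are assumptions, not any specific Intel core): in a streaming kernel that reads two fresh operands per FMA, the load data path has to double whenever the SIMD width does, or the FMA units starve.

```python
# Rough sketch (assumed unit counts, not any specific core): load bandwidth
# needed per cycle to keep vector FMA units fed in a streaming kernel that
# reads two fresh operands per FMA.
def load_bytes_per_cycle(fma_units, simd_bits, loads_per_fma=2):
    return fma_units * loads_per_fma * simd_bits // 8

avx2_needs   = load_bytes_per_cycle(fma_units=2, simd_bits=256)  # 128 B/cycle
avx512_needs = load_bytes_per_cycle(fma_units=2, simd_bits=512)  # 256 B/cycle
# Doubling the SIMD width doubles the load/store data path needed to match.
```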
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
Why only 40 numbers? That's neither the number of cores nor of threads.
Additionally, the numbers are all over the place. So what is the all-core clock out of these: 3.8, 3.9, 4.0, or 4.1? Let's see the final product and its pricing. If it's 2500 like the TR 5965X, then lol.
It's the number of times the benchmark detected the highest single-thread speed per core. It does not always jump on all cores, but it jumped 40 times (at least as detected).


All-core boost is much lower than the recorded ST core speed. GB5 only reports max core speeds, and those are always ST on stock settings.
 

Timmah!

Golden Member
Jul 24, 2010
1,418
630
136
It's the number of times the benchmark detected the highest single-thread speed per core. It does not always jump on all cores, but it jumped 40 times (at least as detected).

All-core boost is much lower than the recorded ST core speed. GB5 only reports max core speeds, and those are always ST on stock settings.

I am aware the MT clock will be lower than the ST clock, but the ST clock is supposed to be 4.6/4.8 GHz. That much we know from the slides, and those aren't the clocks we are seeing here either.
MT clocks will very likely be around 3.8 GHz, but maybe they could be more; that's what I am curious about.
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
I am aware the MT clock will be lower than the ST clock, but the ST clock is supposed to be 4.6/4.8 GHz. That much we know from the slides, and those aren't the clocks we are seeing here either.
MT clocks will very likely be around 3.8 GHz, but maybe they could be more; that's what I am curious about.
True, the ST clock should be about 15% higher, but MT will remain about the same.
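Quick arithmetic check on that claim, assuming the ~4.1 GHz peak seen in this GB5 run is the right baseline (that choice of reading is an assumption): +15% on the ES single-thread clock lands right around the 4.6/4.8 GHz figures from the slides.

```python
# Assumed baseline: highest single-core speed recorded in the GB5 run.
es_st_ghz = 4.1
# ~15% higher ST clock expected on retail silicon.
retail_st_ghz = es_st_ghz * 1.15  # = 4.715 GHz, near the slides' 4.6/4.8
```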

 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
It won't have good ST performance no matter what. Four-channel DDR5 means a latency penalty, and the L3 penalty of the mesh architecture makes an already bad situation worse. The L3 is also anemic in size and can't really feed that many P cores.
This is not an enthusiast platform but a specialty one: all it has is AVX-512 combined with proper I/O options for a workstation. So anyone buying these has already considered the 7950X (which has AVX-512 and decent ST/MT perf) and the 13900K (which has even better ST performance) and found both lacking.
 

Geddagod

Golden Member
Dec 28, 2021
1,149
1,007
106
On the contrary. Branch prediction is pretty much a necessity: it addresses one of the biggest bottlenecks to instruction-level parallelism, and it's one of the only ways to consistently increase single-thread performance. Without the constant work on branch prediction over the past 20 years, the performance we get from CPUs nowadays wouldn't be possible, and most of the other features would be rendered null. It's what allows modern 6+ wide decoder setups to be usable.

Remember, the reason much-criticized architectures like Netburst (Pentium 4) and AMD's Bulldozer performed badly was that their pipeline length increased by more than branch predictor improvements could counter.

Increasing cache, by contrast, is very easy to do, even though inner caches like the L1 are harder to grow than the L3.

Also, branch predictor enhancements require significant work, because a lot of design thought is needed rather than just increasing the size of the buffers, even though sizing is part of it. Its importance to performance is why any significant uarch change is accompanied by branch prediction improvements.

The biggest reason for load/store changes lies in the continual work on vector performance. It benefits general-purpose code, but at some point it's mostly for wider AVX. You'll notice that every time the SIMD width doubles, so does the load/store width.
While Intel doesn't often release IPC breakdowns of its new architectures, AMD does. And based on Zen 3 and Zen 4, branch prediction changes really didn't cause a major change in IPC compared to other changes in the architecture. And I swear Intel claims better branch prediction every gen, probably because they tweak it a bit each gen. Didn't they claim it for RPL as well?
 

nicalandia

Diamond Member
Jan 10, 2019
3,330
5,281
136
It won't have good ST performance no matter what. Four-channel DDR5 means a latency penalty, and the L3 penalty of the mesh architecture makes an already bad situation worse. The L3 is also anemic in size and can't really feed that many P cores.

I agree. Quad-channel memory, low per-core L3 (1.875 MiB per core), and the mesh interconnect are not so great.
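For reference, that per-core figure lines up with dividing total L3 by core count, assuming the 56-core top-end Sapphire Rapids workstation configuration with 105 MiB of shared L3 (the specific SKU is my assumption here):

```python
# Sanity check of the per-core L3 figure (assuming a 56-core part
# with 105 MiB of shared L3, as on top-end Sapphire Rapids).
total_l3_mib = 105
cores = 56
l3_per_core = total_l3_mib / cores  # = 1.875 MiB per core
```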
 

Timmah!

Golden Member
Jul 24, 2010
1,418
630
136
It won't have good ST performance no matter what. Four-channel DDR5 means a latency penalty, and the L3 penalty of the mesh architecture makes an already bad situation worse. The L3 is also anemic in size and can't really feed that many P cores.
This is not an enthusiast platform but a specialty one: all it has is AVX-512 combined with proper I/O options for a workstation. So anyone buying these has already considered the 7950X (which has AVX-512 and decent ST/MT perf) and the 13900K (which has even better ST performance) and found both lacking.

It's only funny that it will probably be 3x-5x more expensive than either the 13900K or the 7950X, basically only because of that I/O, which you already pay for in the motherboard price.
 

Exist50

Platinum Member
Aug 18, 2016
2,445
3,043
136
Does that even exist? I thought it was all ADL rebadged as 13th gen, except for the high-end DTR HX series using the one RPL desktop K die.
There appears to be a legitimately new die for Raptor Lake mobile, but without the L2 increase we see in desktop Raptor Cove. It's kinda baffling. Maybe they were pinning their hopes on getting DLVR working?
 

BorisTheBlade82

Senior member
May 1, 2020
663
1,014
106
Intel charges too much to get into consoles.
Well, Intel was in the first Xbox. And if they need some utilisation for their fabs - which seems quite likely ATM - then why not?
I wouldn't say that they are too expensive by definition.

Branch prediction improves nearly every generation, doesn't it? And branch prediction changes usually don't add much IPC anyway, right?
I don't really recall any leaker talking about increased OoO buffers anywhere, though.
The wild card, I would argue, is how a potentially larger L1 instruction cache, as well as potentially better load/store, impacts performance. And maybe there will be some touch-ups on integer execution, given that it did not shrink very well compared to the rest of the core and GLC focused heavily on the FP side.
AMD has explicitly stated the contributors to its IPC improvements since at least Zen 3. This one is for Zen 4.
[Attached slide: Zen 4 IPC improvement contributors]

As @IntelUser2000 already mentioned: in addition to direct gains, the branch predictor also acts as an enabler. And it plays a huge part in minimizing wasted processing, and therefore in improving efficiency.