Discussion Intel current and future Lakes & Rapids thread

Page 481

dullard

Elite Member
May 21, 2001
24,998
3,326
126
If the 30% single thread improvement is true then what's up with this picture? "up to 20% single thread performance" Fake or something to mislead the competition?
Neither. 20% more IPC (from the image), and slightly faster frequencies, and more time in turbo => 30% faster (in some programs).
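As a rough check of that arithmetic, here is a minimal sketch that assumes single-thread performance is simply IPC times effective clock (a simplification) and that the slide's 20% really is IPC, which is the reading disputed in the next post. The helper name `required_clock_uplift` is made up for illustration.

```python
def required_clock_uplift(ipc_gain: float, total_gain: float) -> float:
    """Return the extra effective-clock gain needed on top of an IPC gain.

    Gains are fractions (0.20 means +20%). Assumes perf = IPC * effective clock,
    where "effective clock" lumps together frequency and time spent in turbo.
    """
    return (1.0 + total_gain) / (1.0 + ipc_gain) - 1.0


# Figures from the post: +20% IPC, +30% single-thread overall.
uplift = required_clock_uplift(ipc_gain=0.20, total_gain=0.30)
print(f"Effective clock uplift needed: {uplift:.1%}")
```

Under those assumptions the gap between 20% and 30% closes with roughly an 8% gain in effective clock, since 1.30 / 1.20 is about 1.083.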
 

Racan

Golden Member
Sep 22, 2012
1,101
1,969
136
According to that slide it's (up to) +20% single threaded performance, not IPC. That means higher frequencies are already included there.

That's what I assume as well from the wording and I don't see how you could interpret it as an average either, because "up to" indicates a limit.
 
  • Like
Reactions: Lodix

dullard

Elite Member
May 21, 2001
24,998
3,326
126
According to that slide it's (up to) +20% single threaded performance, not IPC. That means higher frequencies are already included there.
Your version assumes Intel knows the final production clock speeds and turbo behavior in various systems 6 to 9 months before launch.
 

mikk

Diamond Member
May 15, 2012
4,112
2,108
136
Intel wasn't always that accurate in the past: the UHD 750's "up to 50%" turned out to be an average increase in the end, and surely not a best case. But yes, this slide is quite old by now; maybe they expected 5.0 instead of 5.3 GHz. Some people are giving too much attention to this slide.
 

Hulk

Diamond Member
Oct 9, 1999
4,191
1,975
136
Perhaps these ADL scores have been fine-tuned in software somehow? Let me elaborate.

There has been a lot of discussion throughout this thread regarding the complexity, or lack thereof, of tuning the Windows Scheduler to handle heterogeneous cores.

Perhaps this Cinebench score was achieved by allocating the optimum cores to threads in Cinebench. This might not be something accomplished by the Scheduler but more of a "hand tuning" written into the OS, benchmark software, or other utility? Just a thought.

If so, is this cheating? Perhaps, perhaps not. If this sort of hand tuning can be built into other software and it is transparent to the end user then it might be fair as long as the software you need incorporates said tuning.

The point is that benchmarks are important to both AMD and Intel, and generally hand tuning is better than machine-optimized "one size fits all" scheduling.
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
Not calling it overclock is still... Well, let's call it further progress in Intel's very own definition of TDP. 🤷

Welcome to Turbo Boost 2.0. It's almost 10 years old. It's probably been mentioned like 100 times in this thread so far: the TDP is for the base clock and not the all-core turbo.
 
  • Like
Reactions: pcp7

gdansk

Golden Member
Feb 8, 2011
1,979
2,355
136
Perhaps these ADL scores have been fine-tuned in software somehow? Let me elaborate.

There has been a lot of discussion throughout this thread regarding the complexity or lack there of in tuning the Windows Scheduler to handle heterogeneous cores.

Perhaps this Cinebench score was achieved by allocating the optimum cores to threads in Cinebench. This might not be something accomplished by the Scheduler but more of a "hand tuning" written into the OS, benchmark software, or other utility? Just a thought.

If so, is this cheating? Perhaps, perhaps not. If this sort of hand tuning can be built into other software and it is transparent to the end user then it might be fair as long as the software you need incorporates said tuning.

The point is that benchmarks are important to both AMD and Intel and generally hand tuning is better than machine optimized "one size fits" all scheduling.
In Cinebench? I doubt it. Heterogeneity-aware scheduling mainly applies when there are threads with varying degrees of priority and workloads. When running Cinebench, it's nearly the same workload on all cores for the duration of the run. In this case Windows has 24 very busy threads and Alder Lake has 24 'cores' to work on them.

You may see higher scores with throughput-priority schedulers and lower scores with latency-priority schedulers, but I expect boost behavior to trump any gains from scheduling when the CPU is saturated with available work.
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
None of the boost and TDP arguments take into account the Gracemont portion.

40-45% performance improvement per clock over Tremont means basically you get Core i5 Icelake out of the "Atoms" that'll succeed Jasper Lake.

Atom becoming an Atom Bomb.
 
  • Like
Reactions: lightmanek

Timorous

Golden Member
Oct 27, 2008
1,532
2,535
136
If the 30% single thread improvement is true then what's up with this picture? "up to 20% single thread performance" Fake or something to mislead the competition?

Intel-Alder-Lake-S-Specifications.jpg

Could be based on SPEC results rather than Cinebench.
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
There has been a lot of discussion throughout this thread regarding the complexity or lack there of in tuning the Windows Scheduler to handle heterogeneous cores.

Perhaps this Cinebench score was achieved by allocating the optimum cores to threads in Cinebench. This might not be something accomplished by the Scheduler but more of a "hand tuning" written into the OS, benchmark software, or other utility? Just a thought.

If so, is this cheating?
Unbelievable?

For your scheduling dilemma, look no further than the slide in the post quoted below: Hardware-Guided Scheduling. I suppose you bought into the Windows scheduler nonsense that the naysayers have propagated in this thread for so long?

If the 30% single thread improvement is true then what's up with this picture? "up to 20% single thread performance" Fake or something to mislead the competition?

Intel-Alder-Lake-S-Specifications.jpg
 

uzzi38

Platinum Member
Oct 16, 2019
2,565
5,575
146
Perhaps these ADL scores have been fine-tuned in software somehow? Let me elaborate.

There has been a lot of discussion throughout this thread regarding the complexity or lack there of in tuning the Windows Scheduler to handle heterogeneous cores.

Perhaps this Cinebench score was achieved by allocating the optimum cores to threads in Cinebench. This might not be something accomplished by the Scheduler but more of a "hand tuning" written into the OS, benchmark software, or other utility? Just a thought.

If so, is this cheating? Perhaps, perhaps not. If this sort of hand tuning can be built into other software and it is transparent to the end user then it might be fair as long as the software you need incorporates said tuning.

The point is that benchmarks are important to both AMD and Intel and generally hand tuning is better than machine optimized "one size fits" all scheduling.

What? Cinebench R20 saturates all threads with a full load, you don't have to worry about scheduling at all. You don't have to allocate anything to any specific cores at all either.
 

uzzi38

Platinum Member
Oct 16, 2019
2,565
5,575
146
Unbelievable?

For your scheduling dilemma, look no further than the slide from the quoted below: Hardware-Guided Scheduling. I suppose you bought into the windows scheduler nonsense that the naysayers have propagated in this thread for so long?

For the love of-

Hardware-guided is just a fancy way of saying "we have more performance counters that Windows can now use to get an accurate understanding of what's taking place in each core, and thus can schedule better". You still rely on the Windows scheduler at the end of the day.

You can't schedule around big.LITTLE purely in hardware. Seriously - Apple tried with the A10 and their first foray into heterogeneous architectures. They dropped it after just a single generation of use.
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
For the love of-

Hardware-guided is just a fancy way of saying "we have more performance counters that Windows can now use to get an accurate understanding of what's taking place in each core, and thus can schedule better". You still rely on the Windows scheduler at the end of the day.

You can't schedule around big.LITTLE purely in hardware. Seriously - Apple tried with the A10 and their first foray into heterogeneous architectures. They dropped it after just a single generation of use.
Of course. I thought that was implied. This is not the same as people thinking Intel left everything at the mercy of the Windows scheduler. If Intel can guide the Windows scheduler to optimize loads on the different cores, and even bothers to put it on a bullet point, is that not worth considering? Or should we rather chalk this quite incredible score up, albeit prematurely, to software-specific tuning, and therefore "cheating"?
 

insertcarehere

Senior member
Jan 17, 2013
639
607
136
None of the boost and TDP arguments take into account the Gracemont portion.

40-45% performance improvement per clock over Tremont means basically you get Core i5 Icelake out of the "Atoms" that'll succeed Jasper Lake.

Atom becoming an Atom Bomb.

Of course, if Gracemont actually achieved that sort of IPC increase while still retaining ARM-esque area/power, the question now becomes: Why even develop the Core line further if the Atom architecture doesn't sacrifice much performance but is miles & miles more efficient?
 
  • Like
Reactions: maddie and Tlh97

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Of course, if Gracemont actually achieved that sort of IPC increase while still retaining ARM-esque area/power, the question now becomes: Why even develop the Core line further if the Atom architecture doesn't sacrifice much performance but is miles & miles more efficient?

Because we still want and need maximum powah! :cool:

If Gracemont turns out to be a beast, what Intel should do is use it as a foundation to create more performant and efficient big CPU cores. There's always going to be a need for big, fast CPU cores no matter what.
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,686
136
There's always going to be a need for big, fast CPU cores no matter what.
There will always be a Lich King!

I think people need to be reminded of the performance delta between GC and GM in ST. If the leaks are correct then Golden Cove has a ~35% PPC and ~35% clock advantage over Gracemont, leading to a ~80% ST potential performance lead for the big core. This advantage may diminish if GM proves to have untapped OC potential, but then again we need to see PPC data for a lot more workloads before we get the full picture anyway.
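For anyone wondering how two ~35% figures turn into ~80%: the advantages multiply rather than add. A minimal sketch, assuming single-thread performance is just performance per clock times clock speed; the function name `combined_lead` is made up for illustration.

```python
def combined_lead(ppc_advantage: float, clock_advantage: float) -> float:
    """Combine a per-clock advantage and a clock advantage multiplicatively.

    Inputs and output are fractions (0.35 means +35%).
    Assumes performance = performance per clock * clock speed.
    """
    return (1.0 + ppc_advantage) * (1.0 + clock_advantage) - 1.0


# Leaked figures quoted in the post: ~35% PPC and ~35% clock for Golden Cove.
lead = combined_lead(0.35, 0.35)
print(f"Single-thread lead: {lead:.0%}")
```

1.35 times 1.35 is about 1.82, so the ~80% figure is this product rounded down slightly.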

It would be disappointing, if the only thing that Alder Lake-S can do is match the 5950X which was launched last year.
Come on, let's give credit where it's due. IF ADL does deliver a 20-30% increase in ST performance, then Intel's lineup will be very competitive where it really matters. Think ADL-S 6+0 versus 5600X, or ADL-S 8+SOMETHING versus 5800X. Depending on how they addressed inter-core latency, ADL may compete well even against Zen3D. We may be headed to the best scenario possible, where production capacity is the only consumer problem in 2022, not the choice of brand.

Oh well, long Q2&Q3 ahead...
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
It would be disappointing, if the only thing that Alder Lake-S can do is match the 5950X which was launched last year.

Well presumably, Intel has been sitting on Alder Lake for a long time now. The 10nm fiasco obviously created massive delays in their roadmap.

Golden Cove was supposed to have launched back in 2019 assuming Intel would have kept to their tick tock cadence. In the end, it all worked out for the best though I have to say. Intel's struggles with their 10nm node gave AMD a huge break and a chance to catch up and gain some market share. So now they are a genuine threat to Intel's dominance and a worthy competitor.

We the consumers end up winning in the end. :cool:
 
  • Like
Reactions: Mopetar

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I think people need to be reminded of the performance delta between GC and GM in ST. If the leaks are correct then Golden Cove has a ~35% PPC and ~35% clock advantage over Gracemont, leading to a ~80% ST potential performance lead for the big core.

11900K gets what, 1800 points in GB5? So Golden Cove is going to get 2450? Divide that by 1.8 and Gracemont at 3.9GHz gets 1350.
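A quick consistency check on those numbers, as a sketch only. It reuses the figures from this post and the quoted leak (2450 points, a 1.8x ST lead, a ~35% clock advantage, 3.9 GHz for Gracemont) and assumes score scales as per-clock performance times clock, which real benchmarks only approximate.

```python
# Figures come from the posts; variable names and the linear model are assumptions.
gc_score = 2450        # estimated Golden Cove GB5 single-thread score
st_lead = 1.8          # ~80% single-thread lead of Golden Cove over Gracemont
clock_lead = 1.35      # ~35% clock advantage from the leak
gm_clock_ghz = 3.9     # Gracemont clock assumed in the post

gm_score = gc_score / st_lead                 # Gracemont score estimate
gc_clock_ghz = gm_clock_ghz * clock_lead      # Golden Cove clock this implies
# Per-clock ratio: points per GHz on the big core vs the small core.
ppc_ratio = (gc_score / gc_clock_ghz) / (gm_score / gm_clock_ghz)

print(f"Gracemont estimate: {gm_score:.0f} points")
print(f"Implied Golden Cove clock: {gc_clock_ghz:.2f} GHz")
print(f"Implied per-clock lead: {ppc_ratio - 1:.0%}")
```

Dividing 2450 by 1.8 gives about 1361, which the post rounds to 1350. The implied Golden Cove clock lands just under 5.3 GHz, and the implied per-clock lead is about 33%, so the numbers hang together with the leaked ~35% figures.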

That would be nipping at the heels of Cortex X1's performance per clock! I'd be happy for them if it's true. Basically little over A75 to X1 in a single year.

1600-1800 Cinebench R20 for four, SMT-less Gracemont cores being a possibility? Holy crap! Asus Pentium Gold N7005 ITX would be a really nice board!

40-45% is starting to look like Piledriver to Zen jump.

Well presumably, Intel has been sitting on Alder Lake for a long time now. The 10nm fiasco obviously created massive delays in their roadmap.

This makes sense from a Core perspective, but not Atom. Goldmont and Goldmont Plus weren't delayed, and they are both 14nm, the upside of purposely putting value cores on an older process. Goldmont Plus is late 2017, so at the earliest they would have had Tremont ready by the Alder Lake timeframe.

Sure some portions may have been delayed, but looks like we'll have a substantially better Atom CPU with it.

I can't say Icelake of 2017 is the same as Icelake of 2020. They likely had to change aspects of the architecture to fit with the details of the actual 10nm process.
 
Last edited:
  • Like
Reactions: Tlh97

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
It would be disappointing, if the only thing that Alder Lake-S can do is match the 5950X which was launched last year.
Wow! You just erased every doubt I have for your AMD bias with this remark. The 5950x is an HEDT chip by every measure. The fact that AMD wanted to artificially maintain multithreaded superiority when Intel started releasing 8 core chips into the mainstream segment is not lost on some of us. AND, I doubt this chip is going to exhibit clear single-threaded weakness. You sound like you gave up the ghost. Relax. It's early days yet. The day @lobz types a concession is the day we'll all know for sure.
 

uzzi38

Platinum Member
Oct 16, 2019
2,565
5,575
146
Of course. I thought that was implied. This is not the same as people thinking Intel left everything at the mercy of the windows scheduler. If Intel can guide the windows scheduler to optimize loads on the different cores and even bothers to put it on a bullet point is that not worth considering? Or, we should rather chuck this quite incredible score, albeit prematurely, to software-specific tuning, and therefore "cheating."?
...it has nothing to do with "software-specific tuning" though, that's the bit you don't understand.

The problem with scheduling on heterogeneous CPUs is that you need to have an understanding as to the kind of instruction streams and priority each thread is using and best determine the core you're going to assign that thread to.

However, in the case of an application which loads the entire CPU with essentially the same workload, you don't really have to do anything at all. Just allocate each thread a core and it'll work out, regardless of which cores get assigned what threads.

There's no "cheating" or anything else; this is just the kind of workload I expect Alder Lake to have the easiest time in. R20 feels to me like a best-case scenario: it's a quick benchmark that finishes within PL2, stretches out to all cores with absolute ease and doesn't even need to access main memory often. It's literally the perfect workload to demo DDR5 Alder Lake's strength. What I want to know now is how the performance here extends out to other workloads.
 
  • Like
Reactions: Saylick and Tlh97

Asterox

Golden Member
May 15, 2012
1,026
1,775
136
Wow! You just erased every doubt I have for your AMD bias with this remark. The 5950x is an HEDT chip by every measure. The fact that AMD wanted to artificially maintain multithreaded superiority when Intel started releasing 8 core chips into the mainstream segment is not lost on some of us. AND, I doubt this chip is going to exhibit clear single-threaded weakness. You sound like you gave up the ghost. Relax. It's early days yet. The day @lobz types a concession is the day we'll all know for sure.

No, it is not an HEDT CPU.

For example, the old Ryzen 7 1800X is not an HEDT CPU. As we know, the i7 6900K is an HEDT CPU, and everything else is history.

An HEDT CPU has quad-channel DDR4 memory and many more PCIe lanes than a classic desktop CPU.

In the end, the most important thing is CPU price/performance. CPU power efficiency is also a very important detail.

 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
...it has nothing to do with "software-specific tuning" though, that's the bit you don't understand.
I said if he's looking for answers about scheduling, then he needs to take a look at the hardware-guided scheduling as well. Even for a throughput oriented bench like CBx, there are moments towards the end where you're left with fewer 'chunks' than there are cores/threads, so intelligent assignment could come in handy as to which cores get the last remaining bits. If there's any confusion here, I'm not the source of it because I didn't raise the scheduling question.
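The tail effect described here can be illustrated with a toy model. This is a sketch under loud assumptions: it is not how Cinebench, the Windows scheduler or Intel's hardware-guided scheduling actually work, the per-chunk times are hypothetical, and the function names are made up. It compares a plain work queue (the next chunk goes to whichever core is idle first) against a policy that hands each chunk to the core that would finish it soonest.

```python
import heapq

# Hypothetical time each core needs per render chunk:
# two "big" cores at 1.0 time units, two "small" cores at 2.5.
CHUNK_TIME = [1.0, 1.0, 2.5, 2.5]


def first_free(chunks, chunk_time):
    """Plain work queue: each chunk goes to whichever core becomes idle first."""
    heap = [(0.0, core) for core in range(len(chunk_time))]
    heapq.heapify(heap)
    finish = 0.0
    for _ in range(chunks):
        free_at, core = heapq.heappop(heap)
        done = free_at + chunk_time[core]
        finish = max(finish, done)
        heapq.heappush(heap, (done, core))
    return finish


def earliest_finish(chunks, chunk_time):
    """Each chunk goes to the core that would complete it soonest, even if busy."""
    free_at = [0.0] * len(chunk_time)
    for _ in range(chunks):
        core = min(range(len(chunk_time)),
                   key=lambda c: free_at[c] + chunk_time[c])
        free_at[core] += chunk_time[core]
    return max(free_at)


for n in (9, 900):
    a = first_free(n, CHUNK_TIME)
    b = earliest_finish(n, CHUNK_TIME)
    print(f"{n:4d} chunks: first-free {a:.1f}, earliest-finish {b:.1f}")
```

With only 9 chunks the plain queue finishes at 5.0 because the last chunk lands on a small core, while the tail-aware policy finishes at 4.0. With 900 chunks the difference shrinks to a small fraction of the run, which fits both views in this exchange: placement barely matters while every core is saturated, and it can matter for the last few chunks.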