Question Alder Lake - Official Thread

Hulk · Oct 28, 2021

With the release of Alder Lake less than a week away and the "Lakes" thread having turned into a nightmare to navigate I thought it might be a good time to start a discussion thread solely for Alder Lake.

TheELF · Oct 31, 2021

DrMrLordX said:
It affects performance in any kind of branchy code where the predictors can't maintain high hit rates. If the Gracemont cores have to snag something from L3 outside of their cluster or from main memory, that latency is going to sting you.

Are we still talking about games?!
Because games pre render and pre do everything that is possible to do so, the only real-time thread is the user input and everything else is being slowed down or paused to match the speed of that one thread.
Unless it gets screwed up somehow and everything happens whenever, like parts of graphics showing up later or not at all (assassins creed horror face bug) , or you get heavy stutters because the pausing is way too long.

DrMrLordX · Oct 31, 2021

TheELF said:
Are we still talking about games?!

Yes. Absolutely yes.

Because games pre render and pre do everything

lol

no

SKORPI0 · Oct 31, 2021

Not sure if I'll upgrade to the 12th Gen Intel. Price of DDR5 of $375 for only 32GB is ridiculous (understandably since they are new) , I'll wait when it goes a lot lower.
VENGEANCE® 64GB (2x32GB) DDR5 DRAM 4400MHz C36 Memory is currently priced at $615.

VENGEANCE® 64GB (2x32GB) DDR5 DRAM 4400MHz C36 Memory Kit — Black

CORSAIR VENGEANCE DDR5, optimized for Intel® motherboards, delivers higher frequencies and greater capacities of DDR5 technology in a high-quality, compact module that suits your system.

www.corsair.com

Paid $275 for Corsair Vengeance RGB Pro 64GB DDR4-3600 early Sept. last year, added another 64GB a few weeks later for about the same price. It's about $470 now at Amazon.
I've barely used my 11th Gen (4 month old) system for video encoding/remuxing. I'm pissed that GeForce RTX 30 series/AMD RX 6000 series GPU are hard to get at MSRP, I'm not paying for markups at eBay, etc,
Hopefully the RTX 40 series will be more available late 2022/early 2023. .

maddie · Oct 31, 2021

TheELF said:
Are we still talking about games?!
Because games pre render and pre do everything that is possible to do so, the only real-time thread is the user input and everything else is being slowed down or paused to match the speed of that one thread.
Unless it gets screwed up somehow and everything happens whenever, like parts of graphics showing up later or not at all (assassins creed horror face bug) , or you get heavy stutters because the pausing is way too long.

WHAAAT?

dmens · Oct 31, 2021

TheELF said:
Because games pre render and pre do everything that is possible to do so, the only real-time thread is the user input and everything else is being slowed down or paused to match the speed of that one thread.

So games are actually user bound? Guess we need slower GPUs.

TheELF · Nov 1, 2021

dmens said:
So games are actually user bound? Guess we need slower GPUs.

Games are bound to how fast the main loop can run and that main loop starts with "see what user does" (poll user input)

NostaSeronx · Nov 1, 2021

There hasn't been much benchmarks for Goldmont Plus(atom-class Branch Predictor) vs Tremont(core-class Branch Predictor) or any benchmarks checking out mispredictions of late.

Agner Fog branch bottleneck text (condensed):
Silvermont - The prediction rate is fair.
Goldmont - The prediction rate is fair.
Nehalem - The branch prediction algorithm is good, especially for loops.
Sandy/Ivy - Worse than Nehalem for some cases.
Haswell/Broadwell - The size of the branch target buffer and the construction of the branch predictor is unknown, but at least the prediction rate seems good.
Skylakes - Same as above.
Icelake/Tigerlake - The branch prediction mechanism is quite complicated and allegedly improved over previous models. Same as above.

RWT VTune and CodeAnalyst of core2 vs k8 has Core2 with ~95% prediction hits in Far Cry, Fear, Prey, X3. "Both the Core 2 and K8 tend to encounter a branch every 5-7 instructions. Since each processor has an instruction window of ~100 instructions, this means that on average, there will be as many as 20 predicted branches in-flight at once."

There is also the case of fast recovery which was introduced with Nehalem, so no flushed pipeline is needed.

Cost for GoldenCove - 6-way decode front-end power consumption + recovery from op-cache(2-cycle recovery???)
Cost for Gracemont - 3-way or 3-way decode front-end power consumption + recovery from L1i(6-cycle recovery? or 5-cycle recovery if OD-ILD Bypassed???)

A particularly bad misprediction chain should waste more power on GoldenCove rather than Gracemont.
Using a rather old Intel paper on micro-op caches and me potentially misinterpreting percentages and what not.... misprediction recovery from op-cache should be ~2.6992 watts(40%? of 28% of processor avg power) for GoldenCove, and ~1.687 watts(28% of processor avg power) for L1i recovery for Gracemont.

So, while it hurts on performance-side w/o a sibling thread to take over, it hurts less on the power-side. Less power wasted is more ideal in Alder Lake N lineup.

On another note, I hope we get an Alder Lake big N with 24e cores(24 threads in p+e, 24 threads in just e). That way we can really compare hybrid architecture vs pure low-power w/ extreme width DVFS/AVFS architecture(AMD-esque). [[8+8 / 0+24 / 6+0 being the big boy dies]]

Harry_Wild · Nov 1, 2021

Just wait till Intel moves on from 10nm to 7nm, it going be rocking again. Intel swallowed it pride and let TSMC do the manufacturing now, just like Apple. Intel had factories in Mexico and gave up on most of the factories in the U.S. around 10 years ago and lost it edge! Maybe Intel will start back up manufacturing CPU again in the U.S. again now!

Markfw · Nov 1, 2021

Harry_Wild said:
Just wait till Intel moves on from 10nm to 7nm, it going be rocking again. Intel swallowed it pride and let TSMC do the manufacturing now, just like Apple. Intel had factories in Mexico and gave up on most of the factories in the U.S. around 10 years ago and lost it edge! Maybe Intel will start back up manufacturing CPU again in the U.S. again now!

Its been producing them in Hillsboro, Oregon for quite some time now. YEARS.

In fact 6 plants, the most recent starting in 2013

List of Intel manufacturing sites - Wikipedia

en.wikipedia.org

UsandThem · Nov 1, 2021

Markfw said:
Its been producing them in Hillsboro, Oregon for quite some time now. YEARS.

In fact 6 plants, the most recent starting in 2013

List of Intel manufacturing sites - Wikipedia

en.wikipedia.org

Don't take their bait, Mark.

After seeing this response and looking at their past posts, it's obvious why they came back here after such a long break.

https://forums.anandtech.com/thread...ith-an-intel-i5-12600f.2598502/#post-40621804

DrMrLordX · Nov 1, 2021

TheELF said:
Games are bound to how fast the main loop can run and that main loop starts with "see what user does" (poll user input)

Congratulations

Now you understand why games respond so well to fast memory and large caches. And why they are hurt so much by intercore latency and/or memory/cache latency.

phillyman36 · Nov 1, 2021

Waiting for cpu and mobo. I got the ram from Corsair. Newegg sent me 2 emails saying my 12900k preorder is delayed but didnt say how long. May have to hit Microcenter Thursday morning.

Bouowmx · Nov 1, 2021

It's not a true DDR5 overclock investigation but testing a 16 GB module (I think) from stock 4800 MT/s, CL40
Looks a bit lacking, even with 1.5 VDDQ

jpiniero · Nov 1, 2021

Bouowmx said:
It's not a true DDR5 overclock investigation but testing a 16 GB module (I think) from stock 4800 MT/s, CL40
Looks a bit lacking, even with 1.5 VDDQ

View attachment 52204

I wouldn't read too much into a power consumption graph comparing performance.

Zucker2k · Nov 2, 2021

G-Skill announces DDR5 7000MHz CL40 RAM

G.SKILL shows off its DDR5-7000 CL40 overclocked memory - VideoCardz.com

G.SKILL Showcases DDR5-7000 CL40 Extreme Speed Memory (2 November 2021) – G.SKILL International Enterprise Co., Ltd., the world’s leading manufacturer of extreme performance memory and gaming peripherals, is thrilled to announce the achievement of DDR5-7000 CL40-40-40-76 32GB (2x16GB) extreme...

videocardz.com

Intel's XMP 3.0 goes up to DDR5 6666MHz CL40

Bouowmx · Nov 2, 2021

From the official Intel XMP 3.0 DDR5 list: https://www.intel.com/content/www/us/en/gaming/xmp-3-for-core-processors.html
Might be a Micron vs Samsung thing? Some need 1.25 V for 5200 MT/s, but G.SKILL shows off 1.1 V.

Zucker2k · Nov 2, 2021

i7 12700kf overshadows the R9 5900x in CB R23...

Cinebench R23 (Multi-Core) CPU benchmark list

www.cpu-monkey.com

Partial comparison on CPU-Monkey, i5 12600k vs R7 5800x...

AMD Ryzen 7 5800X vs Intel Core i5-12600K Benchmark, comparison and differences

Benchmark, test, review, comparison and differences between these CPUs in Cinebench 23 and Geekbench 5

www.cpu-monkey.com

lightmanek · Nov 2, 2021

Zucker2k said:
i7 12700kf overshadows the R9 5900x in CB R23...

Cinebench R23 (Multi-Core) CPU benchmark list

Cinebench R23 (Multi-Core) CPU benchmark list

www.cpu-monkey.com

Partial comparison on CPU-Monkey, i5 12600k vs R7 5800x...

AMD Ryzen 7 5800X vs Intel Core i5-12600K Benchmark, comparison and differences

Benchmark, test, review, comparison and differences between these CPUs in Cinebench 23 and Geekbench 5

www.cpu-monkey.com

Good, also beats i9-7960X core for core.

Asterox · Nov 2, 2021

Very nice comparison, Alder Lake Cinebench R20 performanse per watt........................

https://twitter.com/x/status/1455550767062913030

TheELF · Nov 2, 2021

Asterox said:
Very nice comparison, Alder Lake Cinebench R20 performanse per watt........................

https://twitter.com/x/status/1455550767062913030

Damn, so compared to rocket AL is ~50% faster at 125W and ~70% faster at 241W compared to 251W, and mostly at the same price, for the CPU alone anyway.
That's all they need to sell them, actually it's way way more than they would need to sell them.

But these numbers are collected from various sources so not really 100% for certain.

DrMrLordX · Nov 2, 2021

@phillyman36

Looking good so far. Let us know how it goes!

Ajay · Nov 2, 2021

TheELF said:
Games are bound to how fast the main loop can run and that main loop starts with "see what user does" (poll user input)

Games don't poll anything anymore. User inputs are handled by the OS. The gaming application subscribes via a particular API to receive these inputs (probably a callback, but there are other ways). The gaming applications receives a notification of ready input asynchronously (on another thread). The game code decides what needs to be done and then sends a set of instructions to game's rending engine to update itself to show the user any necessary changes. Roughly.

Hitman928 · Nov 2, 2021

TheELF said:
Damn, so compared to rocket AL is ~50% faster at 125W and ~70% faster at 241W compared to 251W, and mostly at the same price, for the CPU alone anyway.
That's all they need to sell them, actually it's way way more than they would need to sell them.

But these numbers are collected from various sources so not really 100% for certain.

It's 70% faster at PL2=PL1=241W for ADL versus RKL which is at PL2=251W and PL1=125W. So the ADL is 70% faster in this comparison but also using more average power across the run (unless tau was set to unlimited or long enough to complete the bench on RKL but then why even list the PL1 on RKL?). According to Intel, ADL is 50% faster than RKL when both are at ~240W so it seems it's ~50% faster at equal power. Maybe low power situations are different though.

TheELF · Nov 2, 2021

DrMrLordX said:
Congratulations

Now you understand why games respond so well to fast memory and large caches. And why they are hurt so much by intercore latency and/or memory/cache latency.

The only data point we would have for that would be canned benchmarks because that's all that reviewers give us and those don't use any user input.
Canned benches can fill up all of their cache with upcoming stuff that the game wouldn't know about or would have to guess on if you, the user, actually move around in it randomly.

Ajay said:
Games don't poll anything anymore. User inputs are handled by the OS. The gaming application subscribes via a particular API to receive these inputs (probably a callback, but there are other ways). The gaming applications receives a notification of ready input asynchronously (on another thread). The game code decides what needs to be done and then sends a set of instructions to game's rending engine to update itself to show the user any necessary changes. Roughly.

I would look up what the word poll means for devs if I would care enough but I doubt that it's exclusively used for assembly level hardware poking (pulling/pushing, whatever it would be) .
The whole point was that the game checks for user input, indifferent from how this is accomplished.

DrMrLordX · Nov 2, 2021

TheELF said:
The only data point we would have for that would be canned benchmarks because that's all that reviewers give us and those don't use any user input.

What are you talking about? Memory latency, cache latency, and intercore latency (which affects the first two) have affected game benchmark performance for years, even in benchmarks that are not "canned".

Question Alder Lake - Official Thread

Diamond Member

Diamond Member

Lifer

Lifer

Diamond Member

Platinum Member

Diamond Member

Diamond Member

Senior member

Moderator Emeritus, Elite Member

Elite Member

Lifer

Golden Member

Golden Member

Lifer

Golden Member

Golden Member

Golden Member

Senior member

Golden Member

Diamond Member

Lifer

Lifer

Diamond Member

Diamond Member

Lifer