Discussion Intel current and future Lakes & Rapids thread

LightningZ71 · Oct 15, 2021

So, are we now demanding that reviewers only use JEDEC-spec DDR4-3200 RAM again? That's what OEM machines will come locked to, after all. It is the "official" limit by both vendors.

Zucker2k · Oct 15, 2021

LightningZ71 said:
So, are we now demanding that reviewers only use JEDEC-spec DDR4-3200 RAM again? That's what OEM machines will come locked to, after all. It is the "official" limit by both vendors.

That's what Anandtech is there for. I just want to see consistency all round.

coercitiv · Oct 15, 2021

uzzi38 said:
The question I have is how does Alder Lake react considering the fancy new scheduler and all. Does the scheduler actually assign TS to just the 8 big cores only or is TS running on all 24 threads?

Looks like you're trying to run the Time Spy benchmark. Would you like Thread Dictator™ to choose the optimal CPU configuration for this benchmark?
Yes / Only this time / Always choose optimal settings

(Note: during the benchmark a number of cores will be disabled. A system reboot will be required under Windows 10.)

lobz · Oct 15, 2021

Zucker2k said:
(...) I just want to see consistency all round.

Then you should also use a blender for making scrambled eggs! 🤣

Zucker2k · Oct 15, 2021

lobz said:
Then you should also use a blender for making scrambled eggs! 🤣

True story!

JoeRambo · Oct 15, 2021

coercitiv said:
Looks like you're trying to run the Time Spy benchmark. Would you like Thread Dictator™ to choose the optimal CPU configuration for this benchmark?

You know what? This is sadly not a joke. In our enterprise I know some workstations are using "Process Lasso" to set affinity masks on some old MS Access based legacy software. It won't work unless an affinity mask is assigned automatically to limit it to 2-3 threads at most.

Zucker2k · Oct 15, 2021

JoeRambo said:
You know what? This is sadly not a joke. In our enterprise I know some workstations are using "Process Lasso" to set affinity masks on some old MS Access based legacy software. It won't work unless an affinity mask is assigned automatically to limit it to 2-3 threads at most.

Unless core affinity is no longer a feature in W11+ADL, I'm sure some type of pass-through or direct affinity assignment is implemented. Unless MS and Intel don't care any more.

jpiniero · Oct 15, 2021

Game Dev Guide for 12th Gen Intel Core Processor Hybrid Architecture

The 12th Gen Intel© Core™ processor is a new performance hybrid architecture that combines two core types.

www.intel.com

Developer guide for Alder Lake aimed at gaming. Despite the publication date the writing appears to be pretty dated. The one thing of note is further confirmation that Intel was going to let AVX-512 work in Alder Lake if you turned off the Atom cores.

Asterox · Oct 15, 2021

jpiniero said:
Game Dev Guide for 12th Gen Intel Core Processor Hybrid Architecture

The 12th Gen Intel© Core™ processor is a new performance hybrid architecture that combines two core types.

www.intel.com

Developer guide for Alder Lake aimed at gaming. Despite the publication date the writing appears to be pretty dated. The one thing of note is further confirmation that Intel was going to let AVX-512 work in Alder Lake if you turned off the Atom cores.

Hm, this is interesting, but not in a positive way.

https://twitter.com/x/status/1449040911044915201

Mopetar · Oct 15, 2021

Hulk said:
I have been out of gaming for quite a few years I admit. But when I was gaming it always seemed to be the case that the GPU was the bottleneck unless you had a really old CPU. From what I'm reading here that's not the case today right? Meaning something like a 9900K wouldn't be a good gaming CPU?

Lots of talk about CPU gaming prowess here. Just wondering how important this actually is for gamers as opposed to Intel/AMD marketing?

Depends on what you're doing. High FPS 1080p needs a faster CPU. At 4K even an i3 might be enough since the bottleneck is still the GPU.

9900K is good enough unless you're running a 3090 and gaming in 1080p.

Zucker2k · Oct 15, 2021

@JoeRambo should love this.

In general, the guidance is to avoid hard affinities, as these are a contract between the application and the OS. Using hard affinities prevents potential platform optimizations, and forces the OS to ignore any advice it receives from the Intel Thread Director (ITD). Hard affinities can be prone to unforeseen issues, and you should check to see if middleware is using hard thread affinities, as they can directly impact the application’s access to the underlying hardware. The issues with hard affinities are particularly relevant on systems with more Efficient cores than Performance cores, such as low-power devices, as hard affinities limit the OS’s ability to schedule optimally.

Determining the right affinity level for your application is critical to meet power and performance requirements. If you prefer letting the operating system and ITD do most of the heavy lifting for your thread scheduling, you may prefer a “weak” affinity strategy. You can utilize a “stronger” affinity strategy for maximum control over thread scheduling, but use caution as the application will end up running on a wide range of hardware with differing characteristics. Careful design of the underlying threading algorithms to allow for dynamic load balancing across threads and varying hardware performance characteristics is preferable to trying to control the OS behavior from within the application.

When choosing your affinity strategy, it is important to consider the frequency of thread context switching and cache flushing that may be incurred by changing a thread’s affinity at runtime. Many of the strong affinity API calls, such as SetThreadAffinityMask, may immediately be context switched if the thread is not currently residing on a processor specified in the affinity mask. Weaker affinity functions, such as SetThreadPriority, may not immediately force a context switch, but offer fewer guarantees about which clusters or processors your threads are executing on. Whatever strategy you choose, we recommend setting up your thread affinities at startup, or at thread initialization. Avoid setting thread affinities multiple times per frame, and keep context switching during a frame as infrequent as possible.
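The "set it once at thread initialization" advice above can be sketched roughly as follows. This is a minimal illustration, not anything from the Intel guide: `run_pool` and the `apply_affinity` hook are hypothetical names, and the hook stands in for whatever affinity call the application actually uses. The point is only that the hook runs once per worker thread, however many tasks that worker later processes.

```python
import queue
import threading


def run_pool(num_workers, tasks, apply_affinity):
    """Run callables from `tasks` on `num_workers` threads.

    `apply_affinity(worker_index)` is a hypothetical hook. It is called
    exactly once per worker, at thread start, and never per task.
    """
    work = queue.Queue()
    for task in tasks:
        work.put(task)

    results = []
    results_lock = threading.Lock()

    def worker(index):
        # Affinity is configured here, once, before any work is picked up.
        apply_affinity(index)
        while True:
            try:
                task = work.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            value = task()
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With four workers and a hundred tasks, the hook fires four times in total, which is the pattern the guide recommends over touching affinity several times per frame.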

Intel has thread affinity covered everywhere

SetThreadAffinityMask
SetThreadAffinityMask() is in the “strong” affinity class of Windows API functions. It takes a 64-bit mask to control which of up to 64 logical processors a given thread can execute on. You can use a GroupMask to exceed 64 logical processors. However, other options, such as CPU Sets, would be preferable. SetThreadAffinityMask is essentially a contract with the operating system, and will guarantee that your threads execute only on the logical processors supplied in the bitmask. By reducing the number of logical processors that your thread can execute on, you may be reducing the overall processor time of the thread. Be cautious when using strong affinities with SetThreadAffinityMask, unless you want full control over thread scheduling. Improper use of SetThreadAffinityMask may result in poor performance since the operating system will be more constrained in how it schedules threads.
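To make the bitmask idea concrete, here is a small sketch of how such a mask could be built and applied. The helper names (`mask_from_cpus`, `cpus_from_mask`, `set_current_thread_affinity`) are made up for illustration. The Windows call goes through `ctypes`, assumes the documented `SetThreadAffinityMask(HANDLE, DWORD_PTR)` signature, and is skipped on other platforms.

```python
import ctypes
import sys

MAX_CPUS_PER_MASK = 64  # a single mask addresses logical processors 0..63


def mask_from_cpus(cpus):
    """Return an integer with one bit set per logical processor index."""
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < MAX_CPUS_PER_MASK:
            raise ValueError("logical processor index must be in 0..63")
        mask |= 1 << cpu
    return mask


def cpus_from_mask(mask):
    """Return the sorted logical processor indices whose bits are set."""
    return [i for i in range(MAX_CPUS_PER_MASK) if (mask >> i) & 1]


def set_current_thread_affinity(mask):
    """Apply a hard affinity mask to the calling thread.

    Returns the previous mask on Windows, or None on other platforms,
    where this sketch deliberately does nothing.
    """
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentThread.restype = ctypes.c_void_p
    kernel32.SetThreadAffinityMask.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    kernel32.SetThreadAffinityMask.restype = ctypes.c_size_t
    previous = kernel32.SetThreadAffinityMask(kernel32.GetCurrentThread(), mask)
    if previous == 0:  # the API returns 0 on failure
        raise ctypes.WinError(ctypes.get_last_error())
    return previous
```

For the legacy-software case mentioned earlier in the thread, restricting a thread to the first three logical processors would be `mask_from_cpus([0, 1, 2])`, i.e. `0b111`.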

You can use SetThreadAffinityMask to segment your threading system into logical processor clusters. Using SetThreadAffinityMask you may choose to create a thread scheduler with two logical thread-pools that consist of a logical P-core cluster, and a logical E-core cluster. The operating system will still schedule within the specified cluster mask, but will not benefit from ITD power or performance optimizations. It can also be used to pin a thread to a single processor. Pinning is not ideal, however, and should only be used in extremely rare cases, such as executing an atomic operation or reading the CPUID intrinsic for a logical processor. Avoid setting affinity masks at runtime since they can force an immediate context switch. If possible, set your affinity mask only during initialization time. If you do need to swap affinity masks at runtime, try to do it as infrequently as possible to reduce context switching.
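A rough sketch of the two-cluster split described above. Everything about the topology is an assumption made for illustration: the `LogicalCpu` type is hypothetical, and the 8P+8E layout with P-core hyperthreads at indices 0-15 and E-cores at 16-23 is simply assumed. A real application would query the OS for core types (on Windows, for example, via `GetLogicalProcessorInformationEx`) rather than hard-coding indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalCpu:
    index: int            # logical processor number
    is_performance: bool  # True for a P-core thread, False for an E-core


def hypothetical_8p_8e_topology():
    # ASSUMPTION: 24 logical processors, P-core threads first (0-15),
    # E-cores after them (16-23). Real enumeration order may differ.
    return [LogicalCpu(i, i < 16) for i in range(24)]


def cluster_masks(topology):
    """Return (p_mask, e_mask) affinity bitmasks for the two clusters."""
    p_mask = 0
    e_mask = 0
    for cpu in topology:
        if cpu.is_performance:
            p_mask |= 1 << cpu.index
        else:
            e_mask |= 1 << cpu.index
    return p_mask, e_mask


def pool_mask_for(task_kind, p_mask, e_mask):
    """Pick the cluster mask for a thread pool.

    Hypothetical policy: "critical" work goes to the P-core cluster,
    everything else to the E-core cluster.
    """
    return p_mask if task_kind == "critical" else e_mask
```

Under the assumed layout this yields `0xFFFF` for the P-core pool and `0xFF0000` for the E-core pool. As the guide notes, the OS still schedules freely inside each mask, but the split bypasses ITD's own placement decisions.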

Logical processor clusters do not need to be limited to a single type of processor. You may choose to create several affinity masks for different purposes. These may include cache-mapped masks, system/process masks, cluster masks, and novel masks for specialized use-cases. The figure below shows a typical bit mask-map with affinity masks.
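A small sketch of what a set of named masks might look like, including one that mixes core types. The mask names and the `build_masks` helper are hypothetical. The one firm constraint relied on here is that, on Windows, a thread affinity mask must be a subset of the process affinity mask, so each mask is clipped to it and masks that end up empty are dropped.

```python
def build_masks(p_mask, e_mask, process_mask):
    """Return named affinity masks, each clipped to the process mask.

    The names and the choice of masks are illustrative only.
    """
    lowest_p = p_mask & -p_mask  # isolates the lowest set bit of p_mask
    candidates = {
        "p_cluster": p_mask,
        "e_cluster": e_mask,
        "all": p_mask | e_mask,
        # Mixed mask: every E-core plus a single P-core logical processor.
        "background_plus_one_p": e_mask | lowest_p,
    }
    clipped = {}
    for name, mask in candidates.items():
        usable = mask & process_mask  # keep only bits the process may use
        if usable:                    # an empty mask is not valid, skip it
            clipped[name] = usable
    return clipped
```

For example, with `p_mask=0xFFFF`, `e_mask=0xFF0000` and an unrestricted 24-bit process mask, the mixed entry comes out as `0xFF0001`.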

On handling of current apps scheduling

No Optimizations
Based on ITD feedback, the OS Scheduler intelligently schedules threads, and workload is distributed dynamically. This removes overhead on the developer side to handle scheduling tasks in software. If no optimization is done for the application, ITD will try to distribute workloads based on its algorithm. This distribution typically delivers increased performance, but in some cases, it is possible that some non-critical tasks may get assigned to Performance cores, and some critical-path tasks may get assigned to Efficient cores. That is especially possible if the application uses multiple middleware components with their own threading created by developers who are not aware of possible conflicts. Developers should review the application-threading algorithm, and choose one of the following scenarios.

Panino Manino · Oct 16, 2021

Intel Marketing is still a bit shaken by all the competition.

https://twitter.com/x/status/1448672444936564736

Ajay · Oct 16, 2021

Panino Manino said:
Intel Marketing is still a bit shaken by all the competition.

https://twitter.com/x/status/1448672444936564736

Hey Intel, u mad?

jpiniero · Oct 16, 2021

https://twitter.com/x/status/1449292869353705474

Might be Gracemont-only SKUs coming.

LightningZ71 · Oct 16, 2021

Die recovery for the 2+8 U dice?

jpiniero · Oct 16, 2021

LightningZ71 said:
Die recovery for the 2+8 U dice?

Could even be the 6+8 die although it would have to use the P packing instead.

LightningZ71 · Oct 16, 2021

Ouch, talk about limping out the door...

jpiniero · Oct 16, 2021

LightningZ71 said:
Ouch, talk about limping out the door...

Earlier rumors had talked about 1 big and 4 small for Pentiums/Celerons. Possible these are embedded models.

The mention of "UHD discrete graphics" makes me wonder if Intel might actually sell IGP busted M/P models, perhaps bundling garbage bin DG2.

DrMrLordX · Oct 16, 2021

jpiniero said:
Might be Gracemont-only SKUs coming.

8c Gracemont would be perfect for cheap AiOs and similar. Should be interesting to see what an SKU like that can do with 10W or less.

Exist50 · Oct 16, 2021

N has historically been its own die. Should be the same for whatever this is.

semiman · Oct 16, 2021

jpiniero said:
https://twitter.com/x/status/1449292869353705474

Might be Gracemont-only SKUs coming.

Intel releases Atom-only SKUs for both desktops and NUCs (and network devices). For example, N6005 uses 10nm Tremont cores. I'm pretty sure that Gracemont-only SKUs will be quite competitive in the market. IPC's quite high (between Skylake and Rocket Lake) and boost clock's high. Maybe it can be used for a regular desktop?

eek2121 · Oct 16, 2021

DrMrLordX said:
8c Gracemont would be perfect for cheap AiOs and similar. Should be interesting to see what an SKU like that can do with 10W or less.

Competition is grand, isn't it? You are talking about something that is maybe 20% slower than a 10700K, but far more power efficient. One can wish... Honestly, if Intel could give me a chip with 16-24 Gracemont cores in an (efficient) mobile form factor, they might make me stop buying AMD...

Bouowmx · Oct 16, 2021

Celeron/Pentium N-series is a separate die, but might it make sense as a replacement for the 2-core Celeron/Pentium G-series? Especially if it's 8 cores.

eek2121 · Oct 16, 2021

diediealldie said:
Intel releases Atom-only SKUs for both desktops and NUCs (and network devices). For example, N6005 uses 10nm Tremont cores. I'm pretty sure that Gracemont-only SKUs will be quite competitive in the market. IPC's quite high (between Skylake and Rocket Lake) and boost clock's high. Maybe it can be used for a regular desktop?

I mean, that's the end game right? 8 Gracemont cores will beat a quad core Skylake. ST will be relatively close, MT will be close to double. Oh and they will actually be able to do that within a reasonable power budget.

LightningZ71 · Oct 16, 2021

I think that it would be technically interesting for them to release a 32-core Gracemont product on the same die footprint as the 6+8 die.
