Discussion Intel current and future Lakes & Rapids thread

Page 549

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
Why?

Being designed for V-cache from the outset, what if there is no integral L3 cache and it is all in the "V" chiplet? With different libraries used, we might see a reduction in total silicon area that offsets the increased packaging costs.

Raphael will probably use Genoa dice.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
So, are we now demanding that reviewers only use JEDEC-spec DDR4-3200 RAM again? That's what OEM machines will come locked to, after all. It is the "official" limit for both vendors.
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,685
136
The question I have is how Alder Lake reacts, considering the fancy new scheduler and all. Does the scheduler actually assign TS to just the 8 big cores, or is TS running on all 24 threads?
Looks like you're trying to run the Time Spy benchmark. Would you like Thread Dictator™ to choose the optimal CPU configuration for this benchmark?
Yes / Only this time / Always choose optimal settings

(Note: during the benchmark a number of cores will be disabled. A system reboot will be required under Windows 10.)
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Looks like you're trying to run the Time Spy benchmark. Would you like Thread Dictator™ to choose the optimal CPU configuration for this benchmark?

You know what? This is sadly not a joke. In our enterprise I know some workstations are using "Process Lasso" to set affinity masks on some old MS Access-based legacy software. It won't work unless an affinity mask is assigned automatically to limit it to 2-3 threads at most.
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
You know what? This is sadly not a joke. In our enterprise I know some workstations are using "Process Lasso" to set affinity masks on some old MS Access-based legacy software. It won't work unless an affinity mask is assigned automatically to limit it to 2-3 threads at most.
Unless core affinity is no longer a feature in W11+ADL, I'm sure some type of pass-through or direct affinity assignment is implemented. Unless MS and Intel don't care any more.
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136

Developer guide for Alder Lake aimed at gaming. Despite the publication date the writing appears to be pretty dated. The one thing of note is further confirmation that Intel was going to let AVX-512 work in Alder Lake if you turned off the Atom cores.
 

Asterox

Golden Member
May 15, 2012
1,026
1,775
136

Developer guide for Alder Lake aimed at gaming. Despite the publication date the writing appears to be pretty dated. The one thing of note is further confirmation that Intel was going to let AVX-512 work in Alder Lake if you turned off the Atom cores.

Hm, this is interesting, but not in a positive way.

 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
I have been out of gaming for quite a few years I admit. But when I was gaming it always seemed to be the case that the GPU was the bottleneck unless you had a really old CPU. From what I'm reading here that's not the case today right? Meaning something like a 9900K wouldn't be a good gaming CPU?

Lots of talk about CPU gaming prowess here. Just wondering how important this actually is for gamers as opposed to Intel/AMD marketing?

Depends on what you're doing. High FPS 1080p needs a faster CPU. At 4K even an i3 might be enough since the bottleneck is still the GPU.

9900K is good enough unless you're running a 3090 and gaming in 1080p.
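The resolution argument can be put as rough arithmetic. This is a minimal sketch assuming CPU and GPU work overlap, so the slower of the two sets the frame rate. The function names and the per-frame millisecond costs are hypothetical and chosen only for illustration; they are not measurements of any real CPU or GPU.

```python
def estimated_fps(cpu_ms, gpu_ms):
    """Frame rate when the slower stage (CPU or GPU) sets the pace."""
    return 1000.0 / max(cpu_ms, gpu_ms)


def bottleneck(cpu_ms, gpu_ms):
    """Name the stage that limits the frame rate."""
    return "CPU" if cpu_ms > gpu_ms else "GPU"


# Hypothetical per-frame costs in milliseconds (illustration only).
cpu_ms = 5.0         # CPU cost is assumed not to change with resolution
gpu_1080p_ms = 4.0   # GPU finishes quickly at 1080p
gpu_4k_ms = 16.0     # GPU takes far longer at 4K

print(bottleneck(cpu_ms, gpu_1080p_ms), estimated_fps(cpu_ms, gpu_1080p_ms))
print(bottleneck(cpu_ms, gpu_4k_ms), estimated_fps(cpu_ms, gpu_4k_ms))
```

With these made-up numbers a faster CPU would raise the 1080p result, while at 4K it would change nothing because the GPU term dominates.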
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
@JoeRambo should love this.

In general, the guidance is to avoid hard affinities, as these are a contract between the application and the OS. Using hard affinities prevents potential platform optimizations, and forces the OS to ignore any advice it receives from the Intel Thread Director (ITD). Hard affinities can be prone to unforeseen issues, and you should check to see if middleware is using hard thread affinities, as they can directly impact the application’s access to the underlying hardware. The issues with hard affinities are particularly relevant on systems with more Efficient cores than Performance cores, such as low-power devices, as hard affinities limit the OS’s ability to schedule optimally.

Determining the right affinity level for your application is critical to meet power and performance requirements. If you prefer letting the operating system and ITD do most of the heavy lifting for your thread scheduling, you may prefer a “weak” affinity strategy. You can utilize a “stronger” affinity strategy for maximum control over thread scheduling, but use caution, as the application will end up running on a wide range of hardware with differing characteristics. Careful design of the underlying threading algorithms to allow for dynamic load balancing across threads and varying hardware performance characteristics is preferable to trying to control the OS behavior from within the application.

When choosing your affinity strategy, it is important to consider the frequency of thread context switching and cache flushing that may be incurred by changing a thread’s affinity at runtime. Many of the strong affinity API calls, such as SetThreadAffinityMask, may immediately be context switched if the thread is not currently residing on a processor specified in the affinity mask. Weaker affinity functions, such as SetThreadPriority, may not immediately force a context switch, but offer fewer guarantees about which clusters or processors your threads are executing on. Whatever strategy you choose, we recommend setting up your thread affinities at startup, or at thread initialization. Avoid setting thread affinities multiple times per frame, and keep context switching during a frame as infrequent as possible.
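The "set it once at thread initialization" advice could look roughly like the sketch below. It is written in Python purely for illustration; `FrameWorker` and the `set_affinity` callback are hypothetical names, and the callback stands in for whatever affinity call the application actually uses.

```python
import threading


class FrameWorker(threading.Thread):
    """Worker that configures its affinity once, then runs its frame loop."""

    def __init__(self, set_affinity, frames):
        super().__init__()
        self._set_affinity = set_affinity  # hypothetical hook for the real affinity call
        self._frames = frames
        self.frames_run = 0

    def run(self):
        # Affinity is applied exactly once, at thread initialization.
        self._set_affinity()
        for _ in range(self._frames):
            # Per-frame work goes here; no affinity changes inside the loop.
            self.frames_run += 1


affinity_calls = []
worker = FrameWorker(lambda: affinity_calls.append("affinity set"), frames=3)
worker.start()
worker.join()
print(len(affinity_calls), worker.frames_run)
```

Keeping the affinity call outside the frame loop is the point of the design: the loop body never triggers the forced context switch the guide warns about.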

Intel has thread affinity covered everywhere

SetThreadAffinityMask
SetThreadAffinityMask() is in the “strong” affinity class of Windows API functions. It takes a 64-bit mask to control which of up to 64 logical processors a given thread can execute on. You can use a GroupMask to exceed 64 logical processors. However, other options, such as CPU Sets, would be preferable. SetThreadAffinityMask is essentially a contract with the operating system, and will guarantee that your threads execute only on the logical processors supplied in the bitmask. By reducing the number of logical processors that your thread can execute on, you may be reducing the overall processor time of the thread. Be cautious when using strong affinities with SetThreadAffinityMask, unless you want full control over thread scheduling. Improper use of SetThreadAffinityMask may result in poor performance since the operating system will be more constrained in how it schedules threads.
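A minimal sketch of the 64-bit mask described above, in Python instead of the C/C++ a game would normally use. `build_affinity_mask` and `pin_current_thread` are hypothetical helper names; `SetThreadAffinityMask` and `GetCurrentThread` are the real Windows API functions, reached here through `ctypes`. The Windows path is an assumption about how one might wrap the call, not code taken from the guide.

```python
import sys


def build_affinity_mask(logical_processors):
    """Return a 64-bit mask with one bit set per logical processor index."""
    mask = 0
    for index in logical_processors:
        if not 0 <= index < 64:
            raise ValueError("a single mask covers logical processors 0..63 only")
        mask |= 1 << index
    return mask


def pin_current_thread(mask):
    """Apply the mask to the calling thread (Windows only)."""
    if sys.platform != "win32":
        raise OSError("SetThreadAffinityMask is a Windows API")
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentThread.restype = wintypes.HANDLE
    kernel32.SetThreadAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
    kernel32.SetThreadAffinityMask.restype = ctypes.c_size_t
    previous = kernel32.SetThreadAffinityMask(kernel32.GetCurrentThread(), mask)
    if previous == 0:  # zero means the call failed
        raise ctypes.WinError(ctypes.get_last_error())
    return previous  # the thread's previous affinity mask


mask = build_affinity_mask([0, 1, 2])
print(bin(mask))
```

The mask builder is portable and can be checked anywhere; only `pin_current_thread` touches the operating system.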

You can use SetThreadAffinityMask to segment your threading system into logical processor clusters. Using SetThreadAffinityMask you may choose to create a thread scheduler with two logical thread-pools that consist of a logical P-core cluster, and a logical E-core cluster. The operating system will still schedule within the specified cluster mask, but will not benefit from ITD power or performance optimizations. It can also be used to pin a thread to a single processor. Pinning is not ideal, however, and should only be used in extremely rare cases, such as executing an atomic operation or reading the CPUID intrinsic for a logical processor. Avoid setting affinity masks at runtime since they can force an immediate context switch. If possible, set your affinity mask only during initialization time. If you do need to swap affinity masks at runtime, try to do it as infrequently as possible to reduce context switching.
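The two-cluster idea can be sketched as follows. The layout is an assumption made for illustration: 8 P-cores with two logical processors each enumerated first (indices 0-15), followed by 8 E-cores (indices 16-23). A real application would query the topology from the OS instead of hard-coding it, and `cluster_masks` is a hypothetical helper name.

```python
def cluster_masks(core_types):
    """Build one affinity mask per core type from a per-processor type list."""
    masks = {}
    for index, kind in enumerate(core_types):
        masks[kind] = masks.get(kind, 0) | (1 << index)
    return masks


# Assumed layout: 16 logical P-core processors, then 8 E-core processors.
layout = ["P"] * 16 + ["E"] * 8
masks = cluster_masks(layout)

print(hex(masks["P"]), hex(masks["E"]))
```

Each mask would then be handed to the threads of the matching pool, once, at pool creation.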

Logical processor clusters do not need to be limited to a single type of processor. You may choose to create several affinity masks for different purposes. These may include cache-mapped masks, system/process masks, cluster masks, and novel masks for specialized use-cases. The figure below shows a typical bit mask-map with affinity masks.

[Figure: typical bit mask-map with affinity masks]
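Because these masks are plain bit fields, they compose with bitwise operations. A small sketch, with hypothetical helper names and made-up mask values, of restricting a cluster mask to what a process mask allows and listing the resulting logical processors:

```python
def restrict(mask, allowed):
    """Keep only the logical processors present in both masks."""
    return mask & allowed


def processors_in(mask):
    """List the logical processor indices whose bits are set in the mask."""
    return [index for index in range(64) if mask & (1 << index)]


# Made-up values for illustration only.
cluster_mask = 0xFF0000   # e.g. a cluster occupying logical processors 16-23
process_mask = 0x0F00FF   # processors the process is allowed to use

effective = restrict(cluster_mask, process_mask)
print(hex(effective), processors_in(effective))
```

An empty result from `restrict` is worth checking for before applying a mask, since the Windows affinity call rejects a mask that selects no permitted processor.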

On how scheduling is handled for current apps

No Optimizations
Based on ITD feedback, the OS Scheduler intelligently schedules threads, and workload is distributed dynamically. This removes overhead on the developer side to handle scheduling tasks in software. If no optimization is done for the application, ITD will try to distribute workloads based on its algorithm. This distribution typically delivers increased performance, but in some cases, it is possible that some non-critical tasks may get assigned to Performance cores, and some critical-path tasks may get assigned to Efficient cores. That is especially possible if the application uses multiple middleware components with their own threading created by developers who are not aware of possible conflicts. Developers should review the application-threading algorithm, and choose one of the following scenarios.
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
Ouch, talk about limping out the door...

Earlier rumors had talked about 1 big and 4 small for Pentiums/Celerons. Possible these are embedded models.

The mention of "UHD discrete graphics" makes me wonder if Intel might actually sell IGP-busted M/P models, perhaps bundling garbage-bin DG2.
 

diediealldie

Member
May 9, 2020
77
68
61

Intel releases Atom-only SKUs for both desktops and NUCs (and network devices). For example, the N6005 uses 10nm Tremont cores. I'm pretty sure that Gracemont-only SKUs will be quite competitive in the market. IPC is quite high (between Skylake and Rocket Lake) and the boost clock is high. Maybe it can be used for a regular desktop?
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
8c Gracemont would be perfect for cheap AiOs and similar. Should be interesting to see what an SKU like that can do with 10W or less.

Competition is grand, isn't it? You are talking about something that is maybe 20% slower than a 10700K, but far more power efficient. One can wish... Honestly, if Intel could give me a chip with 16-24 Gracemont cores in an (efficient) mobile form factor, they might make me stop buying AMD...
 

Bouowmx

Golden Member
Nov 13, 2016
1,138
550
146
Celeron/Pentium N-series is a separate die, but might it make sense for it to replace the 2-core Celeron/Pentium G-series? Especially if it's 8 cores.
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
Intel releases Atom-only SKUs for both desktops and NUCs (and network devices). For example, the N6005 uses 10nm Tremont cores. I'm pretty sure that Gracemont-only SKUs will be quite competitive in the market. IPC is quite high (between Skylake and Rocket Lake) and the boost clock is high. Maybe it can be used for a regular desktop?

I mean, that's the end game, right? 8 Gracemont cores will beat a quad-core Skylake. ST will be relatively close, MT will be close to double. Oh, and they will actually be able to do that within a reasonable power budget.
 