Been a long time

Bradtech519

Senior member
Jul 6, 2010
520
46
91
Hope you all are well. I just finished building a new rig: Core i7-12700KF with a 6650 XT and a Scythe Mugen 5 Rev.C CPU cooler. Will be throwing in some cycles while burning this guy in. These new CPUs run quite a bit warmer than the old FX8350 and Core i7-4770. My last updates.
 

Orange Kid

Elite Member
Oct 9, 1999
4,194
1,907
146
Congrats.
We have a PrimeGrid challenge starting tomorrow. That will warm up the new rig. 🙂
 
  • Like
Reactions: Ken g6

Skillz

Senior member
Feb 14, 2014
632
564
136
Yes, you will need to wait until the start before you can download the work.

However, there's nothing wrong with running a few here and there to make sure your system is stable and fine-tuned for them.

Also, welcome back. Come join us on Discord.
 

StefanR5R

Elite Member
Dec 10, 2016
4,777
6,039
136
Looks like the challenge is for Sierpinski/Riesel Base 5 LLR (SR5) workunits. Does this look like the right one?
It's the right one, but too many at once to run optimally. Each SR5-LLR instance leaves a footprint of about 8 MBytes in the processor caches.

Core i7-12700KF has got 1.25 MB L2$ on each of the 8 P cores, 2.0 MB L2$ on the cluster of E cores, and 25 MB L3$ shared between all P cores and the E cluster. (I don't recall the L3$ policy; my guess is it's non-inclusive.) Now that's a rather complex topology with hard-to-predict performance characteristics. I suppose that cache misses become very frequent if 4 or more SR5-LLR instances are running together, which means that the processor's execution units will sit partially idle, waiting for read/write operations to main memory.
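As a rough illustration of the arithmetic above, here is a minimal Python sketch. It only compares the roughly 8 MB per-instance footprint against the 25 MB shared L3 quoted in this post; it ignores the L2 caches and the unknown L3 policy, so treat it as a back-of-the-envelope estimate and not a performance model. The function names are made up for this example.

```python
# Back-of-the-envelope cache budget, using only the figures quoted in the post.
L3_MB = 25.0        # shared L3 of the i7-12700KF
FOOTPRINT_MB = 8.0  # approximate cache footprint of one SR5-LLR instance


def max_instances_in_l3(l3_mb: float, footprint_mb: float) -> int:
    """Largest number of instances whose combined footprint still fits in L3."""
    return int(l3_mb // footprint_mb)


def l3_overflow_mb(instances: int, l3_mb: float, footprint_mb: float) -> float:
    """How many MB of combined footprint do not fit in L3 (0 if everything fits)."""
    return max(0.0, instances * footprint_mb - l3_mb)


if __name__ == "__main__":
    print("instances that fit:", max_instances_in_l3(L3_MB, FOOTPRINT_MB))
    for n in range(1, 6):
        print(n, "instance(s): overflow", l3_overflow_mb(n, L3_MB, FOOTPRINT_MB), "MB")
```

Under these assumptions three instances fit and the fourth spills over, which matches the "4 or more" threshold mentioned above.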

The solution, short version:
Try "Multi-threading: Max # of threads for each task"=7 in the PrimeGrid web preferences.

Long version:

You can use either an app_config.xml, or, perhaps more convenient but not as in-depth, the PrimeGrid web preferences to control the SR5-LLR workload and behavior. The PrimeGrid preferences webpage has got these two relevant options:
  • You can specify that the program should use more than 1 program thread. After you have changed this and your BOINC client has downloaded a new task, the client will see that this task is going to occupy correspondingly many logical CPUs, and will launch correspondingly fewer such tasks. And of course the application will pick up this setting too and spin up correspondingly many program threads.
  • Less importantly, you can also configure the server-enforced per-host limit of "tasks in progress". (It's the server's perspective of "in progress", i.e. the time from when a task is assigned by the server to the host until the host reports the result back to the server.) This option is somewhat misleadingly called "Max # of simultaneous PrimeGrid tasks" on the web page.
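For the app_config.xml route, a hedged sketch in Python that writes out such a file might look like the following. The element names (`app`, `max_concurrent`, `app_version`, `cmdline`, `avg_ncpus`) are standard BOINC app_config.xml elements. The application name `llrSR5` and the `-t` thread-count option are assumptions here; check the names in your client's client_state.xml and the PrimeGrid documentation before using them. Note that `max_concurrent` is a client-side cap on concurrently running tasks, which is not the same thing as the server-enforced "tasks in progress" limit described above.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, threads: int, max_concurrent: int) -> str:
    """Return the text of a minimal app_config.xml for one multithreaded app."""
    root = ET.Element("app_config")

    # Client-side limit on how many tasks of this app run at the same time.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # Tell the client how many CPUs each task occupies, and pass the
    # thread count to the application (the "-t" option is an assumption).
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "cmdline").text = f"-t {threads}"
    ET.SubElement(version, "avg_ncpus").text = str(threads)

    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "llrSR5" is an assumed application name, used for illustration only.
    print(build_app_config("llrSR5", threads=7, max_concurrent=2))
```

The resulting text would go into a file named app_config.xml in the PrimeGrid project directory of the BOINC data folder, after which the client needs to re-read its config files.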

So, purely based on cache sizes, it seems plausible that it's OK to run 1, or 2, or 3 of those SR5-LLR tasks at once, while at 4 or more the host throughput will degrade. Next step would be to figure out how many program threads per task to configure.
  • The general goal would be to use all cores.
  • The more threads are used per task, the shorter the run time of an individual task will be (up to a point of diminishing returns, or beyond that even negative scaling).
  • On the other hand, the higher the thread count, the more processing time and power budget will be spent on synchronization overhead, which is detrimental to host throughput.
  • LLR makes heavy use of AVX units. In case of hyperthreaded cores, both hyperthreads of a physical core would compete for access to one and the same AVX unit. Hence, Hyperthreading does not scale at all for LLR.
And then there is a specific complication with Alder Lake-S:
The performance of P cores and E cores is very different. But I suspect that multithreaded LLR works best if all program threads work at the same speed.

I don't have such a processor myself. But it seems that many workloads get about the same performance out of an E core as from one Hyperthread of a P core, provided that both Hyperthreads of the P core are fully loaded. So maybe it would be prudent to configure the # of concurrently running SR5-LLR tasks and the # of program threads per task such that all, or almost all, Hyperthreads of P cores and all, or almost all, E cores get used.
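To make the trade-off between tasks and threads a bit more concrete, here is a small Python sketch that tabulates the options. It assumes 8 P cores with two hyperthreads each plus 4 E cores (20 logical CPUs), the cache-based ceiling of 3 concurrent tasks discussed above, and a simplified model in which the BOINC client starts as many tasks as fit into the logical CPU count. That last point is an assumption about client behaviour, and the helper names are invented for this example.

```python
P_CORES = 8
E_CORES = 4
LOGICAL_CPUS = P_CORES * 2 + E_CORES  # 16 hyperthreads + 4 E cores = 20
MAX_TASKS_BY_CACHE = 3                # from the cache-size estimate


def tasks_launched(threads_per_task: int, logical_cpus: int = LOGICAL_CPUS) -> int:
    """Simplified model: the client starts as many tasks as fit, at least one."""
    return max(1, logical_cpus // threads_per_task)


def candidates() -> list[tuple[int, int, int]]:
    """Return (threads per task, tasks, idle logical CPUs) within the cache ceiling."""
    rows = []
    for threads in range(1, LOGICAL_CPUS + 1):
        tasks = tasks_launched(threads)
        if tasks <= MAX_TASKS_BY_CACHE:
            rows.append((threads, tasks, LOGICAL_CPUS - tasks * threads))
    return rows


if __name__ == "__main__":
    print("threads/task  tasks  idle logical CPUs")
    for threads, tasks, idle in candidates():
        print(f"{threads:12d}  {tasks:5d}  {idle:17d}")
```

The table only counts logical CPUs. It says nothing about which row is actually fastest, since that depends on the synchronization overhead and the P/E speed mismatch described above.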

Another promising, but more complicated to implement, configuration would be to run 2 quick tasks with 4 threads each, "pinning" these tasks to P cores (without using Hyperthreading on them), and in addition 1 slow task with 4 threads, "pinned" to the E core cluster.
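A minimal Python sketch of that pinning layout could look like this. It assumes the logical CPUs are numbered so that P core k owns CPUs 2k and 2k+1 and the four E cores come last (16 to 19); that numbering is common on Alder Lake but should be verified on the actual machine. The task labels and helper names are hypothetical. Applying the affinity uses `os.sched_setaffinity`, which exists only on Linux; on Windows a different mechanism would be needed.

```python
import os

P_CORES = 8
E_CORES = 4


def pinning_plan() -> dict[str, set[int]]:
    """Build CPU sets: two 4-thread tasks on P cores, one 4-thread task on E cores."""
    # Assumed numbering: P core k owns logical CPUs 2k and 2k+1; E cores follow.
    # Taking only the first hyperthread of each P core avoids using Hyperthreading.
    p_first_threads = [2 * k for k in range(P_CORES)]
    e_cpus = range(2 * P_CORES, 2 * P_CORES + E_CORES)
    return {
        "fast_task_1": set(p_first_threads[:4]),
        "fast_task_2": set(p_first_threads[4:]),
        "slow_task": set(e_cpus),
    }


def pin(pid: int, cpus: set[int]) -> bool:
    """Pin a process to the given CPUs on Linux; return False where unsupported."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(pid, cpus)
        return True
    return False


if __name__ == "__main__":
    for label, cpus in pinning_plan().items():
        print(label, sorted(cpus))
```

In practice one would look up the process IDs of the running LLR tasks and call `pin` for each of them with the matching CPU set.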

And a further config to explore would be to leave the E cores unused (perhaps even deactivate them in the BIOS) and let just the P cores do the work. This would reduce the number of available execution units, but perhaps this loss would be offset by all of the power budget being available to the P cores. Due to its heavy AVX usage, LLR scales quite reasonably with per-core power budget.

PS: Putting all this guesswork aside, the optimum config could be found empirically with systematic benchmarking of the LLR program outside of BOINC. But how to do that is a whole other story, for another day. I wrote about that elsewhere a while ago.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
15,811
3,028
55
Welcome back, @Bradtech519! I need to upgrade my i7-6700 system sometime, but I think I'll wait for Zen 4, and Socket AM5. Supposedly they'll keep that socket around awhile.
 
  • Like
Reactions: Bradtech519

Bradtech519

Senior member
Jul 6, 2010
520
46
91
That would probably be the best bet. I was just going to order a new video card, but decided most of my stuff was going on 8-10 years old. Saw the Z690 stuff out, and Intel had the latest refresh. Went with the 6650 XT for 1080p gaming. Looked to perform better than the RTX 3060 and was only $399 plus tax. I did miss the 5800X3D chip when shopping around, or I may have gone AMD. I'm still running DDR4 but have PCI Express 5.0. My Bulldozer system finally died on me. But the Core i7-4770/R9 290 setup and Gigabyte board are now my daughter's.
 
  • Like
Reactions: Skillz

Markfw

CPU Moderator, VC&G Moderator, Elite Member
Super Moderator
May 16, 2002
23,836
12,918
136
Welcome back, @Bradtech519! I have a 12700F, very similar to yours, but I disabled the E-cores. It will be interesting to see how the two do! I may re-enable them if it helps.
 
  • Like
Reactions: Bradtech519

Bradtech519

Senior member
Jul 6, 2010
520
46
91
I'm also curious how well the "Thread Director" microcontroller on the CPU and the Windows 11 scheduler will handle the workload. Compared to the 11th Gen CPUs it looked like a lot of gain. I read this before buying.

 

Markfw

CPU Moderator, VC&G Moderator, Elite Member
Super Moderator
May 16, 2002
23,836
12,918
136
  • Like
Reactions: Bradtech519

Bradtech519

Senior member
Jul 6, 2010
520
46
91
Ended up getting some bad RAM, so I had to drop from that race early, along with sending the 6650 back for a 6700. Got my old FX8150 hooked back up with a new cooling solution, and upgraded the cooling solution for the 4770, which is going now as well. Going to hit up some projects I haven't farmed before. BTW, these newer CPUs love to run hot. I'm in the 60s-70s running 8 out of 16 cores on this 12700KF. My old Bulldozer FX8150 is in the low to mid 40s going 100%. 4770 low 50s.
 

Skillz

Senior member
Feb 14, 2014
632
564
136
There is an active PrimeGrid challenge the team is participating in right now.

Just make sure you are on the team and running the correct CPU sub project.
 
  • Like
Reactions: Bradtech519

Bradtech519

Senior member
Jul 6, 2010
520
46
91
Something about these PrimeGrid tasks really cranks up the heat. Hit 100 °C right away with a good air cooler on the 12700KF. Normally in the 60s on SiDock going 50%. FX8150 doesn't seem to be having issues.
 

StefanR5R

Elite Member
Dec 10, 2016
4,777
6,039
136
The PPS subproject of PrimeGrid, like most of them, makes intense use of vector arithmetic (e.g. AVX, FMA3, AVX-512, depending on support in the hardware). One way to keep the heat output in check would be to set a decent power limit in the BIOS. Default power limits tend to be over the top nowadays.
 

Skillz

Senior member
Feb 14, 2014
632
564
136
Also how many tasks are you running at once? If you are using 1 task for every thread then that's probably too much. You should be running 1 task per physical core.
 
  • Like
Reactions: Bradtech519

Bradtech519

Senior member
Jul 6, 2010
520
46
91
Yeah, I believe I was doing one task per thread.
 