Info PrimeGrid Challenges 2024, sieve-free edition

Page 12

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
26,476
15,588
136
OK, I only have one 9554 running. It did one unit in 17 hours, but the next 8 have been running for 20 hours and are not done yet. These are LONG-running units! Almost 24 hours, it looks like (8C config). Should I try 16C tasks? Only a little over 2 days to figure this out.

 

Markfw

Moderator Emeritus, Elite Member
Well, crap. Somebody showed me how to hide them; now I can't find it again to unhide!

Help !
 

Markfw

Moderator Emeritus, Elite Member
@emoga Thanks! Done (I think), let me know!


Edit: Not the logical place to put it when you have multiple profiles; you don't expect one to be different from all the rest.
 

StefanR5R

Elite Member
Dec 10, 2016
6,125
9,254
136
Markfw said:
"OK, I only have one 9554 running, but it did one unit in 17 hours, but the next 8 have been running for 20 hours and not done yet. These are LONG running units ! almost 24 hours it looks like ! (8C config) Should I try 16 C tasks ? Only a little over 2 days to figure this out."
The "edit PrimeGrid preferences" web page says:

Recent average CPU time: 215:48:55
FFT sizes: 3200K to 3840K (uses up to 30720K cache per task)
This project has a 35% long job credit bonus and a 10% conjecture credit bonus.

Recent average 216 CPU-hours / 8 CPUs = 27 hours --> As expected, your 9554 ES completes them at above-average speed.

Up to 30 MBytes "cache per task" means that each of the eight 8c/16t CCDs of the 9554 can run 1 task at once, as each CCD = CCX has got 32 MB level 3 cache. It is crucial for performance that each task gets individually bound to the logical CPUs of one CCD exclusively for this task. Otherwise, a lot of time- and energy-wasting cross-CCX and memory traffic would occur.
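The per-CCD pinning described above can be done with `taskset`, Process Lasso, or a small script. Below is a minimal Python sketch using `os.sched_setaffinity` (Linux only). The CPU numbering is an assumption — first 64 logical CPUs = physical cores, 64–127 = their SMT siblings, 8 consecutive cores per CCD — verify with `lscpu -e` on your own host before relying on it; the PID is hypothetical.

```python
import os  # os.sched_setaffinity is Linux-only

CORES_PER_CCD = 8    # Zen 4 Genoa: 8 cores per CCD (= one CCX)
N_PHYS = 64          # physical cores on the 9554

def ccd_cpus(ccd, smt=False):
    """Set of logical CPUs belonging to one CCD (assumed enumeration)."""
    lower = set(range(ccd * CORES_PER_CCD, (ccd + 1) * CORES_PER_CCD))
    if smt:
        lower |= {cpu + N_PHYS for cpu in lower}
    return lower

# Pin a running LLR2 task (hypothetical PID 12345) to CCD 3,
# lower SMT threads only -- one task per CCX, as recommended above:
# os.sched_setaffinity(12345, ccd_cpus(3))
```

BOINC itself does not set per-CCD affinities, which is why an external tool or script like this is needed at all.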

25...30 MBytes cache per task is a hint that the current workunit sizes are quite variable. Therefore, to answer the question of how many logical CPUs of each CCD should be used (8 or 16 threads per task), one would either have to complete several workunits and note both the task duration _and_ the credit per task, then evaluate points per day for 8 vs. 16 threads. Or one could test "offline" (outside BOINC) with a single fixed workunit, which allows for much quicker testing based on completion rate rather than on total duration and credits. I'll take a look at home later today whether I have notes from similar tests, or perhaps run some offline tests on my 9554 myself.

An alternative to 8 tasks at once, 1 CCX/task would be to run only 4 tasks at once, 2 CCXs/task. Then you would obviously spend twice as many cores per task and reduce the task durations that way. But this would incur the cost of data traffic across two CCXs for each task. Therefore, and due to increased program overhead, this would not yield twice the speed per task. In other words, this would sacrifice throughput. It could be a viable option though if some CCXs would otherwise be idle during the last (maybe) 14+ hours (or something like that) of the challenge.
 

StefanR5R

Elite Member
The projects of the previous three challenges used the PRST program, but AFAICT llrPSP is still on LLR2. Hopefully the PrimeGrid admins don't do a last-minute stunt and update the application before the challenge... :-)
 

StefanR5R

Elite Member
I couldn't find a validated PSP result from a 3840K workunit. So I took a 3456K workunit from the results table of Pavel Atnashev's computer cluster instead. (That's 27.0 MB cache footprint of FFT coefficients.) It's the WU with the largest credit on this host when I looked about two hours ago. I ran this WU for 20 minutes per test and extrapolated total duration from the progress made until then.
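As a sanity check on the quoted cache figures: the footprint scales as roughly 8 bytes (one double) per FFT element, which reproduces both the "30720K cache per task" maximum from the preferences page and the 27.0 MB quoted here. This is a back-of-the-envelope sketch, not an exact accounting of the program's memory use.

```python
def fft_cache_kb(fft_size_k):
    """Cache footprint in KB, assuming one 8-byte double per FFT element."""
    return fft_size_k * 8

assert fft_cache_kb(3840) == 30720        # "uses up to 30720K cache per task"
assert fft_cache_kb(3456) / 1024 == 27.0  # the 3456K workunit used here
```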

workunit: 222113*2^34206293+1 for 82,165.65 credits
software: SuSE Linux, display-manager shut down, sllr2_1.3.0_linux64_220821
hardware: EPYC 9554P (Zen 4 Genoa 64c/128t), cTDP = PPT = 400 W, 12 channels of DDR5-4800

test | affinity | avg. duration | avg. tasks/day | avg. PPD | avg. core clock | host power | power efficiency
-----|----------|---------------|----------------|----------|-----------------|------------|-----------------
8×8 | none (random scheduling by Linux) | 35:49:20 (128960 s) | 5.4 | 0.440 M | 3.60 GHz | 370 W | 1.19 kPPD/W
8×8 | 1 task : 1 CCX, only lower SMT threads | 12:52:37 (46357 s) | 14.9 | 1.225 M | 3.34 GHz | 485 W | 2.53 kPPD/W
8×16 | 1 task : 1 CCX, all SMT threads | 13:02:32 (46952 s) | 14.7 | 1.210 M | 3.05 GHz | 500 W | 2.42 kPPD/W
4×16 | 1 task : 2 CCXs, only lower SMT threads | 8:35:14 (30914 s) | 11.1 | 0.919 M | 3.60 GHz | 480 W | 1.91 kPPD/W
4×32 | 1 task : 2 CCXs, all SMT threads | 8:39:42 (31182 s) | 11.0 | 0.911 M | 3.18 GHz | 490 W | 1.86 kPPD/W

Conclusions for this particular host:
  • CPU affinity is a must.¹ Surprise! ;-)
  • SMT doesn't help.
  • As I said, running tasks on 2 CCXs instead of on 1 CCX reduces throughput quite a bit, but may be useful towards the very end of the challenge if you don't mind doing some micro-management.
  • If you see one configuration cause the CPU running at higher clock than another one, then the reason may be that the CPU is simply waiting more often for memory accesses in the former config, that is, it's just twiddling its thumbs very fast instead of calculating fast.
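For reference, the throughput columns of the table follow directly from the number of concurrent tasks, the measured duration, and the credit of the quoted workunit. A small sketch reproducing the 8×8-with-affinity row:

```python
CREDIT = 82165.65  # credits of the benchmarked 3456K workunit

def throughput(n_parallel, seconds_per_task, host_watts):
    """Derive tasks/day, points/day, and points/day/W from a measured duration."""
    tasks_per_day = n_parallel * 86400 / seconds_per_task
    ppd = tasks_per_day * CREDIT
    return tasks_per_day, ppd, ppd / host_watts

# 8x8 with affinity: 8 concurrent tasks, 46357 s each, 485 W host power
tpd, ppd, eff = throughput(8, 46357, 485)
# tpd ≈ 14.9 tasks/day, ppd ≈ 1.225 M, eff ≈ 2.53 kPPD/W
```

Extrapolating from a 20-minute run assumes the task progresses at a constant rate, which holds well for LLR-style fixed-FFT workloads.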

@Markfw, since you are getting way more than 20 hours, either CPU affinities are not applied (or not correctly applied), or your ES has a rather restrictive frequency cap. I don't know what PPT limit your ES has, maybe it is 360 W like the production 9554's default, which would still be 90% of my increased PPT limit, so that's perhaps not the reason for your worse task durations.

Edit: Also, right now you have 13 valid proof tasks on your host. Their points per second are 1.57, 0.53, 1.18, 0.95, 1.40, 0.83, 1.50, 0.75, 0.72, 0.72, 1.54, 0.85, 0.84. That's very high variability, which hints at missing or suboptimal CPU affinities.
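To put a number on that variability, here is the coefficient of variation of the quoted figures, computed with Python's statistics module:

```python
import statistics

# points-per-second of the 13 valid proof tasks quoted above
pps = [1.57, 0.53, 1.18, 0.95, 1.40, 0.83, 1.50, 0.75,
       0.72, 0.72, 1.54, 0.85, 0.84]
mean = statistics.mean(pps)
cv = statistics.pstdev(pps) / mean  # coefficient of variation
# cv ≈ 0.34, i.e. ~34% spread around the mean
```

A host with correctly pinned tasks should show far more uniform points per second across tasks of similar FFT size.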

________
¹) On a Windows host, the performance drop from 8×8 with affinity to 8×8 without affinity may be even worse than on Linux (due to a dumber scheduler and more stuff going on in the background).
 

Markfw

Moderator Emeritus, Elite Member
So I turned it on, and it will stay on for the competition. But I will have "no new tasks" set shortly, to be ready for the competition. So ignore all my previous run times, and thanks a lot @StefanR5R , @emoga
 

Markfw

Moderator Emeritus, Elite Member
Well, stupid me. Glad it's back on: now 14.5 hours instead of 27, no other changes.
 

Markfw

Moderator Emeritus, Elite Member
So, after Lasso and a fresh set of tasks, is 17 hours looking good? I am ready for now: seven 7950Xs, a 9950X, four 64-core Genoas, and one 96-core Genoa, all ready for 11 PM tonight, including Lasso running and configured using llr2_ for the pattern. In a couple of days, I should have 128 cores of Turin going!
 

Markfw

Moderator Emeritus, Elite Member
3rd place, and only the 9950X has dumped?? By noon it will be a very different picture.

Looks like 21 more of the big units by noon PST,

and 40 more by 13 hours after that.
 

Markfw

Moderator Emeritus, Elite Member
There will be computers in this race which take more than a day for these tasks.
That's why I will need a big head start. Made number one; for how long, I don't know. Turin motherboard is on the truck, due in the next 3 hours. That may save my bacon.

AC is up to 80F and it's 50F outside! As much as I can, the house is open (it's a big windy storm outside).

Edit: Actually it's due in the next hour. Well, the 7742 with 8C units will take almost 2 days (about). Once the Turin is up and has finished one set of tasks, I will shut it off, so I know what you mean about slow computers. And yes, that is with pinning.

PrimeGrid 9.03 Prime Sierpinski Problem (LLR) (mt) llrPSP_590816579_0 11:54:00 (03d,20:02:01) 96.67 21.710 01d,18:52:22 20d,12:05:49 8C Running 7742 dual Titan V

 

Markfw

Moderator Emeritus, Elite Member
Damn FedEx! First it's supposed to be between 9:30 and 11:30. Well, as soon as they miss that, now it's "before 10 PM". At least the case is here and the PSU is in it already.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,438
4,270
75
Day 1 stats:

Rank___Credits____Username
1______6363622____markfw
3______4351398____w a h
5______2766489____Icecold
9______1393430____cellarnoise2
16_____805537_____crashtech
37_____343604_____ChelseaOilman
60_____165784_____mmonnin
66_____93579______Ken_g6

Rank__Credits____Team
1_____16283446___TeAm AnandTech
2_____4762054____SETI.Germany
3_____4496571____Czech National Team
4_____4484338____Romania
"Our team is now TWICE the points of 2nd place!"
Not quite four times now! If Mark was a team he'd be first (or second to the rest of us?); if @emoga was a team he'd be a strong 5th.