PrimeGrid Challenges 2021


Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,219
3,799
75
Preliminary final stats:

Rank___Credits____Username
1______51823408___Skillz
6______15085628___Icecold
7______12691648___crashtech
10_____10045948___xii5ku
14_____7809236____Pokey
38_____2746478____Orange Kid
40_____2648787____biodoc
67_____1510885____Skivelitis2
82_____1136299____Lane42
109____814316_____emoga
122____620730_____Ken_g6
141____530332_____Fardringle
344____12977______geecee

Rank__Credits____Team
1_____107476230___TeAm AnandTech
2_____75324865___SETI.Germany
3_____70641441___Antarctic Crunchers
4_____39117076___Czech National Team

Wow, that's a lot of work done! :)

In the challenge of Skillz vs. the rest of the TeAm...the rest of the TeAm won, barely, with 55652822 points. ;)
 

Skillz

Senior member
Feb 14, 2014
911
929
136
In the challenge of Skillz vs. the rest of the TeAm...the rest of the TeAm won, barely, with 55652822 points. ;)

I shut everything down this morning. ;)
 

StefanR5R

Elite Member
Dec 10, 2016
5,459
7,718
136
The next challenge, starting 2½ weeks from now, will be at PSP-LLR. There are some similarities between this project and ESP-LLR, which was the project of the June challenge:
  • Both projects are "conjecture" projects: their point is not just to find prime numbers, but to find primes (or rule them out) in order to prove or disprove a conjecture. (A Sierpiński number is an odd k for which k·2^n+1 is composite for every n ≥ 1.) In particular, "What is the smallest prime Sierpiński number?" is the big, burning question behind PrimeGrid's PSP-LLR project.
  • Both are CPU-only projects, using v9.01 (mt) application versions from December 2020 and January 2021, respectively.
  • Performance characteristics of the application in these two projects should be quite similar. PSP-LLR's search space is at bigger numbers than ESP-LLR's, though. Current average CPU time is ≈160 hours at ESP-LLR and ≈200 hours at PSP-LLR.
During the ESP-LLR challenge, one job could still be coerced into one of the 16 MByte L3 cache segments (CCXs) which Zen 2 based CPUs have. PSP-LLR jobs will want more cache, so it remains to be tested whether spreading one job across two CCXs works better. (Actually, ≈1½…2 CCXs per job already worked better with the smaller ESP-LLR jobs if you didn't enforce CPU affinity yourself on an operating system without strictly cache-aware task scheduling.) I guess owners of Zen 3 based CPUs will see quite a bit better performance at PSP-LLR than those of us who are still on Zen 2.
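Side note for anyone who wants to experiment with the tasks-per-CCX ratio from the BOINC side: the threads-per-task count can be influenced with an app_config.xml in the project directory. Treat the snippet below as a rough sketch only; the project directory path, the short application name llrPSP, and whether <avg_ncpus> alone makes the wrapper spawn that many LLR threads are assumptions on my part. PrimeGrid's web preferences also have a multi-threading setting, which is the officially supported knob.

# Rough sketch (paths and names are assumptions, see above): ask for 8-thread
# llrPSP tasks, i.e. one task per two Zen 2 CCXs, and run at most 4 at a time.
cat > /var/lib/boinc/projects/www.primegrid.com/app_config.xml <<'EOF'
<app_config>
  <app>
    <name>llrPSP</name>
    <max_concurrent>4</max_concurrent>
  </app>
  <app_version>
    <app_name>llrPSP</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>8</avg_ncpus>
  </app_version>
</app_config>
EOF
# Make the running client re-read config files:
boinccmd --read_cc_config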
 

StefanR5R

Elite Member
Dec 10, 2016
5,459
7,718
136
Current search space of PSP-LLR (source), and corresponding FFT sizes on Haswell and similar CPUs:

k in progress    n in progress                FMA3 FFT length    FFT data size
79,309           24,192,254...24,453,182      2304K              18 MB
79,817           24,051,623...24,453,119      2304K              18 MB
152,267          23,898,819...24,453,123      2304K…2400K        18…18.75 MB
156,511          24,146,184...24,452,328      2400K              18.75 MB
222,113          24,072,701...24,453,213      2400K…2560K        18.75…20 MB
225,931          24,019,616...24,452,696      2400K…2560K        18.75…20 MB
237,019          24,166,006...24,451,666      2560K              20 MB
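(If I am reading these right, the FFT data size is simply the FFT length times 8 bytes per double-precision element: 2304K × 8 B = 18 MB, 2400K × 8 B = 18.75 MB, 2560K × 8 B = 20 MB. That is the working set which one task would like to keep in cache.)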

Edit:
From a quick look through the results lists of PrimeGrid's top hosts, I found this example result:

k = 222,113 | n = 24,416,597 | credit = 40,937.20 cobblestones
on Skylake-X: all-complex AVX-512 FFT length 2520K (FFT data size 19.7 MB)
on Haswell: all-complex FMA3 FFT length 2560K (FFT data size 20 MB)​

That is, this example workunit is located near the upper end of the current search space and is therefore well suited for synthesizing some test runs.

Edit 2:
I thought I had put in a link to the result. Well, here it is: result ID 1246553261
 

StefanR5R

Elite Member
Dec 10, 2016
5,459
7,718
136
I haven't benchmarked anything yet; too busy with work and whatnot. I am extrapolating from my llrESP measurements for the time being. (Though llrPSP will be a more or less different kettle of fish on Zen 2 because of the cache constraints.) And since this is a 10-day challenge, since I much prefer to overtake rather than be overtaken, and since the weekend is just ahead with a little more spare time, I am making a slow start again.
 

StefanR5R

Elite Member
Dec 10, 2016
5,459
7,718
136
* Note to self: *
How to view the processor topology of a computer, as seen by the operating system:

NUMA topology:
grep . /sys/devices/system/node/node*/cpulist

HyperThreading/SMT:
grep . /sys/devices/system/cpu/cpu*/topology/thread_siblings_list

Level-3 cache sharing:
grep . /sys/devices/system/cpu/cpu*/cache/index3/shared_cpu_list
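For a condensed view, one line per L3 segment (CCX) instead of one line per CPU, something like this should do (just a sketch built on the same sysfs files as above):

# Deduplicate the per-CPU lists, leaving one line per shared L3 segment:
sort -un /sys/devices/system/cpu/cpu*/cache/index3/shared_cpu_list

(hwloc's lstopo paints the whole picture at once, if it happens to be installed.)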
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,219
3,799
75
Day 1 stats:

Rank___Credits____Username
2______1693049____Skillz
4______1326828____crashtech
7______931862_____Icecold
20_____321069_____xii5ku
21_____317091_____biodoc
27_____241564_____Orange Kid
28_____237429_____emoga
112____37197______SlangNRox
130____320________Skivelitis2

Rank__Credits____Team
1_____5137176____Antarctic Crunchers
2_____5106413____TeAm AnandTech
3_____3867709____SETI.Germany
4_____2074086____Czech National Team
5_____1802790____AMD Users

It's shaping up to be a close race. But also realize that certain cloud machines may not report until the very end.
 

StefanR5R

Elite Member
Dec 10, 2016
5,459
7,718
136
How to view the processor topology of a computer, as seen by the operating system:
PS,
on the Linux systems which I tested specifically for this so far, multiple LLR application instances were automatically scheduled in a NUMA-aware and HT/SMT-aware fashion, but not in a cache-segment-aware one.

The latter issue can be overcome either by enforcing processor affinity, e.g. with the taskset command (tested), or, on EPYC, by switching to multiple NUMA nodes per socket in the BIOS (speaking theoretically; I have not tested this yet).

Edit: I don't see it as a fault that the Linux process scheduler doesn't apply one or another cache-aware scheduling policy; after all, which policy is best depends on the application. In the case of LLR, threads of the same process should be scheduled on physical cores which share the same last-level cache (but use of "thread siblings" should be avoided). Other applications with less intensive data sharing might benefit from a different policy.
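To make the taskset part concrete, here is roughly the kind of thing I mean. It is only a sketch; the CPU lists are made-up examples for 4-core CCXs with SMT siblings (take the real ones from the shared_cpu_list files in my note-to-self above), and the "llr" process-name pattern is an assumption too (check with ps what the PrimeGrid wrapper actually launches):

# Sketch: pin each running LLR worker to its own CCX, round-robin.
ccx_list=("0-3,64-67" "4-7,68-71" "8-11,72-75" "12-15,76-79")   # example CCXs
i=0
for pid in $(pgrep -f llr); do
    # -c takes a CPU list, -a applies the mask to all threads of the process
    taskset -acp "${ccx_list[i % ${#ccx_list[@]}]}" "$pid"
    i=$((i + 1))
done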

20_____321069_____xii5ku
Uhm what, I'm in the top-20 while using only one computer?

(But as you said, these first 24 hours aren't very indicative of the race to come.)
 

Skillz

Senior member
Feb 14, 2014
911
929
136
@Markfw

Just letting you know we could use your help. I know you offered it during the last PrimeGrid challenge, but we won that one easily and you didn't need to join us.

For this one, on the other hand, we could probably use some help.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,482
14,434
136
@Markfw

Just letting you know we could use your help. I know you offered it during the last PrimeGrid challenge, but we won that one easily and you didn't need to join us.

For this one, on the other hand, we could probably use some help.
It's over 100 degrees here! I can donate my 24-core EPYC (48 threads), but it is HOT here! How long does this run? The heat wave should be over maybe Sunday, but for sure Monday.

Edit: It's running. It configured itself to 3 tasks of 16 CPUs each.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,482
14,434
136
It's doing 17 or Bust units. Is that correct for the current challenge?
 

Icecold

Golden Member
Nov 15, 2004
1,090
1,008
146
Is this correct now?
PrimeGrid 9.01 Prime Sierpinski Problem (LLR) (mt) llrPSP_380377879_1 00:05:29 (01:15:28) 1373.20 0.270 01d,09:43:45 9/3/2021 6:02:26 PM 16C Running EPYC 7401P
Yes, that is the correct project / task name.

Glad to have you join in on it! Hopefully the weather cooperates and you can fire up some more machines :)
 

crashtech

Lifer
Jan 4, 2013
10,521
2,111
146
Yeah, thanks @Markfw! I have a feeling the regulars over at PrimeGrid are wondering: what is this burr in the saddle, TeAm AnandTech? :)
 

StefanR5R

Elite Member
Dec 10, 2016
5,459
7,718
136
During the ESP-LLR challenge, one job could still be coerced into one of the 16 MByte L3 cache segments (CCXs) which Zen 2 based CPUs have. PSP-LLR jobs will want more cache, so it remains to be tested whether spreading one job across two CCXs works better. (Actually, ≈1½…2 CCXs per job already worked better with the smaller ESP-LLR jobs if you didn't enforce CPU affinity yourself on an operating system without strictly cache-aware task scheduling.)
The gist of the llrESP tests in June and the llrPSP tests today on an EPYC 7452 (only the tests in which I enforced processor affinity; power efficiency relates to the complete computer's power draw "at the wall"):
  • llrESP
    • 1:1 ratio of tasks:CCXs = best throughput and best power efficiency for llrESP
    • 1:2 ratio of tasks:CCXs = -9% throughput, -7% efficiency
  • llrPSP
    • 1:2 ratio of tasks:CCXs = best throughput and best power efficiency for llrPSP
    • 1:1 ratio of tasks:CCXs = -8% throughput, -17% efficiency
The intensive RAM I/O which goes on when llrPSP exhausts the caches in the 1:1 configuration causes a notably higher power draw. This means that the RAM I/O for inter-thread communication across CCXs (llrPSP 1:2 case) is not as costly as the RAM I/O from cache deficit (llrPSP 1:1 case).
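(In concrete numbers for this EPYC 7452, assuming I have its topology right, i.e. 8 CCXs with 4 cores and 16 MB of L3 each: 1:1 means 8 tasks of 4 threads, and 1:2 means 4 tasks of 8 threads spanning two CCXs each.)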

— Edit: —
Nonetheless, communication across CCXs is very costly with the LLR application:
  • llrESP
    • 1:1 ratio of tasks:CCXs, tasks pinned to the CCXs = best throughput and best power efficiency for llrESP
    • 1:1 ratio of tasks:CCXs, tasks scheduled randomly by the Linux kernel = -30% throughput, -32% efficiency

I guess owners of Zen 3 based CPUs will see quite a bit better performance at PSP-LLR than those of us who are still on Zen 2.
I heard somewhere that the 5950X has about 1.3 times the llrPSP throughput of the 3950X.

Edit: This seems to align well with the above-mentioned 30% cost of inter-CCX comms. Though in the case of my llrESP tests, those 30% come from random scheduling vs. task pinning, whereas the 30% performance uplift of Zen 3 over Zen 2 in llrPSP is mostly from cache exhaustion on Zen 2. From what I understood, the 5950X/3950X figures which I saw were without task pinning.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,482
14,434
136
The stats say I have no points for today. Is that due to my config? (16 CPUs per task, 12 hours remaining on the first batch of 3.) Should I change that?