PrimeGrid Challenges 2019


Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,830
75
Current challenge: Prime Sierpinski Problem (PSP) LLR, December 12-21 (04:19 UTC)

Happy new year! Here's the (tentative) list of this year's PrimeGrid challenges:

Code:
#  Date             Time UTC  Project  Duration  Challenge
-----------------------------------------------------------------------------------------------------------------
1   7-22 January    05:43:00  SoB-LLR  15 days   Conjunction of Venus & Jupiter Challenge
2   5-10 March      18:00:00  GCW-LLR  5 days    Year of the Pig(ging out on our CPU cycles :P) Challenge
3  24-31 May        00:00:00  TRP-LLR  7 days    Hans Ivar Riesel's 90th Birthday Challenge
4  15-20 July       20:17:00  PPS-LLR  5 days    50th Anniversary of the Moon Landing Challenge
5   3-10 August     00:00:00  ESP-LLR  7 days    Lennart Vogel Honorary Challenge
6  21-26 September  11:00:00  AP27     5 days    Oktoberfest Challenge
7  10-15 October    18:00:00  PPS-DIV  5 days    World Maths Day Challenge
8  24-29 October    00:00:00  321-LLR  5 days    50 years First ARPANET Connection Challenge
9   1-11 November   18:04:00  PSP-LLR  10 days   Transit of Mercury Across the Sun Challenge
10 12-22 December   04:19:00  GFN-21+  10 days   Aussie, Aussie, Aussie! Oi! Oi! Oi! Summer Solstice Challenge

What you need:
  • One or more fast x86 processors, preferably with lots of cores. (Even slow ones might do!)
  • Windows (Vista or later 64-bit, or XP or later 32-bit), Linux, or MacOS 10.4+.
  • BOINC, attached to PrimeGrid (http://www.primegrid.com/).
  • Your PrimeGrid Preferences with only the above project(s) selected in the Projects section.
  • Patience! All of these projects run long, slow WUs, at least on your CPU. As a result, no challenge is less than five days long. :eek:

What may help LLR (all but two of the challenges):
  • An Intel Sandy Bridge or later ("Core series" other than first-generation) processor with AVX may be 20-70% faster than with the default application. Sadly, that does not include Pentium or Celeron processors, or AMD processors.
  • In most challenges - probably all of these since their WUs are so large - it helps to enable multi-core processing with app_config.xml. Leave hyper-threading on if you do this!
  • Faster RAM might help on many challenges, as long as it's stable.
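
For the multi-core tip above, here is a rough sketch of generating an app_config.xml with Python. The element names (app_config, app_version, app_name, cmdline, avg_ncpus) come from BOINC's client configuration format. The app name llrSOB is taken from task names seen in this thread, and the thread count of 4 and the "-t" flag are assumptions: check PrimeGrid's forum for the exact app names and options before using this.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, threads: int) -> str:
    """Return app_config.xml text asking BOINC to run one task on `threads` cores."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    # Assumption: the LLR wrapper accepts "-t N" to set its thread count.
    ET.SubElement(version, "cmdline").text = f"-t {threads}"
    # Tell the BOINC scheduler how many cores each task occupies.
    ET.SubElement(version, "avg_ncpus").text = str(threads)
    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical values: app name from this thread, 4 threads per task.
    print(build_app_config("llrSOB", 4))
```

The output would go into the PrimeGrid project folder inside the BOINC data directory, and the client has to re-read its config files (or be restarted) to pick it up.
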

What may help in other challenges:
  • A GPU helps in two challenges.
  • Juggling in some extra WUs may help in challenges where you run more than one WU on the CPU at a time. (Or, switching to use all cores on one WU at the end may work equally well.)
  • Turning on hyper-threading may help.

What won't help (but won't hurt either):
  • A large amount of RAM.
  • Any Android devices.

What won't help (and will hurt, sort of):
  • Unstable processors. (Invalid work will be deducted! :eek: If Prime95 worked recently on your processor, it should be stable.)
  • Work not downloaded and uploaded within the challenge. (It's not counted.) Should you not be able to be in front of one or more computers at that time, there are several options:
    • You can often set BOINC's network connection preferences to wait until a minute or two after challenge time.
    • And for short work units, you can just set the queue level very low (0.01 days). This also makes it more likely that you will be a prime finder rather than a double-checker. But you might want to raise their queue size after the challenge is underway.

Welcome and good luck to all! :)

P.S. If no one has posted stats lately, try tracking your stats with my user script. With that installed, visit the current challenge's Team stats link for TeAm stats.
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,328
4,913
136
Well my first two completed tasks are pending validation. And more tasks should finish today. So I should start getting points within the next day or so, I hope :)
 

Howdy

Senior member
Nov 12, 2017
572
480
136
Well my first two completed tasks are pending validation. And more tasks should finish today. So I should start getting points within the next day or so, I hope :)

On a PG challenge you receive the points when a task is completed and uploaded. At the end of the challenge, when the "clean up" is completed, the points for any tasks that turned out "invalid" will be taken away from you and the TeAm.
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,328
4,913
136
On a PG challenge you receive the points when a task is completed and uploaded. At the end of the challenge, when the "clean up" is completed, the points for any tasks that turned out "invalid" will be taken away from you and the TeAm.

Well I certainly hope I don't have any that are invalid. My CPUs are all well-cooled and run at stock other than my 8700K which has been extensively stress-tested at its current clock.
 
  • Like
Reactions: Howdy

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,830
75
@Howdy2u2 I'm on your six! :p
Speaking of which, day six stats:

Rank___Credits____Username
20_____2092052____xii5ku
23_____1983564____crashtech
27_____1719826____emoga
46_____931905_____Ken_g6
81_____540077_____iwajabitw
97_____435871_____IEC
98_____431670_____Howdy2u2
99_____430176_____zzuupp
180____117859_____biodoc
198____105394_____Orange Kid
208____58821______SlangNRox

Rank__Credits____Team
2_____26452880___Aggie The Pew
3_____20886716___SETI.Germany
4_____17564576___Sicituradastra.
5_____8847221____TeAm AnandTech
6_____8550047____Rechenkraft.net
7_____8125714____Crunching@EVGA
8_____5560192____The Knights Who Say Ni!

Now that's what I call a TeAm! In particular, this one goes to eleven (members)! :D
 

Howdy

Senior member
Nov 12, 2017
572
480
136
@Howdy2u2 I'm on your six! :p

Actual runtimes so far:
TR1950X: 13.78h, 20.74h
TR1920X: 19.70h
TR1900X: 32.20h
R7 1700: 26.30h
i7-8700K: 34.62h

Well, you certainly have me on CPU count; your 1950 alone has me beat!!! Run to the top; iwajabitw and Ken_g6 have been cruising along all alone for quite some time.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
I'd say this race is more about RAM bandwidth than CPU count. Look how Zen based AMD CPUs easily play in the same league as Intel CPUs here, while the latter have a considerable advantage in other LLR based PG subprojects due to their wider vector units. But to keep these units busy, there needs to be enough cache, and none except some large server CPUs have enough (unified) cache for SOB-LLR's particular workload.
 
  • Like
Reactions: Ken g6

TennesseeTony

Elite Member
Aug 2, 2003
4,209
3,634
136
Would that be L2 cache or L3 cache?

For example the ThreadRipper 1950X has 8MB L2, 32MB L3.

While Xeon e5-2683-v3 shows 35MB of SmartCache (from Intel: CPU Cache is an area of fast memory located on the processor. Intel® Smart Cache refers to the architecture that allows all cores to dynamically share access to the last level cache. )
 

Howdy

Senior member
Nov 12, 2017
572
480
136
Well I'm probably beat in the RAM aspect too. Running 16gb per machine and not top of the line for speed either. Pretty much anything I am running is a swap over from Folding- GPU only folders- Pretty good GPU with not so grand CPU and RAM- Top them off with P55 series MOBOs up to a Z97 X99. (There is 1 CPU exception to this list that 2 folks are privy to ;))

With that said: I cringe when the challenge is CPU, and smile when it's GPU!!
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,328
4,913
136
As far as RAM bandwidth... my i7-8700K uses 3600 CL15 RAM, so it has the best bandwidth and latency of any of my dual-channel rigs.

I have a Ryzen 7 1800X I set up last night with 2x4GB of DDR4-3733 Samsung E-die set to 3200 CL16. Despite only being 8GB, the expected task time is 26 hours. So it appears that Ryzen rigs can punch above their weight in this race.

The R7 1700 is using DDR4-2400, so it should theoretically be my worst rig for bandwidth.

Threadripper rigs have quad-channel DDR4-3000 CL16 or better.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
Would that be L2 cache or L3 cache?
The FFT data alone is currently 20...22.5 MB per llrSOB task. So we are talking about L3 cache mostly, because L2 is so much smaller. (However, see below.)
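
As a rough cross-check of the 20...22.5 MB figure, here is a small sketch of the arithmetic. It assumes 8 bytes (one double-precision value) per FFT element. The 2560K length is the one reported in a stderr excerpt later in this thread; 2880K is an assumed second length, picked only because it reproduces the 22.5 MB upper figure.

```python
BYTES_PER_DOUBLE = 8  # assumption: one double-precision value per FFT element


def fft_data_mb(fft_length_k: int) -> float:
    """Size in MB of an FFT holding fft_length_k * 1024 doubles."""
    return fft_length_k * 1024 * BYTES_PER_DOUBLE / (1024 * 1024)


if __name__ == "__main__":
    # 2560K appears in this thread's stderr excerpt; 2880K is an assumption.
    for length_k in (2560, 2880):
        print(f"{length_k}K FFT: {fft_data_mb(length_k):.1f} MB")
```

Under these assumptions a 2560K FFT comes out at 20.0 MB and a 2880K FFT at 22.5 MB, which lines up with the range quoted above.
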

For example the ThreadRipper 1950X has 8MB L2, 32MB L3.

While Xeon e5-2683-v3 shows 35MB of SmartCache (from Intel: CPU Cache is an area of fast memory located on the processor. Intel® Smart Cache refers to the architecture that allows all cores to dynamically share access to the last level cache. )
The Intel CPUs have (up to/ typically) 2.5 MB L3 cache per core (leaving Skylake-X/SP out of the picture, where they changed that a lot), but these L3$ segments are tightly connected by one or more ring buses. [Furthermore, it is "inclusive" cache, meaning the next cache level which is faster and closer to the processor cores (i.e. the L2$es) is always fully copied into the L3$.]

AMD's Zen based CPUs have (up to / typically) 2 MB L3 cache per core. 4x2 MB of this are very tightly connected segments which reside in the same "core complex". Then, 2x8 MB make up the total amount of L3$ of one Zeppelin chip, but these two 8 MB parts are connected via on-die Infinity Fabric. In case of Threadripper, there are two Zeppelin dies, and their L3$es are connected via on-package Infinity Fabric. So, in contrast to the Intel CPUs whose L3$ is, for all practical purposes, one single big piece per CPU, Zen's L3 is divided into 8 MB pieces of which only one piece is close to a given processor core, and the others are more or less farther away. [Furthermore, Zen's L3$ is "mostly exclusive", i.e. does generally not hold copies of the L2 caches. I am not sure though how the differences between inclusive and exclusive L3$ figure into workloads like multithreaded LLR.]

Now if we consider that Intel Haswell and later have roughly double the AVX execution width per core compared to AMD Zen, then conversely AMD Zen has a lot more L3$ relative to AVX execution units. On the other hand, Zen's L3$ is divided into the mentioned 8 MB partitions.

In addition, Intel has 256 kB L2$ per core (again, Skylake-X/SP excepted), compared to 512 kB L2$ per core in Zen. So there is roughly four times as much L2$ per AVX execution unit in Zen. But here I am not sure whether or not this helps with what the LLR program is doing in particular (fast Fourier transforms on rather large data).
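
The "cache per AVX execution width" comparison can be sketched as a quick calculation. The per-core cache sizes are the ones given in this post; the relative AVX width (Intel = 2, Zen = 1) is just the "roughly double" statement turned into a number, so treat the results as ballpark ratios, not measurements.

```python
from dataclasses import dataclass


@dataclass
class CoreSpec:
    name: str
    l2_kb: float      # L2 cache per core, in kB
    l3_mb: float      # L3 cache per core, in MB
    avx_width: float  # relative AVX execution width per core (Zen = 1)


# Figures from the post; avx_width encodes the "roughly double" assumption.
INTEL = CoreSpec("Intel Haswell and later (not Skylake-X/SP)", 256, 2.5, 2.0)
ZEN = CoreSpec("AMD Zen", 512, 2.0, 1.0)


def l2_per_width(core: CoreSpec) -> float:
    """kB of L2 per unit of relative AVX width."""
    return core.l2_kb / core.avx_width


def l3_per_width(core: CoreSpec) -> float:
    """MB of L3 per unit of relative AVX width."""
    return core.l3_mb / core.avx_width


if __name__ == "__main__":
    print("L2 ratio, Zen vs Intel:", l2_per_width(ZEN) / l2_per_width(INTEL))
    print("L3 ratio, Zen vs Intel:", l3_per_width(ZEN) / l3_per_width(INTEL))
```

With these inputs the L2 ratio is 4.0 and the L3 ratio is 1.6, which matches the "roughly four times" estimate for L2 and the "a lot more L3$" remark.
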

As far as RAM bandwidth... my i7-8700K uses 3600 CL15 RAM, so it has the best bandwidth and latency of any of my dual-channel rigs.

I have a Ryzen 7 1800X I set up last night with 2x4GB of DDR4-3733 Samsung E-die set to 3200 CL16. Despite only being 8GB, the expected task time is 26 hours. So it appears that Ryzen rigs can punch above their weight in this race.

The R7 1700 is using DDR4-2400, so it should theoretically be my worst rig for bandwidth.

Threadripper rigs have quad-channel DDR4-3000 CL16 or better.
Thanks, this is interesting.

Are you running llrSOB with 6 threads on the i7-8700K? I consider it possible that 5 threads, or maybe even 4 threads, would run even slightly faster than 6 threads. If so, the reason for that would be that the memory interface already has a hard time feeding 4 cores' worth of Skylake-type AVX units, i.e. can't feed more, and with more program threads there is more synchronization overhead than can be compensated by more arithmetic execution width. But whether 6 or 5 or 4 threads is the optimum for i7-8700K on dual-channel DDR4-3600c15 can only be found out by testing one and the same work unit with the different thread counts. --- Anyway, even if it may be possible to squeeze a little bit more out of this i7, it may not be all that much more.

So, maybe Ryzen and ThreadRipper do have the superior cache setup for llrSOB after all.

In my tests with Broadwell-EP, I found that llrSOB performance degrades only gradually (i.e. not steeply) when the number of concurrent llrSOB tasks begins to exceed what the processor cache can cover. This in turn makes me think that the segmentation of Zen's L3$ into 8 MB pieces may not be as big a problem for llrSOB performance as one might think.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,830
75
Day 7 stats:

Rank___Credits____Username
20_____2534563____xii5ku
22_____2416614____crashtech
26_____2109244____emoga
51_____1086929____Ken_g6
76_____707139_____Howdy2u2
84_____648152_____IEC
91_____598168_____iwajabitw
104____482446_____zzuupp
132____335881_____biodoc
204____105394_____Orange Kid
218____58821______SlangNRox

Rank__Credits____Team
2_____31855522___Czech National Team
3_____25406785___SETI.Germany
4_____21199429___Sicituradastra.
5_____11083357___TeAm AnandTech
6_____10496226___Rechenkraft.net
7_____10104020___Crunching@EVGA
8_____6728222____The Knights Who Say Ni!

Almost halfway! It looks like we'll be in a good spot if we can keep it up.
 

bill1024

Member
Jun 14, 2017
88
73
91
@Howdy2u2 I'm on your six! :p

Actual runtimes so far:
TR1950X: 13.78h, 20.74h
TR1920X: 19.70h
TR1900X: 32.20h
R7 1700: 26.30h
i7-8700K: 34.62h


Keep in mind there are two different sized tasks. Some are 2500k and others are 2800k FFT.
So when comparing, it has to be apples to apples. Also, tasks being sent out now can be bigger than tasks sent out a week ago, depending on whether they are recycled or not. Tasks sent out last week may be recycled from a couple of weeks ago...
You can see in the stderr file the size of the task.

I also see, for whatever reason, my x99 e5-1650v3 & i7-5930k are the fastest of what I have.
Faster than my i7-8086k or i5-8600k and x79 3930k systems. The CPU clocks and RAM are faster on the Coffee Lake systems, yet they are slower. Not a lot, but still....

My 2P lga 2011 e5-2670 system, with stock speed, eight cores per CPU and 1333 RAM, is just a little bit faster than my overclocked x79 3930k with 1866 or faster RAM. 2 extra cores make up the difference.

Also, I don't think these tasks are using AVX at all with Intel CPUs that have FMA3. Looking at my stderr txt it looks to me to be using FMA3, not AVX
Here is a part of it.

LLR command line: primegrid_cllr.exe -d -oDiskWriteTime=1 -oThreadsPerTest=12 llr.in
Using all-complex FMA3 FFT length 2560K, Pass1=640, Pass2=4K, 12 threads, a = 3
BOINC llr wrapper (version 8.00)
Using Jean Penne's llr (64 bit)
Primality test requested
LLR Program - Version 3.8.21, using Gwnum Library Version 28.14

EDIT: If the Intel CPU does not have FMA3, it uses AVX. Just something to keep in mind when comparing this CPU to that CPU.

I wonder how well an AMD G34 with a 63xx CPU would do these.
High core count and they have FMA3 instructions.
I know AMD AVX was not up to par on their older CPUs. I did hear that they fixed that with Zen2 though.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
My 2P lga 2011 e5-2670 system, with stock speed, eight cores per CPU and 1333 RAM, is just a little bit faster than my overclocked x79 3930k with 1866 or faster RAM. 2 extra cores make up the difference.
No, it's not the extra cores.
E5-2670: 20 MB L3 cache
i7-3930K: 12 MB L3 cache

For current llrSOB tasks with their >20 MB of hot data per task, this makes a huge difference. Processors which have considerably less than 20 MB ( x number of simultaneous tasks) of cache are waiting for RAM reads/writes a lot of the time.

PS,
on i5-8600K (9 MB cache and dual-channel RAM), GFN-21 was only able to scale to four of the six cores and their vector units. I suspect that llrSOB behaves very similarly.
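
To put the cache argument into numbers, here is a minimal sketch that estimates how much of the hot data fits into L3. It uses the 20 MB per task from above as a lower bound (the post says ">20 MB") and the two L3 sizes listed; the function name and the simple "fraction covered" model are illustrative assumptions, not how the hardware actually behaves.

```python
HOT_DATA_MB_PER_TASK = 20.0  # lower bound from the post (">20 MB" per task)


def cache_coverage(l3_mb: float, tasks: int = 1,
                   hot_mb: float = HOT_DATA_MB_PER_TASK) -> float:
    """Fraction of the combined hot data that fits in L3, capped at 1.0."""
    return min(1.0, l3_mb / (tasks * hot_mb))


if __name__ == "__main__":
    # L3 sizes as listed in the post.
    for name, l3_mb in (("E5-2670", 20.0), ("i7-3930K", 12.0)):
        print(f"{name}: {cache_coverage(l3_mb):.0%} of one task's hot data fits in L3")
```

In this simplified model the E5-2670 covers one task's hot data completely, while the i7-3930K covers only 60% of it and has to go out to RAM for the rest.
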
 

bill1024

Member
Jun 14, 2017
88
73
91
One of my 2P 2670 systems has been blue-screening with error 124, which can be anything, but is more likely hardware.
I have seen it is often memory, so I took out the 8x4GB DDR3-R 1333 and put in 4x4GB G.Skill Sniper 1866 that I know is good memory.
It is not ECC registered server memory, but the board can run it at 1866. We will see if that helps with the blue screen and overall time to run the task.
Yeah, cache I am sure helps a lot, and so does staying online. I have two systems giving me a headache.
I swapped RAM in the one, and lowered the OC and upped the CPU voltage on the other. If they stay online that would be nice!!

As far as dual-channel RAM goes, my x99 5930k has 2x8GB and is running dual channel; the other 5930k is 4x4GB quad channel, but the time per task seems to be equal. The 1650V3 is 2x8GB for dual channel.
Time on that is equal to the others as well.
Also, my x79 systems are a mix of dual and quad channel, all 3930k or E5-1650; not much difference there either.
It seems RAM speed and tight timings help more than dual/quad-channel memory. Just my observation.
 
  • Like
Reactions: TennesseeTony

biodoc

Diamond Member
Dec 29, 2005
6,261
2,238
136
Keep in mind the run times on the server are not accurate but the cpu times match the client run times perfectly.

To get accurate run times, go to the BOINC directory and examine the job_log_www.primegrid.com.txt file. In each line, the CPU time is the value after "ct" and the run time is the value after "et", both in seconds. On the server, the run time for the task in the first line is listed as 67,039 seconds. That task is pretty close (server/client) in run time, but others are quite different, like the last one in the list below:

server 75,617, client 68,282

1547306741 ue 944952.350872 ct 637635.800000 fe 3315959167100600 nm llrSOB_284837634_3 et 66891.498101 es 0
1547307409 ue 944847.075373 ct 643339.200000 fe 3315589741853700 nm llrSOB_284837617_0 et 67573.751216 es 0
1547368033 ue 948016.262989 ct 574456.100000 fe 3326710828242600 nm llrSOB_284838205_1 et 61291.376617 es 0
1547369467 ue 947864.734969 ct 581279.700000 fe 3326179096956700 nm llrSOB_284838176_0 et 62056.735665 es 0
1547431706 ue 941323.854209 ct 581163.900000 fe 3303226306270300 nm llrSOB_284836989_2 et 62237.657867 es 0
1547436317 ue 950169.645106 ct 650330.800000 fe 3334267322667000 nm llrSOB_284838565_0 et 68282.075635 es 0
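
If you'd rather not read those lines by eye, here is a small sketch of a parser for them. It assumes each line is a timestamp followed by alternating two-letter keys and values, as in the lines above, with "ct" being CPU time and "et" being elapsed (run) time in seconds. The function name is made up for illustration.

```python
def parse_job_log_line(line: str) -> dict:
    """Parse one BOINC job log line into a dict keyed by the two-letter field names."""
    parts = line.split()
    record = {"timestamp": int(parts[0])}
    # The remaining tokens alternate between a key and its value.
    for key, value in zip(parts[1::2], parts[2::2]):
        # "nm" is the task name; every other field is numeric.
        record[key] = value if key == "nm" else float(value)
    return record


if __name__ == "__main__":
    # First line from the log excerpt above.
    sample = ("1547306741 ue 944952.350872 ct 637635.800000 "
              "fe 3315959167100600 nm llrSOB_284837634_3 et 66891.498101 es 0")
    rec = parse_job_log_line(sample)
    print(rec["nm"])
    print("run time (h):", rec["et"] / 3600)
    print("CPU time / run time:", rec["ct"] / rec["et"])
```

For the first task this gives a run time of about 18.6 hours, and the CPU time divided by the run time gives a rough idea of how many threads were kept busy.
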
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,830
75
Day 8 stats:

Rank___Credits____Username
17_____3248325____xii5ku
21_____2926936____crashtech
29_____2219572____emoga
52_____1257856____Ken_g6
62_____1031736____Howdy2u2
66_____972223_____IEC
88_____708453_____iwajabitw
102____598889_____zzuupp
120____454118_____biodoc
203____158110_____Orange Kid
225____106227___10esseeTony
239____58821______SlangNRox

Rank__Credits____Team
2_____37760346___Czech National Team
3_____30217840___SETI.Germany
4_____24956061___Sicituradastra.
5_____13741271___TeAm AnandTech
6_____12071226___Rechenkraft.net
7_____11536601___Crunching@EVGA
8_____7816471____The Knights Who Say Ni!

We're starting to get a good lead on our nearest competition now. :)

With all the stats format changes, you wouldn't believe how hard it was to get @TennesseeTony into the stats properly! :p
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,830
75
Day 9 stats:

Rank___Credits____Username
16_____4017516____xii5ku
22_____3307539____crashtech
26_____2982176____emoga
53_____1417408____Ken_g6
56_____1367813____Howdy2u2
57_____1367758____IEC
90_____818822_____iwajabitw
114____598889_____zzuupp
116____566456_____biodoc
194____216451_____Orange Kid
199____212579___10esseeTony
231____111683_____SlangNRox

Rank__Credits____Team
2_____44290668___Czech National Team
3_____34728672___SETI.Germany
4_____28506463___Sicituradastra.
5_____16985096___TeAm AnandTech
6_____14503557___Rechenkraft.net
7_____13733734___Crunching@EVGA
8_____9149427____The Knights Who Say Ni!

"I - Eee! - see" you trying to pass me, but not yet. :)