PrimeGrid Challenges 2021


Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,555
14,511
136
Remember,
in order to apply the new thread count setting to an existing task:
  • uncheck "leave non-GPU tasks in memory while suspended",
  • let the client read the new threads-per-task setting, i.e.
    if using app_config.xml, trigger "Read config files" in boincmgr;
    if using the web setting, trigger a project update,
  • suspend the task that is to be changed,
  • wait ten seconds or so
    (or check with "ps auxf" whether the respective child process of boinc has gone away),
  • resume the task,
  • watch the task in "top" or another process monitor to see whether it reaches roughly the expected CPU utilization.
Well, unfortunately, all this is a few minutes late, but thanks. I now have ONE task with all 16 CPUs (no SMT), 30 minutes elapsed, and an 11:15 ETA. If I have 15 hours left, I will make it. If I had left everything alone, I would have had 2 more tasks.

Oh well....
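
For reference, the app_config.xml route mentioned in the quoted steps is just a small XML file in the project's directory under the BOINC data directory. A minimal sketch for a 16-thread LLR task could look like this (the app name "llr321" and the "mt" plan class are placeholders; the real names should be taken from client_state.xml or the project's applications page):

<app_config>
  <app>
    <name>llr321</name>
    <!-- optional: run only one of these big tasks at a time -->
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>llr321</app_name>
    <plan_class>mt</plan_class>
    <!-- reserve 16 logical CPUs and pass the thread count to LLR -->
    <avg_ncpus>16</avg_ncpus>
    <cmdline>-t 16</cmdline>
  </app_version>
</app_config>

After saving it, "Read config files" in boincmgr (or boinccmd --read_cc_config) makes the client pick it up; the quoted suspend/resume steps then apply it to a task that is already running.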
 

crashtech

Lifer
Jan 4, 2013
10,524
2,111
146
Remember,
in order to apply the new thread count setting to an existing task:
  • uncheck "leave non-GPU tasks in memory while suspended",
  • let the client read the new threads-per-task setting, i.e.
    if using app_config.xml, trigger "Read config files" in boincmgr;
    if using the web setting, trigger a project update,
  • suspend the task that is to be changed,
  • wait ten seconds or so
    (or check with "ps auxf" whether the respective child process of boinc has gone away),
  • resume the task,
  • watch the task in "top" or another process monitor to see whether it reaches roughly the expected CPU utilization.
I haven't been doing that. I've been suspending the task, then restarting the client. I'm wondering if I've been losing a checkpoint that way. It seemed to work, so I never thought twice about it.
 

StefanR5R

Elite Member
Dec 10, 2016
5,510
7,817
136
I've been suspending the task, then restarting the client.
This has the very same result.
(Except that it affects all tasks in the client, not just one task or a selection of tasks.)
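
A rough sketch of the per-task variant from a terminal with boinccmd (the task name below is only a placeholder; the real names come from --get_tasks):

# re-read cc_config.xml and any app_config.xml files,
# same as "Read config files" in boincmgr
boinccmd --read_cc_config

# list tasks to find the name of the one to change
boinccmd --get_tasks | grep -E "name:|active_task_state"

# suspend just that one task, give its child process time to exit, then resume it
boinccmd --task http://www.primegrid.com/ example_llr_task_name_0 suspend
sleep 10
boinccmd --task http://www.primegrid.com/ example_llr_task_name_0 resume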

I'm wondering if I've been losing a checkpoint that way.
AFAICT, LLR2 saves a very recent state when it is suspended and restores it when resumed.

Edit: I have the boinc client set to "request tasks to checkpoint at most every 600 seconds". From what I have seen, this coincides with the checkpoint interval of the various LLR-based subprojects at PrimeGrid.
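
That client setting maps to the <disk_interval> preference; set locally it sits in global_prefs_override.xml, roughly like this (just a sketch, assuming the value is overridden locally rather than on the website):

<global_preferences>
   <!-- ask tasks to checkpoint at most every 600 seconds -->
   <disk_interval>600</disk_interval>
</global_preferences>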

The problem with just throwing more cores at one task is that there will or may be inter-CPU or inter-CCX communication which carries a performance hit. I may try it anyway just to see what happens.
Before the challenge I tested my own computers only with configurations in which tasks do not cross sockets (assuming Linux's scheduler is in a steady state), and in which tasks are spread over just 1 or 2 CCXs. It should be worthwhile to test with more threads sometime later, to satisfy curiosity.
 
  • Like
Reactions: crashtech

crashtech

Lifer
Jan 4, 2013
10,524
2,111
146
Well, my dual E5-2690v4 task did not make it under the wire. Going from 14 to 28 cores took the runtime from ~18 to ~13 hours, a terrible use of resources. It had 34 minutes left to go when the contest ended...
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,555
14,511
136
Well, my dual E5-2690v4 task did not make it under the wire. Going from 14 to 28 cores took the runtime from ~18 to ~13 hours, a terrible use of resources. It had 34 minutes left to go when the contest ended...
My 16 cores did it in 11.5 hours; not sure if it made it or not.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
More-or-less final stats:

Rank___Credits____Username
7______17696795___xii5ku
19_____6417851____crashtech
22_____6074168____Icecold
73_____1787331____Orange Kid
85_____1524313____markfw
86_____1514039____Ken_g6
96_____1339156____emoga
144____797248_____Endgame124
150____706803_____waffleironhead
154____694382_____biodoc
187____476570_____SlangNRox
232____310986_____Justinus

Rank__Credits____Team
4_____62360806___Antarctic Crunchers
5_____45944300___AMD Users
6_____45234504___Aggie The Pew
7_____39186472___TeAm AnandTech
8_____18446474___BOINC@MIXI
9_____15707051___Storm
10____15367298___Sicituradastra.

[Image: missed-it.jpg]


I missed @Markfw by "that much", and we missed AMD Users and Aggie The Pew by "that much".
 

StefanR5R

Elite Member
Dec 10, 2016
5,510
7,817
136
Thanks for the stats, @Ken g6!

Good job by the team known as T44T for one half of the challenge, and as 7AA7 for the other half. ;-)

I usually prefer when things develop the other way around, but in the bigger picture I feel we made good use of these past ten days. :-)

Also, I liked how the older ones among my home computers made themselves extraordinarily useful in this race.
 

crashtech

Lifer
Jan 4, 2013
10,524
2,111
146
Yeah, the ol' Xeons are pretty happy with these giant tasks! I suppose I'll keep them around, even though indulging my desire to run them all the time is getting pretty expensive.
 
  • Like
Reactions: Icecold

crashtech

Lifer
Jan 4, 2013
10,524
2,111
146
Looks like even my lowly dual Xeon E5440 rig should be able to finish a few WUs in three days!
 

StefanR5R

Elite Member
Dec 10, 2016
5,510
7,817
136
My computers¹ are weird:
They kept requesting more work even though I had initially set them to a work buffer of only 0 or 0.01 days. I therefore went to the PrimeGrid web preferences and limited the number of tasks in progress by means of the "Max # of simultaneous PrimeGrid tasks" setting.

¹) Two Mint 20 computers, client version 7.16.6, PrimeGrid profile set to accept WW Nvidia GPU work only. The computers never ran WW before; their initial estimates of the task durations were ~5 h on one and ~2 h on the other, so they should have stopped asking for more work once they received their first tasks.
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Thanks for the tip :), but I'm not really into maths projects. I might fire up Einstein, it's been a while (and F@H has stalled again).
 

StefanR5R

Elite Member
Dec 10, 2016
5,510
7,817
136
Isn't credit for WW a little bit inflated, compared with PG's other prime-finding subprojects? (Ah, right, there was this optimization which suddenly made the application so much more productive.)

It's fine with me, though; that way my first three days at WW won't leave a mess in the colors of my PG badges… for now.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
Day 1 stats:

Rank___Credits____Username
27_____10080000___emoga
32_____8676000____biodoc
47_____6228000____crashtech
56_____4488000____Orange Kid
57_____4224000____xii5ku
80_____2772000____The Great Cornholio
102____1848000____Lane42
125____1356000____SlangNRox
222____420000_____Ken_g6
260____216000_____waffleironhead
364____12000______geecee

Rank__Credits____Team
4_____82104000___Aggie The Pew
5_____61416000___Metal Archives
6_____41340000___Microsoft
7_____40320000___TeAm AnandTech
8_____38496000___Storm
9_____34944000___Save The World Real Estates
10____34212000___AMD Users

There's something about my video cards that doesn't work well on this. My 1060 is one quarter the speed of @SlangNRox's 1660. o_O But nearly twice the speed of his 1050.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
Day 2 stats:

Rank___Credits____Username
46_____12936000___crashtech
49_____12420000___biodoc
50_____11988000___xii5ku
51_____11940000___emoga
57_____9120000____Orange Kid
83_____5664000____The Great Cornholio
103____3888000____Lane42
129____2736000____SlangNRox
221____936000_____Ken_g6
284____372000_____waffleironhead
389____48000______geecee

Rank__Credits____Team
5_____128880000___Metal Archives
6_____85716000___Storm
7_____84036000___Microsoft
8_____72048000___TeAm AnandTech
9_____71100000___Save The World Real Estates
10____56988000___AMD Users
11____56856000___UK BOINC Team

Can we save ourselves from Save The World Real Estates?
 

StefanR5R

Elite Member
Dec 10, 2016
5,510
7,817
136
My computers are weird:
They kept requesting more work even though I had initially set them to a work buffer of only 0 or 0.01 days. I therefore went to the PrimeGrid web preferences and limited the number of tasks in progress by means of the "Max # of simultaneous PrimeGrid tasks" setting.
So I had been running this with a short queue, but shouldn't have. My internet link is just too unreliable. >:-(

All was good until yesterday. When I was reading news last night, I noticed the connection went down and I corrected it right away by resetting the modem. However, immediately after I went to sleep, the connection went down again. (Strangely enough, it came back by itself at 5AM but went down again after half an hour.)

It's not only the production loss — I also had to have windows open, although we have night frost now. Luckily my emergency heating, consisting of a bunch of Climateprediction.net tasks, prevented possible damage.

I have now set a 0.20 + 0.01 days work buffer locally and switched "Max # of simultaneous PrimeGrid tasks" back to Unlimited. The effect: the computers are again requesting far more work than they are supposed to. I am puzzled why they do it.
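
For reference, that local buffer corresponds to these two entries in global_prefs_override.xml (just a sketch of the values mentioned above):

<global_preferences>
   <!-- keep at least 0.20 days of work, plus up to 0.01 days of additional buffer -->
   <work_buf_min_days>0.20</work_buf_min_days>
   <work_buf_additional_days>0.01</work_buf_additional_days>
</global_preferences>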

Hmm, does the client (here: version 7.16.6) perhaps base its work-queue estimation on CPU time (42 seconds) instead of on runtime (21 minutes)?