PrimeGrid LLR Races Thread 2016


TennesseeTony

Elite Member
Aug 2, 2003
4,209
3,634
136
www.google.com
Holy crap Michael nearly doubled his score again! Wow! :D

Very interesting race indeed!

Well done all, and thanks for the stats tracking, Ken and Orange Kid.
 

Kiska

Golden Member
Apr 4, 2012
1,012
290
136
Great race! Thanks for the stats, Ken and Orange Kid.
Now that the race is over I am going to learn LLR and try to write an app for the GPU.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,829
75
Now that the race is over I am going to learn LLR and try to write an app for the GPU.
:eek: I wouldn't write a new LLR app. You might look into making the existing LLR CUDA app work better. Or porting it to OpenCL. There are restrictions on the numbers it can work on that make it unsuitable for any project except maybe 321.
 

Kiska

Golden Member
Apr 4, 2012
1,012
290
136
:eek: I wouldn't write a new LLR app. You might look into making the existing LLR CUDA app work better. Or porting it to OpenCL. There are restrictions on the numbers it can work on that make it unsuitable for any project except maybe 321.

I am treating this as a new project for me. First year, Bachelor of Computer Science. It has to be original, no porting allowed, and I get the remainder of the year to write it. But to write it I need to know the LLR proof. I'll have a look at the CPU app though and see what it's doing.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,829
75
OK, then, I wouldn't do LLR. It's not an easy problem to solve.

As you may know, I wrote a PrimeGrid GPU sieve application. But it only works on large ranges of K's. I've wanted to write an application that would work on individual K's, like sr2sieve, but for a GPU.

Now, here's how a fixed-K sieve works. sr2sieve uses baby-step giant-step to solve the discrete logarithm, which is fast, but takes a lot of memory, which is slow to access on a GPU. I was thinking of using Pollard Rho instead, as it uses little memory. The problem with Pollard Rho is its runtime is unpredictable, but I was thinking it could be divided into chunks, probably of close to the average runtime, and once a P is solved on one process it could be swapped for a new P.
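For the curious, the baby-step giant-step part looks roughly like this in Python. It's a textbook sketch of solving b^x ≡ target (mod p), not sr2sieve's actual code, and the names are all mine:

```python
from math import isqrt

def bsgs(b, target, p):
    """Return x with pow(b, x, p) == target % p, or None if there is no solution."""
    m = isqrt(p - 1) + 1
    # Baby steps: table of b^j mod p for j = 0..m-1.
    # This table is the memory hog that hurts on a GPU.
    table = {}
    e = 1
    for j in range(m):
        table.setdefault(e, j)
        e = e * b % p
    # Giant steps: walk gamma = target * b^(-i*m) and look for a table hit.
    step = pow(b, -m, p)        # modular inverse (Python 3.8+)
    gamma = target % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * step % p
    return None
```

The appeal of Pollard rho here is that it replaces that table with a pseudo-random walk needing only a handful of values per prime, at the cost of a running time that is only O(sqrt(p)) on average rather than guaranteed.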

The other problem with sieves is you need to do modular multiplication while almost never computing a modulus, if you want your program to run fast. I used Montgomery multiplication in my sieve, which I mostly understood. Another guy used Barrett reduction in his factoring program, which is not a sieve. I never quite understood how to use it, but it seemed to require 128-bit numbers.
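To give a flavour of it, here's a toy Montgomery multiplication in Python, using big ints purely for illustration; a real sieve does this on 64-bit machine words, and none of these names come from my sieve:

```python
BITS = 64
R = 1 << BITS                          # R = 2^64, larger than any sieve prime p

def mont_setup(n):
    """Precompute n' with n * n' == -1 (mod R); n must be odd."""
    return (-pow(n, -1, R)) % R

def redc(t, n, n_prime):
    """Montgomery reduction: returns t * R^-1 mod n without dividing by n."""
    m = ((t & (R - 1)) * n_prime) & (R - 1)
    u = (t + m * n) >> BITS            # exact division by R, since R divides t + m*n
    return u - n if u >= n else u

def to_mont(x, n):
    return (x << BITS) % n             # one real mod to enter Montgomery form

def mont_mul(am, bm, n, n_prime):
    return redc(am * bm, n, n_prime)   # inputs and output stay in Montgomery form
```

You pay one real modulus to get each value into Montgomery form, and after that every multiply in the hot loop is just multiplies, shifts and masks. Barrett reduction gets a similar effect by precomputing an approximation of 1/p instead, which is where the 128-bit intermediates come in.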

Overall, anything in this area seems like it should be a multi-year graduate project, not an undergraduate project. I believe llrCUDA was written by a math professor at a Tokyo university.
 

waffleironhead

Diamond Member
Aug 10, 2005
6,919
429
136
Dangit, waffleironhead, how do you keep beating me? I'm going to have to turn on another computer or two for the end of the race. :colbert:

Guess I should have stopped in and checked on things a bit during the race instead of working on my fences. I could have brought another system online and matched you. Oh well, the horses now have 2 more acres of pasture. :)
This is the closest we have been in output since I started participating in these races. :thumbsup:

Things were pretty toasty here during the race, so I wouldn't be surprised to lose a WU or two. This old Athlon X2 errors hard when CPU temps pass 60C, and things were hovering close to the limit the whole race.
 

Kiska

Golden Member
Apr 4, 2012
1,012
290
136
OK, then, I wouldn't do LLR. It's not an easy problem to solve.

As you may know, I wrote a PrimeGrid GPU sieve application. But it only works on large ranges of K's. I've wanted to write an application that would work on individual K's, like sr2sieve, but for a GPU.

Now, here's how a fixed-K sieve works. sr2sieve uses baby-step giant-step to solve the discrete logarithm, which is fast, but takes a lot of memory, which is slow to access on a GPU. I was thinking of using Pollard Rho instead, as it uses little memory. The problem with Pollard Rho is its runtime is unpredictable, but I was thinking it could be divided into chunks, probably of close to the average runtime, and once a P is solved on one process it could be swapped for a new P.

The other problem with sieves is you need to do modular multiplication while almost never computing a modulus, if you want your program to run fast. I used Montgomery multiplication in my sieve, which I mostly understood. Another guy used Barrett reduction in his factoring program, which is not a sieve. I never quite understood how to use it, but it seemed to require 128-bit numbers.

Overall, anything in this area seems like it should be a multi-year graduate project, not an undergraduate project. I believe llrCUDA was written by a math professor at a Tokyo university.

:eek: This may seem like a surprise, but my maths professor, alternating with the programming prof, has started teaching us Lucas-Lehmer this term. He doesn't expect us to do LLR, but if we do LLR then he doesn't care how long it runs for, just that the code is readable. Then, if we did decide to do LLR, next year we would begin with optimisations. So all that is required is functional and operational code; there's no need for it to be both functional and run the quickest, that's for next year.
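For what it's worth, the plain Lucas-Lehmer test (the k=1 special case that LLR generalises) fits in a few lines. Here's a minimal, unoptimised Python sketch of the idea (not LLR's actual code), very much in the "readable, not fast" spirit:

```python
def lucas_lehmer(p):
    """True if M_p = 2**p - 1 is prime, for an odd prime exponent p."""
    m = (1 << p) - 1
    s = 4                        # LLR for k*2^n - 1 uses the same s -> s^2 - 2
    for _ in range(p - 2):       # recurrence; only the starting value depends on k
        s = (s * s - 2) % m
    return s == 0

# Sanity check against the known small Mersenne prime exponents:
print([p for p in (3, 5, 7, 11, 13, 17, 19, 23, 31) if lucas_lehmer(p)])
# -> [3, 5, 7, 13, 17, 19, 31]
```

All the hard work in LLR proper is doing that repeated squaring with FFT-based big-number multiplication, which is exactly the part that stops being a first-year project.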
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,829
75
Oh! I wasn't paying attention. I thought the race started August 5!

Also, it's not an LLR race.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,829
75
Time for a bump - the next race is an LLR race.
 

TennesseeTony

Elite Member
Aug 2, 2003
4,209
3,634
136
www.google.com
Starts this Friday: the ESP-Sieve LLR Summer Paralympics Challenge, 2-7 September, 18:00:00, 5 days.

And as a note to myself: Tony, that's 2pm EST. o_O

EDIT: Update your signature Ken. :)
 

Kiska

Golden Member
Apr 4, 2012
1,012
290
136
I don't think I'll be participating at all in this race. 195k seconds for one unit. Sigh.
 

TennesseeTony

Elite Member
Aug 2, 2003
4,209
3,634
136
www.google.com
Oh come on Kiska, you'll finish one round of tasks with 19 hours to spare! :D

My one test task finished in just under 59k seconds, hyper-threaded, on an i7-5820K. I won't be able to use that system though, due to AVX and overclocking. My motherboard is capable of feeding 300W to the CPU socket, and gladly does so on multiple AVX tasks!! :eek:
 

Kiska

Golden Member
Apr 4, 2012
1,012
290
136
Oh come on Kiska, you'll finish one round of tasks with 19 hours to spare! :D

My one test task finished in just under 59k seconds, hyper-threaded, on an i7-5820K. I won't be able to use that system though, due to AVX and overclocking. My motherboard is capable of feeding 300W to the CPU socket, and gladly does so on multiple AVX tasks!! :eek:

How about you get yelled at by your data center provider for heating up your area to unbearable temperatures? :p But all in all, 4 tasks take ~195k seconds; with 7 it's going to be longer, and I am doing extensive manual testing for SETI, so that will eat into the times for PrimeGrid.
 

TennesseeTony

Elite Member
Aug 2, 2003
4,209
3,634
136
www.google.com
Only 105 minutes until the start.

I'm getting a lot of LHC tasks right now; they are a priority for me (personal goals), and they are sporadic, so... I'm debating participating in the PG race. I'm sure I will participate some, but... well, debating.

Edit: I guess for now I've decided to allow one round of tasks for Bee and WUSS.
 
Last edited:

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,829
75
Still having trouble with the forum's SSL. I'm not sure I can post stats anywhere near on time. But it looks like I'm doing tasks on my i3 in less than 12 hours each. So I'm expecting decent output. :)
 

TennesseeTony

Elite Member
Aug 2, 2003
4,209
3,634
136
www.google.com
Holy Crapoly. Not even 12 hours into the race yet and not only have many people scored already (Ken being one of them), but one of them already has over 100,000 points (~23 completed tasks)!
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,829
75
Day 1 stats, only slightly late:

Rank___Credits____Username
65_____27961______Ken_g6
113____17476______waffleironhead

Rank__Credits____Team
15____64702______Canada
16____53460______Rechenkraft.net
17____45771______PrimeSearchTeam
18____45437______TeAm AnandTech
19____44276______US Navy
20____37562______Alien Prime Cult
21____36379______Special: Off-Topic

Holy Crapoly. Not even 12 hours into the race yet and not only have many people scored already (Ken being one of them), but one of them already has over 100,000 points (~23 completed tasks)!
It's all about the memory bandwidth. :)
 

VirtualLarry

No Lifer
Aug 25, 2001
56,326
10,034
126
I've got two tasks "in the oven", 11hr completed, 17hr to go on them. On my Skylake G4400 @ 4.455GHz. (My Skylake i3-6100 is in an ITX box that isn't plugged in right now. Not sure if the ITX box can take the heat, even if I did plug it in. Maybe I should.)
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,242
3,829
75
Don't overclock the i3 if you use it. AVX works better when you don't. Edit: Maybe use just one core if you're worried about heat.
 
Last edited: