New Linpack Stress Test Released

Nov 3, 2007
#26
v0.9.2
- Added several optimizations for AMD CPUs.
- Improved multithreading efficiency for the benchmark.
- Fixed insufficient memory error on 32-bit systems.
- Updated CPUID HWMonitor to version 1.36.
- Some minor changes.
 

crashtech

Diamond Member
Jan 4, 2013
#28
I may do some testing when I have time. Thanks for providing this interesting new tool. At first blush the flops (and CPU usage) seem low on my Ryzen 5 1600, but I'll have to use it more and pit it against some Intel stuff to get comparative results.
 

.vodka

Golden Member
Dec 5, 2014
#29
Can someone test v0.9.2 on Zen and report back?
The unofficial Korean LinX version I linked a few posts back is still faster on Zen. Your patched Linpack binaries in v0.9.2 aren't using AVX/AVX2 on Zen for some reason.
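One possible explanation, offered purely as an assumption rather than anything confirmed in this thread: Intel's MKL-based Linpack binaries choose their code path from the CPU vendor string and historically fell back to SSE on non-Intel processors, and on MKL versions before 2020.1 the undocumented MKL_DEBUG_CPU_TYPE=5 environment variable forced the AVX2 path on AMD CPUs. A minimal launcher sketch (the binary name is a placeholder):

Code:
import os
import subprocess

# Assumption: the bundled Linpack binary links Intel MKL, whose runtime
# dispatcher falls back to SSE on non-GenuineIntel CPUs. On MKL releases
# prior to 2020.1, MKL_DEBUG_CPU_TYPE=5 forced the AVX2 code path on AMD
# processors such as Zen.
env = dict(os.environ, MKL_DEBUG_CPU_TYPE="5")
subprocess.run(["./linpack_binary"], env=env)  # placeholder binary name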
 
Nov 3, 2007
#30
With version 0.9.3, AMD Ryzen users can now bake their CPUs like never before. Thanks to .vodka for beta testing.
 
Feb 19, 2017
#31
With version 0.9.3, AMD Ryzen users can now bake their CPUs like never before. Thanks to .vodka for beta testing.
I'm going to bake my TR and find out how stable my stable setting is :)
 
Nov 3, 2007
#32
A new version is now available.

Both AMD and Intel users should get better scores when benchmarking, since Linpack now runs only on real (physical) cores.

Threads will now spawn in a more balanced manner when stress testing. This is most useful for non-Intel CPUs and MP/DP systems.
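For anyone curious how "real cores" can be counted programmatically, here is a rough sketch of the general idea (not necessarily what Linpack Xtreme does internally); it uses the third-party psutil package, and the binary name is a placeholder:

Code:
import os
import subprocess

import psutil  # third-party: pip install psutil

# Physical cores only (no SMT/Hyper-Threading siblings) vs. all logical CPUs.
physical = psutil.cpu_count(logical=False)
logical = psutil.cpu_count(logical=True)
print(f"{physical} physical cores, {logical} logical processors")

# MKL-based Linpack binaries are OpenMP-threaded, so capping them at the
# physical core count is typically done through the environment.
env = dict(os.environ, OMP_NUM_THREADS=str(physical))
subprocess.run(["./linpack_binary"], env=env)  # placeholder binary name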
 
Apr 16, 2014
#33
Is there a way of specifying the number of threads prior to the run? On my quad-Opteron (48 cores) this thing only spawns 12 threads during the benchmark.
 
Nov 3, 2007
#34
Is there a way of specifying the number of threads prior to the run? On my quad-Opteron (48 cores) this thing only spawns 12 threads during the benchmark.
Thanks for the report. Fixing it right now.

EDIT:
Please try v0.9.6 from this link.
 
Nov 3, 2007
#36
This is the "problem" with MP/DP rigs running Windows. A lot of apps and games use only one CPU because Windows categorizes each CPU separately.
 
Apr 16, 2014
#37
Interesting. I rebooted and, in my SuperMicro BIOS, set node interleaving to On (i.e. non-NUMA mode, which evens out memory latency across all sockets): 187 GFLOPS.

With the BIOS set to NUMA mode (i.e. node interleaving Off), the result was 148.xx GFLOPS.

I normally have it set to NUMA mode for exactly the reasons you mention. I like to have most games sit on socket 1 and use the memory attached to that socket; better latency = higher fps.

Linpack seems to like more memory bandwidth, however.

This is with 4 Opteron 61xx series CPUs (12 K10 cores each) running at 3.0 GHz. Unfortunately I can't try running at 3.3 GHz since these 92mm Noctuas can't keep up with the heat :(

This is the "problem" with MP/DP rigs running Windows. A lot of apps and games end up using only one CPU because Windows treats each physical CPU (socket) as a separate group and schedules them separately.
 
Oct 9, 1999
#38
I did the 100-loop test with 9.6 GB allocated, on a 9900K at 4.8 GHz on all cores. Max temp hit the upper 80s on some of the cores. It passed.
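For reference, and as a general property of Linpack rather than anything specific to this release: the test solves a dense N x N double-precision system, so the problem size implied by a given memory allocation can be estimated directly.

Code:
import math

def linpack_problem_size(mem_gb: float) -> int:
    """Estimate the Linpack problem size N that fits in mem_gb gigabytes.

    A dense N x N matrix of 8-byte doubles needs roughly N*N*8 bytes, so
    N is about sqrt(bytes / 8). Real runs reserve a little extra memory
    for workspace, so the usable N is slightly smaller.
    """
    return int(math.sqrt(mem_gb * 1e9 / 8))

print(linpack_problem_size(9.6))  # roughly 34,600 for a 9.6 GB allocation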
 
Oct 9, 1999
#39
Ran the benchmark to see what I'd get. Only 228 GFlops. Running the stress test with 8 threads and 9.6GB allocated yields this.
 
Nov 3, 2007
#40
The benchmark results are comparable with other people's. A Ryzen 2700X gets around 180 GFLOPS.
 

ehume

Golden Member
Nov 6, 2009
#41
The benchmark results are comparable with other people's. A Ryzen 2700X gets around 180 GFLOPS.
I would be interested to see Linpack Xtreme results from someone with an overclocked i7-8700K. I routinely ran LinX 0.6.5 (Linpack with an AVX2 front end) at 5 GHz, but I didn't keep track of the GFLOPS (I was testing heatsinks). I wonder . . .
 
Apr 16, 2014
#42
It's a bit strange. K10 is supposedly capable of 4 DP FLOPs per cycle ("4 DP FLOPs/cycle: 2-wide SSE2 addition + 2-wide SSE2 multiplication"), yet I am getting nowhere near that.

I'm running 48 K10 cores at 3.0 GHz... my performance should be approximately 2.5 to 3x what it is (i.e. at least in the high 400s).

https://www.degruyter.com/downloadpdf/j/comp.2013.3.issue-1/s13537-013-0101-5/s13537-013-0101-5.pdf

^That is exactly what I would expect: 86-87% efficiency, 8 DP FLOPs/clock, 8 cores, 3.8 GHz. Multiply it all together and it comes to ~210 GFLOPS.
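A quick sketch of the peak-FLOPS arithmetic used above; the core counts, clocks, and FLOPs-per-cycle figures are the ones quoted in these posts, and "efficiency" is just whatever fraction of the theoretical peak the benchmark actually sustains.

Code:
def peak_gflops(cores: int, ghz: float, flops_per_cycle: int, efficiency: float = 1.0) -> float:
    """Theoretical double-precision GFLOPS = cores * clock * FLOPs/cycle * efficiency."""
    return cores * ghz * flops_per_cycle * efficiency

# 48 K10 cores at 3.0 GHz with 4 DP FLOPs/cycle (SSE2 add + SSE2 multiply):
print(peak_gflops(48, 3.0, 4))        # 576 GFLOPS peak; 187 measured is ~32% efficiency
# 8 cores at 3.8 GHz, 8 DP FLOPs/cycle, at the 86-87% efficiency quoted above:
print(peak_gflops(8, 3.8, 8, 0.865))  # ~210 GFLOPS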
 

tamz_msc

Platinum Member
Jan 5, 2017
#43
It's a bit strange. K10 is supposedly capable of 4 DP FLOPs per cycle ("4 DP FLOPs/cycle: 2-wide SSE2 addition + 2-wide SSE2 multiplication"), yet I am getting nowhere near that.

I'm running 48 K10 cores at 3.0 GHz... my performance should be approximately 2.5 to 3x what it is (i.e. at least in the high 400s).

https://www.degruyter.com/downloadpdf/j/comp.2013.3.issue-1/s13537-013-0101-5/s13537-013-0101-5.pdf
Maybe it still needs optimization for older architectures, or it's still not optimized for OpenMP?
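If the OpenMP angle is worth chasing, one thing that can be tried from outside the program (assuming the bundled binary uses Intel's OpenMP runtime, which is an assumption rather than something stated in this thread) is controlling thread count and placement through the standard environment variables, for example on the quad-Opteron:

Code:
import os
import subprocess

# Assumption: the Linpack binary is threaded with Intel's OpenMP runtime, so
# OMP_NUM_THREADS sets the thread count and KMP_AFFINITY controls placement
# ("scatter" spreads threads round-robin across sockets, "compact" fills one
# socket first).
env = dict(os.environ,
           OMP_NUM_THREADS="48",  # all 48 K10 cores on the quad-Opteron
           KMP_AFFINITY="granularity=core,scatter")
subprocess.run(["./linpack_binary"], env=env)  # placeholder binary name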
 
Nov 3, 2007
#44
The final version is now available. It addresses some of the things all of you mentioned here.
 
Apr 16, 2014
#45
The new version seems to stress things a bit more and produces a marginally better result.

48 K10 cores at 3.0 GHz.

187 GFLOPS in NUMA mode (node interleaving off), up from 148.

250 GFLOPS in non-NUMA mode (node interleaving on), up from 187.
 
Jul 25, 2014
#46
This consistently reports a hardware failure within 10-15 minutes on my 8700K when running with 12 threads, seemingly regardless of processor settings. An OC which will happily run the latest Prime95 all day long? Failed. Stock? Failed. An anemic 3GHz at a toasty 1.45 volts? Failed. At this point I figure it must be some sort of bug with the program's error detection.

This behavior disappears when running with the same number of threads as physical cores (6), at which point it seems to be stable at the same settings as the latest Prime95.
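For context on what a "hardware failure" report means here, as a general property of Linpack-style tests rather than anything stated by the author: each pass solves a linear system and then checks a normalized residual against a threshold, and a run whose residual exceeds the threshold (or differs between passes) is flagged as a failure. A rough sketch of that check, with an illustrative threshold rather than the one the real binary uses:

Code:
import numpy as np

def linpack_pass(n: int = 2000, threshold: float = 1e-9) -> bool:
    """Solve a random dense system and validate the normalized residual.

    A healthy CPU and RAM produce a tiny, repeatable residual; computation
    errors push it past the threshold and the run is reported as a failure.
    """
    a = np.random.rand(n, n)
    b = np.random.rand(n)
    x = np.linalg.solve(a, b)
    residual = np.linalg.norm(a @ x - b) / (np.linalg.norm(a) * np.linalg.norm(x) * n)
    return residual < threshold

print("pass" if linpack_pass() else "fail")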
 

ehume

Golden Member
Nov 6, 2009
#47
This consistently reports a hardware failure within 10-15 minutes on my 8700K when running with 12 threads, seemingly regardless of processor settings. An OC which will happily run the latest Prime95 all day long? Failed. Stock? Failed. An anemic 3GHz at a toasty 1.45 volts? Failed. At this point I figure it must be some sort of bug with the program's error detection.

This behavior disappears when running with the same number of threads as physical cores (6), at which point it seems to be stable at the same settings as the latest Prime95.
I don't know about this new Linpack, but LinX 0.6.5 runs a 12-threaded 8700K for 30 minutes with no sign of shutting down. I used to run this over and over at various OCs, up to 5 GHz.
 
Nov 3, 2007
#49
This consistently reports a hardware failure within 10-15 minutes on my 8700K when running with 12 threads, seemingly regardless of processor settings. An OC which will happily run the latest Prime95 all day long? Failed. Stock? Failed. An anemic 3GHz at a toasty 1.45 volts? Failed. At this point I figure it must be some sort of bug with the program's error detection.

This behavior disappears when running with the same number of threads as physical cores (6), at which point it seems to be stable at the same settings as the latest Prime95.
Can you post a screenshot?
 
Oct 9, 1999
#50
This consistently reports a hardware failure within 10-15 minutes on my 8700K when running with 12 threads, seemingly regardless of processor settings. An OC which will happily run the latest Prime95 all day long? Failed. Stock? Failed. An anemic 3GHz at a toasty 1.45 volts? Failed. At this point I figure it must be some sort of bug with the program's error detection.

This behavior disappears when running with the same number of threads as physical cores (6), at which point it seems to be stable at the same settings as the latest Prime95.
I can confirm the same behavior on my system.

I've been playing with tighter memory timings and ran Prime95 with 24 GB allocated overnight for 8 hours, plus a few more hours the night before. I then ran the AIDA64 stress test for a few hours today while doing other things. Everything passed. I then decided to try this new "Linpack Xtreme" test again for good measure with all threads; previously I had been using 8 threads, since that's normally how you would use a Linpack test. It failed quickly with 16 threads. After reading your post I decided to underclock everything and run the test again. It failed the same way. The test failed at 3.8 GHz (with all cores utilized) and with the memory at only 2133 with very loose timings. It should pass any test.

This utility seemed like a great thing at first, but I'm going to have to stop using it, and any future release, even if there is a fix. There is now serious doubt about its functionality: a stability test should only fail when there is a genuine failure, not produce false positives for instability. I'd recommend others not use it either.
 

