So it's hardly OC'ed at all until I get my custom water loop done, but it's getting about 1,000 PPD on Rosetta. What do you think?
"You have 254 valid work units but 115 have failed due to compute errors. I would reduce the overclock to see if you can eliminate future compute errors."

Could also be memory errors, e.g. too high a memory clock (or too tight timings), especially if these are 2 DIMMs per channel; defects in the new RAM sticks are less likely. @Markfw, have you run a memcheck on the new sticks yet?
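If the box is already on Linux, a quick in-OS RAM check is one way to rule the memory out. A minimal sketch, assuming the memtester package is installed (size and pass count are arbitrary examples; a bootable MemTest86 run is the more thorough option):

# Lock and test 8 GiB of RAM for 2 passes; any reported errors point to
# unstable memory settings or a bad stick.
sudo memtester 8G 2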
"So it's hardly OC'ed at all until I get my custom water loop done, but it's getting about 1,000 PPD on Rosetta. What do you think?"

It's more than 1,000 PPD, and it really should be. (Edit: or did you mean PPD per thread?)
"I will consider replacing all my Ryzen rigs with a single 2990WX for density reasons."

An open question is how well it works as a GPU driver. IIRC, @TennesseeTony saw lower GPU performance on a dual-socket Xeon host compared with single-socket hosts. The 2990WX could have similar issues, perhaps more so on Windows than on Linux. (Or maybe not at all on Linux?)
"Hey guys, what benchmarks do you want to see out of my 2990WX? I only had one request, and that was for Blender. I did that, so now I need more input."

I have been thinking about it, but can't recall a project which makes a perfect benchmark. Most DC projects have high variability in the computational workload per WU, and on top of that, credit estimation can be quite random at times. Besides, some projects perform differently on Linux and Windows (typically better on Linux, but in some cases the other way around).
                                                  Thread-    dual E5-   dual E5-   dual E5-   dual E5-
                                                  ripper     2690 v4    2690 v4    2696 v4    2696 v4
                                                  2990WX     3.2 GHz    3.2 GHz    2.8 GHz    2.8 GHz
------------------------------------------------- per-thread performance -----------------------------------------
gene@home PC-IM 1.10 x86_64-pc-linux-gnu (avx)      5.85       4.63       4.90       4.28       4.35    GFLOPS
gene@home PC-IM 1.10 x86_64-pc-linux-gnu (sse2)     5.94       4.85       4.81       3.89       3.93    GFLOPS
gene@home PC-IM 1.10 x86_64-pc-linux-gnu (fma)      5.83       4.75       4.56       4.11       4.06    GFLOPS
------------------------------------------------- per-host performance -------------------------------------------
number of processors                                  64         56         56         88         88
gene@home PC-IM 1.10 x86_64-pc-linux-gnu (avx)       374        259        274        377        383    GFLOPS
gene@home PC-IM 1.10 x86_64-pc-linux-gnu (sse2)      380        272        269        342        346    GFLOPS
gene@home PC-IM 1.10 x86_64-pc-linux-gnu (fma)       373        266        255        362        357    GFLOPS
                     Thread-    dual E5-   dual E5-   dual E5-   dual E5-
                     ripper     2690 v4    2690 v4    2696 v4    2696 v4
                     2990WX     3.2 GHz    3.2 GHz    2.8 GHz    2.8 GHz
--------------------------- per-thread performance ----------------------
run time/task         11,657     13,680     13,577     15,464     14,956   s
(CV)                  (0.07)     (0.01)     (0.01)     (0.07)     (0.10)
credits/task             158        152        153        150        147
(CV)                  (0.07)     (0.03)     (0.03)     (0.07)     (0.08)
--------------------------- per-host performance ------------------------
# of processors           64         56         56         88         88
PPD                   74,725     53,832     54,347     73,740     74,858
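As a sanity check, the per-host PPD in the table follows directly from the per-thread figures. A rough sketch for the 2990WX column, using only the numbers shown above (the small discrepancy comes from averaging over many tasks):

# 64 threads, ~11,657 s per task, ~158 credits per task:
# credits/day = threads * (seconds per day / run time per task) * credits per task
echo "64 * (86400 / 11657) * 158" | bc -l
# => roughly 74,900 credits/day, in line with the ~74,725 PPD reported above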
How many threads are each of those two boxes of yours? Trying to compare to the 2990WX.

A Planet3DNow! user is testing the waters with a 2990WX on Windows and Linux with the multithreaded Cosmology@home camb_boinc2docker application. He wrote in their German forum (machine-translated and edited):

sompe of P3D said:
Cosmology seems to have quite a problem with the topology of the Threadripper 2990WX. At least I have been struggling with aborts due to runtime overruns.
After I disabled SMT to rule out its influence, a fairly clear picture emerges: half of the WUs need considerably more computing time and run roughly 30% longer, unless the slower WUs jump over to the directly connected dies once the faster WUs have finished. If this adds up with the slowdown from SMT, the WUs are apparently killed automatically after running too long and are marked as faulty. The disadvantage of the indirectly attached memory seems to be in full effect here.
I am currently running the last WUs under Windows (limited to 4 cores per WU) and will continue my attempts later under Ubuntu, which was the intended OS for this anyway.
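The die/memory topology sompe describes can be checked directly on Linux. A minimal sketch, assuming the numactl package is installed (on a 2990WX, two of the four NUMA nodes typically report no local memory, i.e. those dies reach RAM only indirectly over the Infinity Fabric):

# List the NUMA nodes with their CPUs and local memory sizes.
numactl --hardware
# Per-CPU mapping to NUMA node and physical core, useful for pinning experiments.
lscpu --extended=CPU,NODE,CORE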
The host under Windows: ID 368169 (average processing rate 29 GFLOPS, after >800 tasks)
The host under Linux: ID 362903 (average processing rate 39 GFLOPS, after 17 tasks, presumably with SMT on)
Like user sompe, I run camb_boinc2docker with 4 threads per task (app_config.xml shown below). For comparison, on my hosts:
E5-2690 v4, HT on, Linux: average processing rate 29 GFLOPS
E5-2696 v4, HT on, Linux: average processing rate 23 GFLOPS
<app_config>
  <!-- Run camb_boinc2docker (vbox64_mt plan class) tasks with 4 CPUs each -->
  <app_version>
    <app_name>camb_boinc2docker</app_name>
    <plan_class>vbox64_mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
  </app_version>
</app_config>
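In case someone wants to copy this: app_config.xml belongs in the project's subdirectory of the BOINC data directory, and the running client can re-read it without a restart. A sketch, assuming a typical Linux install (the data directory path and the project directory name vary by setup):

# e.g. /var/lib/boinc-client/projects/www.cosmologyathome.org/app_config.xml
# then tell the running client to re-read its configuration files:
boinccmd --read_cc_config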
"@StefanR5R, so I guess you like the 2990WX and think it's cheaper than a used Xeon E5 setup?"

I'm not sure about used Xeon E5 v3 (22 nm, less efficient, therefore I haven't been watching those).
"I just spec'ed out a 128-thread dual-EPYC system after I saw that article saying a 16-core EPYC demolishes a 16-core TR by 62%, and it is $10,000 for 2 CPUs, a motherboard, 128 GB of registered ECC DDR4-2666, and 2 Noctua heatsinks... Tempting."

Two thoughts: