Zen 2 for Distributed Computing: Any interest?


Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,541
14,495
136
Well, I can say this: I won't have a cooling problem with this 64-core setup, or when I get to 128 cores. The memory is not in yet, but the rest is done!

Yes, those are DUAL NH-U14S-SP3-TR4's! And YES, that is an MP600 one-terabyte NVMe! Now I just need to add the 256 GB of DDR4 ECC registered.

z53xozp.jpg
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
On socket 2011-3, I had the opportunity to use the NH-D15S dual-tower coolers, and of course used them because I like big coolers. :-)
An NH-D15 successor which fits on socket SP3 is expected to be released some time next year.
 

biodoc

Diamond Member
Dec 29, 2005
6,261
2,238
136
I wonder how an EPYC™ 7302P (3 GHz base clock, 16 cores/32 threads, 128 MB L3 cache, 8-channel RAM, TDP 155 watts) would compare with the 3950X (3.5 GHz base clock, 16 cores/32 threads, 64 MB L3 cache, 2-channel RAM, TDP 105 watts) for DC. The reason I'm asking is that this CPU/MB combo price caught my eye some time back. In the end, I went for a 3900X build instead. I'm wondering if the 8-channel RAM on the EPYC chip would have a positive impact on one or more of the BOINC projects.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
Revisiting my thought experiment of using four 16-core Ryzen computers to build an equivalent of one 64-core Epyc computer:
  • A mere thought experiment this will remain for the time being. It is reportedly far easier to go and buy four 64-core Epycs than one 16-core Ryzen.
  • The recent experience with the Rosetta based Microbiome Immunity Project left me wondering whether it wouldn't be wiser to go with 32 cores per 8 RAM channels, rather than 64 cores per 8 RAM channels.
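The "cores per RAM channel" comparison in this thought experiment can be put into numbers with a minimal sketch. It only uses the core and channel counts quoted in this thread; the dictionary keys and the helper name are made up for illustration, and it deliberately ignores memory speed, ranks, and cache sizes, which also matter for memory-bound projects.

```python
# Core and memory-channel counts as quoted in this thread.
# Labels and helper name are illustrative only.
configs = {
    "1x EPYC, 64 cores, 8 channels": (64, 8),
    "1x EPYC, 32 cores, 8 channels": (32, 8),
    "4x Ryzen, 16 cores + 2 channels each": (4 * 16, 4 * 2),
}


def cores_per_channel(cores: int, channels: int) -> float:
    """How many cores have to share one memory channel."""
    return cores / channels


ratios = {name: cores_per_channel(*cc) for name, cc in configs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f} cores per channel")
```

By this crude measure, four 16-core Ryzens and one 64-core Epyc land on the same ratio, while the 32-core Epyc halves the number of cores competing for each channel.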
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,541
14,495
136
Well, it's installing Windows now. Do I have 16 channels of memory with two 8-channel EPYCs? Anyway, SOON I can benchmark some stuff. The software has to be free or trial. All I can really do is Cinebench-like stuff.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
Here is a review of a single-socket mainboard for the affordable EPYC -P series of processors:
https://www.servethehome.com/supermicro-h11ssl-nc-rev-2-atx-amd-epyc-motherboard/

It is still a PCIe v3 board, and Patrick of STH comments on the cost delta between PCIe v4 : v3 mainboards.

They ran some of their performance tests with the whole gamut of EPYC Rome processors, from 8 to 64 cores:
page 3 of the H11SSL review
Here are the same tests in their Threadripper 3970X/3960X review for direct comparison:
page 5 of the TR3 review

(The 32-core 280-W 3970X's performance is between the 32-core 180-W 7502P and the 64-core 200-W 7702P.
The 24-core 280-W 3960X performs strikingly close to the 32-core 180-W 7502P.)
 

biodoc

Diamond Member
Dec 29, 2005
6,261
2,238
136
It's hard for me to understand the results. Why can't they have a chart at the end with the performance per watt? Pretty sure Rome blows Naples out of the water, but it's not easy to see.

The info is there for each individual benchmark. The first chart is performance, the next chart is power draw at the wall and the 3rd chart is performance per watt. The rankings vary somewhat depending on the benchmark but it's clear Rome is better at perf/watt.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,541
14,495
136
I must be blind... I looked again, and NO chart says anything about performance/watt.
 

biodoc

Diamond Member
Dec 29, 2005
6,261
2,238
136
Performance per watt per frame (more is better). Small print just above top bar.

1575674939953.png
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,541
14,495
136
Well, my 7601's sure suck.....

Long live the 7742's ! (that I badly want)
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
Thanks! These are at-the-wall power measurements in a 1P mainboard (ASRockRack EPYCD8 + RAM, SSD, using Gb-Ethernet), PSU spec not provided. Also included in some of the tests: core clocks and temperatures (while cooled by the Noctua 92 mm tower cooler).

The tests are with workloads which generally seem to scale well to any number of cores, with the exception of page 8, where single-threaded tests were run. Among other things, those tell us the power consumption when the 1P server is severely under-utilized. (The molecular dynamics benchmark at the bottom of the page is again a highly threaded test.)

The figures confirm my earlier thought that a substantial upgrade from high-core-count 14 nm server CPUs, if one is power- and/or cooling-constrained, is provided only by the 64 core SKUs. (And the 48 core SKUs to a lesser degree, but there is no 48 core model in the cheaper single-socket-only -P lineup. Also, they have 3-core CCXs which can be a concern in multithreaded workloads, as opposed to multi-process workloads.)

Small print just above top bar.
They didn't make that easy to see.
The small print isn't as small if a large enough minimum font size is configured in the browser.

Long live the 7742's !
Or 7702P at 2/3rds the price for the processor. Still pricey though.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,541
14,495
136
The 7702P is 1P only. If you want a 2P 64-core, it's $7000, just $500 less than the 7742, but slower.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
2P computers are only interesting if density is a limiting factor, or if you have an HPC workload which scales to such core counts (e.g. fluid dynamics, but none of the BOINC-related applications). Although, speaking of fluid dynamics, at some point such workloads want more memory channels = more sockets (per host, or with InfiniBand across hosts).

The figures confirm my earlier thought that a substantial upgrade from high-core-count 14 nm server CPUs, if one is power- and/or cooling-constrained, is provided only by the 64 core SKUs. (And the 48 core SKUs to a lesser degree, but there is no 48 core model in the cheaper single-socket-only -P lineup. Also, they have 3-core CCXs which can be a concern in multithreaded workloads, as opposed to multi-process workloads.)
I am still wondering about the perf/W of the 155 W TDP, 32-core EPYC 7452. Though maybe a downside of that one could be that the split of the power budget between CCDs and IOD shifts towards the latter, in comparison with the higher-TDP parts.
 

biodoc

Diamond Member
Dec 29, 2005
6,261
2,238
136
Like many others, I've been frustrated by AMD's delayed support for Zen 2 CPU temp/voltage monitoring on Linux. That changed with the release of the 5.4 Linux kernel. With that, along with new software called zenpower and zenmonitor, I was dying to try all this out. I'm not sure when this will all be available for Linux Mint, but probably not anytime soon; it could be months. I had a spare NVMe drive in my new 3900X build, so I installed Manjaro Linux on it. Manjaro is the "easy" distro based on Arch Linux. Software installation is quite different from Debian-based systems, but fortunately the Arch Wiki contains a lot of information. I was able to install the NVIDIA drivers, BOINC, and FAH before the start of the month, so my productivity did not suffer too much.

After a couple of weeks of tinkering, I now have this information at hand:

1575743645807.png
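For anyone who wants the raw numbers without a GUI: temperature drivers on Linux publish their readings through the hwmon sysfs interface, where each chip directory has a `name` file and `temp*_input` files in millidegrees Celsius. Below is a minimal sketch of reading those files. The function name is made up for illustration, and which chip names show up (for example `k10temp` from the mainline kernel, or a name registered by zenpower) is an assumption that depends on the kernel and on which modules are loaded.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {(chip, label): degrees_C} for every temp*_input found under root.

    Illustrative helper, not part of zenpower or zenmonitor.
    """
    readings = {}
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        if not name_file.is_file():
            continue
        chip = name_file.read_text().strip()
        for input_file in sorted(chip_dir.glob("temp*_input")):
            # The matching temp*_label file is optional; fall back to the file name.
            label_file = input_file.with_name(
                input_file.name.replace("_input", "_label")
            )
            try:
                if label_file.is_file():
                    label = label_file.read_text().strip()
                else:
                    label = input_file.name
                millidegrees = int(input_file.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor exists but cannot be read or parsed right now
            readings[(chip, label)] = millidegrees / 1000.0
    return readings
```

Calling `read_hwmon_temps()` with no argument reads the live system; passing another directory makes it easy to try out against a copy of the sysfs tree.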
 

biodoc

Diamond Member
Dec 29, 2005
6,261
2,238
136
Ubuntu has a repository for upstream kernel builds.
https://wiki.ubuntu.com/Kernel/MainlineBuilds
Maybe it can be used on Mint too.

(Note, you cannot use the onboard methods of Mint to install out-of-the-mainline kernel drivers such as nvidia or virtualbox anymore if you switch to a mainline kernel build.)

I considered that but I wanted to maintain my Mint install as a fully functional fallback in case I ran into issues with Manjaro. It was just an experiment but it wasn't trivial to get zenpower working.
 

StefanR5R

Elite Member
Dec 10, 2016
5,498
7,786
136
Some of these perf/W tests were previously also presented for Threadripper gen 3 vs. gen 2 vs. Intel-HEDT vs. AMD top-end desktop, here:
AMD Ryzen Threadripper 3970X / 3960X Linux Benchmarks, page 12

Comparing "at the wall" perf/W of the 32 core CPUs, more is better ---

Benchmark                                         TR2 2990WX   TR3 3970X Naples 7601   Rome 7502
John The Ripper, Blowfish (encryption)                   146         153         200         240
V-RAY, CPU (rendering)                                    74         129         133         213
dav1d Summer Nature 4K (decoding)                       0.86        1.30        1.21        2.00
SVT-AV1, Enc Mode 8, 1080p (encoding)                   0.19        0.36        0.29        0.56
SVT-VP9, PSNR/SSIM Opt., Bos. 1080p (encoding)          0.67        2.25        1.58        2.07
X265, H.265 1080p (encoding)                            0.22        0.30        0.28        0.44
Coremark, size 666 (synthetic)                          3800        4200        5400        6700
Stockfish 9 (chess)                                     17e4        24e4        27e4        38e4
PHPBench (single-threaded)                              3800        3750        5230        6090

In performance-per-Watt, the "old" 14nm EPYC 7601 (which is operating at its sweet spot of the voltage-frequency curve) beats the new 7nm+1Xnm TR 3970X (which is driven at voltages and clocks far away from the sweet spot) in all of these tests, except for the video transcoding tests. I wish Phoronix showed respective results for FFTW, NAMD, or NAS with identical working sets.
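The comparison above can be re-checked with a few lines of Python. This is only a sketch over the perf/W figures listed in this post; the variable and function names are made up for illustration, and the grouping of the four video tests is my own reading of the benchmark names.

```python
# "At the wall" perf/W figures copied from the table in this post (more is better).
COLUMNS = ("TR2 2990WX", "TR3 3970X", "Naples 7601", "Rome 7502")
PERF_PER_WATT = {
    "John The Ripper, Blowfish": (146, 153, 200, 240),
    "V-RAY, CPU": (74, 129, 133, 213),
    "dav1d Summer Nature 4K": (0.86, 1.30, 1.21, 2.00),
    "SVT-AV1, Enc Mode 8, 1080p": (0.19, 0.36, 0.29, 0.56),
    "SVT-VP9, PSNR/SSIM Opt., Bos. 1080p": (0.67, 2.25, 1.58, 2.07),
    "X265, H.265 1080p": (0.22, 0.30, 0.28, 0.44),
    "Coremark, size 666": (3800, 4200, 5400, 6700),
    "Stockfish 9": (17e4, 24e4, 27e4, 38e4),
    "PHPBench": (3800, 3750, 5230, 6090),
}
# The video decoding/encoding tests, grouped by hand.
VIDEO_TESTS = {
    "dav1d Summer Nature 4K",
    "SVT-AV1, Enc Mode 8, 1080p",
    "SVT-VP9, PSNR/SSIM Opt., Bos. 1080p",
    "X265, H.265 1080p",
}


def advantage(a: str, b: str) -> dict:
    """Perf/W of CPU `a` divided by perf/W of CPU `b`, per test."""
    ia, ib = COLUMNS.index(a), COLUMNS.index(b)
    return {test: row[ia] / row[ib] for test, row in PERF_PER_WATT.items()}


naples_vs_tr3 = advantage("Naples 7601", "TR3 3970X")
tr3_beats_naples = {test for test, r in naples_vs_tr3.items() if r < 1.0}

for test, r in naples_vs_tr3.items():
    print(f"{test:38s} 7601 / 3970X = {r:4.2f}")
```

A ratio above 1.00 means the 7601 delivers more work per watt than the 3970X in that test; the set `tr3_beats_naples` collects the tests where it is the other way around.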
 