Ideas for high end simulation workstation

bobsmith1492

Diamond Member
Feb 21, 2004
3,875
3
81
Hi everyone,

I put together a beefy machine 3 years ago for EM simulation using CST.

It has the following needs:
1. Single-core processing speed (for 3d modeling - math-based, lots of trig)
2. Multi-core processing speed (for certain phases of the simulation process)
3. GPU processing speed (for other phases of the simulation process, specifically needs GPU RAM bandwidth)

We're moving this machine on and I'm planning for an improved version.

Budget target is around $10K.

I'm a little stuck on the CPU side. It seems processors aren't getting much cheaper, and they're particularly expensive if I want to keep the base clock at 3GHz or higher, where I'd like to be.

Here is my research to date. The first system on the "System Options" tab ("Scott's System") is the baseline machine we've been using.

https://www.dropbox.com/s/5chljh7eowkhwmd/2017-12-17 Server options.xlsx?dl=0

I'd like the EPYC if the peak clock speed were higher (and parts more readily available, esp. motherboards...)

I'd love dual Gold 6146 but they're pricey.

So I'm leaning toward the older Xeon parts: E5-2690v2 or E5-2687Wv4

Any suggestions, any parts or families I should look at under these conditions?

Thanks!
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
You're gonna have to break down how much of your task is spent doing each portion of 1, 2 and 3 - and then consider how much of it is interactive.

Is it something you typically can run overnight? After all, if you beat the magic 10 hrs figure for the non-interactive bit, then you can focus more on the hardware needs for the interactive bit.

How is your software affected by Amdahl's law? What about SMT?

[I've been running CFD & FEA on and off for years, so while not in EM, not clueless about speccing either]
 

bobsmith1492

Atari, I have a good baseline, as we use this tool probably half of the day. For single-thread operations I'd like an improvement, so at least 3.5GHz peak turbo speed.

For multithreaded simulation it doesn't use the available SMT threads.

We do some longer simulations overnight occasionally (optimization runs, for example), but also lots of shorter ones throughout the day to try out ideas. Those are more important.

I'm not sure how well it can use more cores. It uses all 20 now quite effectively though.
 

Atari2600

Threadripper also has ECC support.

Going by cores x base clock, a 16c Threadripper @ 3.4 GHz is ~5% faster than a 20c Haswell @ 2.6 GHz (given they are similar CPUs in performance, we can reasonably assume IPC parity).
It'd also boost up to 4.2 GHz for single core (vs 3.3 GHz for Haswell turbo), i.e. 27% quicker.
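
The two percentages above fall out of a simple cores x clock estimate. A rough sketch, assuming IPC parity and perfect scaling across cores (both are simplifications, and the function name is only for illustration):

```python
def relative_throughput(cores, clock_ghz, ipc=1.0):
    """Crude throughput proxy: cores x clock x relative IPC."""
    return cores * clock_ghz * ipc

# Multi-thread: 16c Threadripper @ 3.4 GHz base vs 20c Haswell @ 2.6 GHz base
tr_mt = relative_throughput(16, 3.4)
haswell_mt = relative_throughput(20, 2.6)
mt_gain = tr_mt / haswell_mt - 1

# Single-thread: 4.2 GHz boost vs 3.3 GHz turbo
st_gain = 4.2 / 3.3 - 1

print(f"MT gain: {mt_gain:.1%}")  # MT gain: 4.6%
print(f"ST gain: {st_gain:.1%}")  # ST gain: 27.3%
```

This ignores memory bandwidth, turbo behaviour under all-core load, and any serial portion of the workload, so treat it as a first-pass comparison only.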

It doesn't give you the same leg up for the overnight stuff as an EPYC, but would be more advantageous for your single thread interactive work during the day.

Any Amdahl effects will work in your favour having fewer cores.
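
To put a rough number on the Amdahl point, here is a minimal sketch. The parallel fractions are purely illustrative assumptions; nobody in this thread has measured them for CST.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work scales."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

for p in (0.90, 0.95, 0.99):  # assumed parallel fractions, not measured
    s16 = amdahl_speedup(p, 16)
    s22 = amdahl_speedup(p, 22)
    print(f"p={p:.2f}: 16c -> {s16:.1f}x, 22c -> {s22:.1f}x, "
          f"extra from 6 more cores: {s22 / s16 - 1:.1%}")
```

With 90% of the work parallel, going from 16 to 22 cores (37.5% more cores) buys only about 11% more speed, which is why fewer, faster cores can come out ahead.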


Does your software take advantage of AVX512? If not, then even allowing for a 10% IPC advantage, an E5-2699A v4 is gonna be similar to a TR in multithread (within 2%, which I'd call a wash; Amdahl effects are likely to mean 16 threads are at least 2% better than 22 threads) and 6% slower in ST.

While this would be more or less equal given guesses made, the TR would also allow you to clock up your memory, improving bandwidth.



Is a 25% improvement in ST performance (with no appreciable difference in MT) worth dropping 10K on?

Or would you be better served by putting some of that money into upgrades for your existing workstation, offloading your computations to it, and buying a 4-core high-clocked E3-1285 v6 for your interactive work?
 

thecoolnessrune

Diamond Member
Jun 8, 2005
9,672
578
126
Is the software NUMA aware? What are the manufacturer's guidelines on using multiple sockets? What's the scale factor of the software? Is the software grid capable? The same questions apply to the GPU portion.

If this really is heavy lifting software you're using, the manufacturer should be able to provide some guidelines to you for the above questions that can be leveraged.
 

Burpo

Diamond Member
Sep 10, 2013
4,223
473
126
An i9-7940X beats your current dual CPU setup, and ticks all the check boxes for what you need. You can put together a heck of a workstation for <$10k too.
 

Atari2600

But does not have ECC memory support. Given that all three machines listed in their internal comparison have ECC, I assumed that is a deal breaker.
 

bobsmith1492

Huh, I didn't even think about Threadripper. It is in the right performance range and I like the 4GHz boost clock.

The only downside is the 4 channels of memory. I have been in contact with the vendor though we mostly discussed the GPU side of things. They highly recommended a dual processor arrangement partly for the increased memory bandwidth. I'd assume it's NUMA aware if they recommend dual processor arrangements.
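
As a rough illustration of the memory-bandwidth point: theoretical peak bandwidth is channels x transfer rate x 8 bytes per transfer. The memory speeds and the dual-socket channel counts below are assumptions for illustration, not specs for any particular build, and real-world bandwidth will be lower.

```python
def peak_bandwidth_gbs(sockets, channels_per_socket, mt_per_s):
    """Theoretical peak DDR bandwidth in GB/s (64-bit channel = 8 bytes/transfer)."""
    return sockets * channels_per_socket * mt_per_s * 8 / 1000

# (sockets, channels per socket, assumed memory speed in MT/s)
configs = {
    "1 socket, 4 channels": (1, 4, 2666),
    "2 sockets, 4 channels each": (2, 4, 2400),
    "2 sockets, 6 channels each": (2, 6, 2666),
}
for name, cfg in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(*cfg):.0f} GB/s")
```

Whether the solver actually benefits depends on it being NUMA aware and on how much of the run is bandwidth bound, which is a question for the vendor.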

I like that E5-2699 but it has the same caveat with the memory. The i9 is definitely a step backwards... Dual Threadrippers would be great but I don't see that as an option.

I totally agree on preferring higher clocks vs. more cores. If I could do a dual 8 core 3.8+GHz, I would. I don't see that existing either. Could do a quad 4 core 3.6GHz Xeon Platinum for $30k... Or not!

I'm leaning toward the dual E5-2687Wv4 so far.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,821
3,642
136
From what I know, EM simulations are not as dependent on memory architecture as FEA or CFD. Since you say that single-threaded use involves a lot of trig function computation, you should look into CPUs with good AVX2/AVX512 support, which means Intel, since evaluation of math functions can be sped up very well using wide vector SIMD.

Since you are concerned about keeping 3GHz+ frequencies, especially for single-thread intensive simulations, have a look at the Xeon Gold 6154 instead. Now the best thing about it is that, according to Anandtech, turbo frequency on that chip is such that it stays at 3.7GHz on all cores on non-AVX workloads, and doesn't drop below 3.3GHz on all cores even with AVX2. When it does go below 3GHz, it's due to AVX512 on all cores, which I doubt will concern you as much. Even then it's going to stay above 3GHz for up to 12 cores.

It doesn't cost much more than the 6150 that you've shortlisted either, so it seems to me that this chip should be ideally suited for your work.
 

bobsmith1492

Thanks for the pointer tamz, it looks like those processors actually can turbo higher than the figure I've been using (the "turbo" number from the Ark site).

Here's a great overview (see page 13):
https://www.intel.com/content/dam/w...ication-updates/xeon-scalable-spec-update.pdf

That tells me the 6146 is actually my favorite since it'll do 4.2GHz. A single part gets me within 10% of our existing setup (with Amdahl's and IPC improvements probably making that up), with potentially ~doubled multicore performance by adding a second processor.

I might start with a single processor and add a second one later.
 

tamz_msc

I'm actually surprised by the document you provided - could you please specify where you got it from as it gives all the turbo frequency data, something Intel was criticized for not providing during launch?
 

bobsmith1492

Ok, we got it built! It's a beast. Ended up with this parts list:
Xeon Gold 6146 (12c 3.2GHz) x 2
SuperMicro X11DPI-N
12 x 16GB 2666 = 192GB (was going to use 8GB modules but they were out of stock... bummer)
Samsung 960 EVO 1TB SSD
WD4000FYYZ 4TB 7200
Cooler Master HAF X
Ultra 1000W Titanium (ED: ran out of these, got an EVGA Supernova T2)
Quadro K2000
SNK-P0070APS4 heatsinks x2

What should I run as a benchmark? It's a Windows box.

Here are some fun pictures. Yes that's a very old version of Memtest... and it took 4.5hr to run. Not sure if it makes sense to even do with ECC memory.
Temperatures are good running Prime95.
It seems to be almost 2x as fast as the fastest multicore reference in CPU-Z's benchmark (i9-7980XE).

[Photos: 2018-03-28 15.10.30.jpg, 2018-03-28 15.10.35.jpg, 2018-03-29 09.12.05.jpg, 2018-03-29 15.32.09.jpg, 2018-03-30 13.14.52.jpg, 2018-03-30 14.38.31-1.jpg, 2018-03-30 16.11.15.jpg, 2018-03-30 16.19.32.jpg]
 

tamz_msc

Try the free version of SPECwpc 2.1. There are a lot of benchmarks to run depending on your use case. It's pretty massive for a benchmark though (3.6GB).
 

benylema

Junior Member
Feb 11, 2020
1
0
6
I saw one of your posts about erasing the content of an EEPROM using a high voltage signal. Are there any devices that do that? Please let me know.
 

Atari2600


Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,564
14,520
136
How long do you plan to keep this system, considering what is out now and what will be coming out?
Yes, WOW, I just looked that up, they are over $2700 each and a pair is $5400. The 3960X will beat those easily and is only $1400. The motherboards for TRX40 are also cheaper.
 

scannall

Golden Member
Jan 1, 2012
1,946
1,638
136
This is a necro thread. The original post is 2 years old.
 

Markfw

I know, I looked up TODAY's price on Newegg and eBay. The retail is like 5 grand. eBay has it for $2700. Kind of puts the 3960X in a really good light 2 years later. WAY less power, more performance and way cheaper.
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
What is your license model? I mean, does it depend on thread count or not?
 

Kocicak

Senior member
Jan 17, 2019
982
973
136
The original poster has not posted on this forum for over one and a half years! I am not sure what the point of reviving this thread was when the OP is not active anymore.
 

fkoehler

Member
Feb 29, 2008
193
145
116
I wasn't aware there was some rule about that...
Also, I could posit the same reply to your reply.

Not all of us are looking up OP activity before replying to posts.