CPU for float crunching

MubeenShahid

Junior Member
Sep 14, 2017
2
0
1
Dear all,

I must upgrade my machine (FX6300) for Fortran-based scientific computing, and for this purpose I am trying to compare the Ryzen 7 1700 and the i7-7700K.
I found the CPU benchmarks on the SiSoftware website:
Link: Ryzen 1700
Link: i7-7700K

On these links (i.e. SiSoftware Official Ranker), the Double Precision GFLOPS of both machines are:
R7 - 1700 => 23.22 GFLOPS "16T"
i7- 7700K => 24.57 GFLOPS "8T"

But at the same time, the primitive benchmark on numerics, i.e. Processor Arithmetic differs widely:
R7 - 1700 => 219 GOPS "16T"
i7- 7700K => 150 GOPS "8T"

Now the double precision GFLOPS are definitely floating point operations on double precision real numbers (excluding integer operations). However Processor Arithmetic definitely includes integer arithmetic operations. My tasks are FPU heavy.

Considering Processor Arithmetic, the R7 1700 is more powerful than the i7-7700K; considering GFLOPS, however, the difference is negligible.
The two temptations to buy Intel are:
1. I can use the Intel Fortran Compiler (on Windows 10), while with Ryzen I would be restricted to GFortran on Linux (because of the Intel compiler's "cripple AMD" function), and
2. I would have to spend an extra 30€ on a GPU if I opt for the R7 1700.

So my question is:
Does SiSoftware's "Processor Arithmetic" test indicate a stronger FPU? And is it reliable enough to use as a basis for deciding which CPU to buy?

Or simply: which of these two CPUs has the stronger FPU?

Best regards,
Mubeen
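
As a rough cross-check on the Sandra figures, theoretical peak double-precision throughput can be estimated as cores x clock x FLOPs per cycle. The sketch below is only a back-of-the-envelope estimate: the base clocks (3.0 GHz and 4.2 GHz) and the per-core FLOPs-per-cycle values (8 for Zen 1, 16 for Kaby Lake with AVX2 FMA) are assumptions taken from commonly cited specifications, not from the links above, and real Fortran code will rarely get close to either number.

```python
# Back-of-the-envelope peak double-precision throughput:
#   peak GFLOPS = cores * clock (GHz) * DP FLOPs per cycle per core
# ASSUMPTIONS (not from the benchmark links): base clocks, and the
# per-cycle figures implied by 2 x 128-bit FMA (Zen 1) and
# 2 x 256-bit FMA (Kaby Lake).

def peak_dp_gflops(cores, clock_ghz, dp_flops_per_cycle):
    """Theoretical peak in GFLOPS for one CPU."""
    return cores * clock_ghz * dp_flops_per_cycle

cpu_specs = {
    "R7 1700":  {"cores": 8, "clock_ghz": 3.0, "dp_flops_per_cycle": 8},
    "i7-7700K": {"cores": 4, "clock_ghz": 4.2, "dp_flops_per_cycle": 16},
}

peaks = {name: peak_dp_gflops(**spec) for name, spec in cpu_specs.items()}

for name, gflops in peaks.items():
    print(f"{name}: ~{gflops:.1f} GFLOPS theoretical DP peak")
```

Under these assumptions both peaks come out several times higher than the 23 to 25 GFLOPS Sandra reports, so that test is probably not measuring raw FMA throughput.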
 

LTC8K6

Lifer
Mar 10, 2004
28,520
1,575
126
With the new Intel 6 core 12 thread 8700K coming soon, perhaps you should wait?
Here are the numbers we are looking at, along with how they compare to aggregated Core i7-7700K results in SiSoftware's SANDRA database:
  • Processor Arithmetic: 217.98 GOPS (versus 149.99 GOPS)—45 percent increase
  • Processor Multi-Media: 658.57 Mpix/s (versus 447.76 Mpix/s)—47 percent increase
  • Processor Cryptography: 10.47 GB/s (versus 9.34 GB/s)—12 percent increase
  • Scientific Analysis (Single Precision): 61.41 GFLOPS (versus 48.51 GFLOPS)—26 percent increase
  • Scientific Analysis (Double Precision): 32.11 GFLOPS (versus 24.40 GFLOPS)—32 percent increase
https://hothardware.com/news/intel-core-i7-8700k-coffee-lake-i5-8400-cpus-sisoft-sandra-benchmark
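
The percentage gains quoted above are simply the ratio of the new score to the old one. A minimal sketch that recomputes them from the listed numbers (the dictionary keys and the function name are made up for illustration):

```python
# (i7-8700K score, aggregated i7-7700K score), copied from the list above.
sandra_scores = {
    "Processor Arithmetic (GOPS)":        (217.98, 149.99),
    "Processor Multi-Media (Mpix/s)":     (658.57, 447.76),
    "Processor Cryptography (GB/s)":      (10.47, 9.34),
    "Scientific Analysis SP (GFLOPS)":    (61.41, 48.51),
    "Scientific Analysis DP (GFLOPS)":    (32.11, 24.40),
}

def percent_increase(new, old):
    """Relative gain of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0

gains = {
    test: percent_increase(new, old)
    for test, (new, old) in sandra_scores.items()
}

for test, gain in gains.items():
    print(f"{test}: +{gain:.1f}%")
```

The results agree with the article's figures to within a point; the single-precision gain comes out nearer 27 percent than 26, which looks like truncation rather than rounding.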
 

MubeenShahid

Junior Member
Sep 14, 2017
2
0
1
With the new Intel 6 core 12 thread 8700K coming soon, perhaps you should wait?

https://hothardware.com/news/intel-core-i7-8700k-coffee-lake-i5-8400-cpus-sisoft-sandra-benchmark

Thanks for the suggestion. I read about Intel's Coffee Lake and was considering the same; however, I hope prices will drop rather than rise for the 2 extra cores.
From the link you posted, the GFLOPS of the i5-8400K and i7-8700K are nearly the same:
Scientific Analysis (Double Precision):
Core i5-8400K (6C/6T): 31.35 GFLOPS (i5-7600K: 26.72 GFLOPS)
Core i7-8700K(6C/12T): 32.11 GFLOPS (i7-7700K: 24.40 GFLOPS)

I guess this indicates that there is only one FPU per core, and that both threads in each core share it?
So considering performance/price, the i5-8400K would be the better choice then?
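
One way to put a number on this is to compare the two six-core results directly. The sketch below assumes the only relevant difference between the two chips is Hyper-Threading, which is not quite true since their clocks differ, so treat the result as a rough indication only.

```python
# Double-precision Scientific Analysis scores quoted above (GFLOPS).
i5_gflops, i5_cores = 31.35, 6   # 6C/6T, no Hyper-Threading
i7_gflops, i7_cores = 32.11, 6   # 6C/12T, Hyper-Threading enabled

# Throughput per physical core for each chip.
i5_per_core = i5_gflops / i5_cores
i7_per_core = i7_gflops / i7_cores

# Relative gain of the 12-thread part over the 6-thread part, in percent.
# ASSUMPTION: clock differences between the two chips are ignored.
smt_gain_pct = (i7_gflops / i5_gflops - 1.0) * 100.0

print(f"i5 per core: {i5_per_core:.2f} GFLOPS")
print(f"i7 per core: {i7_per_core:.2f} GFLOPS")
print(f"Gain from 6 extra threads: {smt_gain_pct:.1f}%")
```

A gain of only a few percent from doubling the thread count is consistent with the two threads of a core sharing the same floating-point units.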
 

24601

Golden Member
Jun 10, 2007
1,683
39
86
If you want a CPU for serious number crunching and are compiling your own code (as you seem to be indicating), then there is no substitute for the 78xx and 79xx series: Skylake HEDT has quad-channel DDR4 to better feed the CPU, as well as extremely high-performance compute cores with high throughput for AVX, AVX2, and AVX-512.

The i9-7980XE has a teraflop of performance at stock clocks.

If you are on a limited budget, it is still worth getting at least the i7-7800X for your type of workload.
 

LTC8K6

Lifer
Mar 10, 2004
28,520
1,575
126
Thanks for the suggestion. I read about Intel's Coffee Lake and was considering the same; however, I hope prices will drop rather than rise for the 2 extra cores.
From the link you posted, the GFLOPS of the i5-8400K and i7-8700K are nearly the same:
Scientific Analysis (Double Precision):
Core i5-8400K (6C/6T): 31.35 GFLOPS (i5-7600K: 26.72 GFLOPS)
Core i7-8700K(6C/12T): 32.11 GFLOPS (i7-7700K: 24.40 GFLOPS)

I guess this indicates that there is only one FPU per core, and that both threads in each core share it?
So considering performance/price, the i5-8400K would be the better choice then?
I think you meant the i5-8600K, and yes it would be the better value.
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
Well, I am writing my own CFD code in Fortran for simpler jobs, or calculating in Fluent.

I have a BDW-E Xeon with 4-channel 256GB ECC RAM as my solver machine and an i5-6600K @ 4.4GHz with 32GB RAM as a side-calculation machine.

If you have well-optimized code, AVX2 can be a real game changer. So in your case, depending on the size and commercial aspects of your needs, I would buy the i7-8700K or, if you are serious, a Xeon W with more cores and ECC.

If you are not so sure about your AVX optimisations, Ryzen or Threadripper is the way to go. It performs the same or slightly better with "general" code.
 

richaron

Golden Member
Mar 27, 2012
1,357
329
136
On these links (i.e. SiSoftware Official Ranker), the Double Precision GFLOPS of both machines are:
R7 - 1700 => 23.22 GFLOPS "16T"
i7- 7700K => 24.57 GFLOPS "8T"

I've seen reports of the Bristol Ridge AM4 A12-9800 iGPU being capable of 500+ GFLOPS of DP floating point.

If true it would be ~1/3 the price for ~20 times better performance.
 

LTC8K6

Lifer
Mar 10, 2004
28,520
1,575
126
I've seen reports of the Bristol Ridge AM4 A12-9800 iGPU being capable of 500+ GFLOPS of DP floating point.

If true it would be ~1/3 the price for ~20 times better performance.
Then adding a cheap video card would be much faster and easier.
 

NTMBK

Lifer
Nov 14, 2011
10,240
5,026
136
I've seen reports of the Bristol Ridge AM4 A12-9800 iGPU being capable of 500+ GFLOPS of DP floating point.

If true it would be ~1/3 the price for ~20 times better performance.

Good luck compiling Fortran for it.
 

richaron

Golden Member
Mar 27, 2012
1,357
329
136
Then adding a cheap video card would be much faster and easier.
Errmm.. You're obviously not up to date on the DP performance of "cheap" video cards. 500+ GFLOPS DP will cost a lot.

Good luck compiling Fortran for it.
Yeah, could be a problem. Though:
1) OP's last question was simply about FP performance.
2) Fortran is actually big in the HPC space, and with AMD's recent HPC push it could be possible. I would need more than a regular AMD naysayer posting the usual unproven naysaying to come to a conclusion...
 

NTMBK

Lifer
Nov 14, 2011
10,240
5,026
136
Yeah, could be a problem. Though:
1) OP's last question was simply about FP performance.
2) Fortran is actually big in the HPC space, and with AMD's recent HPC push it could be possible. I would need more than a regular AMD naysayer posting the usual unproven naysaying to come to a conclusion...

From the OP:

I must upgrade my machine (FX6300) for Fortran based scientific computing

Yes, I know Fortran is big in HPC. Doesn't mean that AMD have a compiler for it. Go find me a Fortran compiler for AMD GPUs. I'll wait.
 

NTMBK

Lifer
Nov 14, 2011
10,240
5,026
136
I answered suggesting something else with 20 times the performance for 1/3 the price. I'll wait for you to prove that part wrong, otherwise maybe you should learn a little humility.

It's completely irrelevant to his problem. He wants to run Fortran code on it. It could have 2000 times the performance, but if it can't execute his code then it's useless.
 

richaron

Golden Member
Mar 27, 2012
1,357
329
136
It's completely irrelevant to his problem. He wants to run Fortran code on it. It could have 2000 times the performance, but if it can't execute his code then it's useless.

I replied to the OP's question with an alternative line of thinking which could be much, much more efficient.

You obviously had no idea about GPU capabilities when you tried arguing with me, and to a lesser extent you have no idea of the potential to run Fortran on GPUs. I posted to show other possibilities, real possibilities, and you have only posted to show your ignorance.
 

NTMBK

Lifer
Nov 14, 2011
10,240
5,026
136
I replied to the OP's question with an alternative line of thinking which could be much, much more efficient. I didn't pretend to supply a solution within the confines of the question raised.

You obviously had no idea about GPU capabilities when you tried arguing with me, and to a lesser extent you have no idea of the potential to run Fortran on GPUs. I posted to show other possibilities, real possibilities, and you have only posted to show your ignorance.

And what evidence do you have that I was unaware of GPU capabilities? All I pointed out was that you can't run Fortran on it, so it doesn't solve his problem. Your "possibility" involves changing his entire workflow and codebase over to a different language just so that he can use a low end GPU, which makes very little sense.
 

LTC8K6

Lifer
Mar 10, 2004
28,520
1,575
126
Could we stop the pissing contest?

If the OP could just use a video card to solve his problem, I'm sure he would have done so by now.
 

sdifox

No Lifer
Sep 30, 2005
95,029
15,140
126
https://devblogs.nvidia.com/parallelforall/easy-introduction-cuda-fortran/

if that fits your workload, cuda may be the way to go

or go rpi3 cluster

http://thundaxsoftware.blogspot.ca/2016/07/creating-raspberry-pi-3-cluster.html

couple that to Tegra X2 boards, you can get breakout boards and active heat sinks

https://developer.nvidia.com/embedded/buy/jetson-tx2

or just skip the rpi3 cluster and get a few TX2 dev kits

https://developer.nvidia.com/embedded/buy/jetson-tx2-devkit


or go hog

http://connecttech.com/product/jetson-tx2-tx1-array-server/
 

scannall

Golden Member
Jan 1, 2012
1,946
1,638
136
I'm wondering if a used Power8 system would be better. You can find them on eBay. Beastly CPUs for high-end computing and Fortran. There is one on eBay now: a 10-core/40-thread 3.4 GHz Power8 CPU with 32 GB of RAM for $7k.
 

jur

Junior Member
Nov 23, 2016
18
4
81
Check out eBay for v4 Xeons (2696, 2697, 2698, 2699 v4). Some are quite cheap. Strong cores, lots of cache, high bandwidth, reasonable price -> hard to beat. You can get an 18-22 core CPU for a good $500, sometimes even cheaper.
If you are fine with OpenCL, you can also try an R9 280/290X or get a used Nvidia Titan.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Errmm.. You're obviously not up to date on the DP performance of "cheap" video cards. 500+ GFLOPS DP will cost a lot.

Yep, that is true.....

However, a used Titan Black (about $400) does well at 1.3 TFLOPS of double precision (about three times more than the Titan Xp). It does lack ECC memory though: http://www.advancedclustering.com/hpc-cluster-blog-gtx-vs-tesla/

ECC memory includes extra memory bits designed to detect and fix memory errors, which is of paramount importance to the successful completion of high performance, double-precision code. ECC memory ensures that the results of computations run on a Tesla are the same every time; the same tasks run on a high-end GTX card like the Titan can vary from job to job. Clearly, for scientific computing, the Tesla offers the best consistency.
 

evilr00t

Member
Nov 5, 2013
29
8
81
I've seen reports of the Bristol Ridge AM4 A12-9800 iGPU being capable of 500+ GFLOPS of DP floating point.

If true it would be ~1/3 the price for ~20 times better performance.

I'd love to see these reports myself.

IIRC GCN3 (Tonga/Fiji) had a 1:16 DP:SP ratio. The A12's IGP has about 1.1 SP TFLOPS, so I'd be surprised to see 500 DP GFLOPS out of it.
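
The arithmetic behind that doubt is easy to check. Assuming the roughly 1.1 TFLOPS single-precision figure above, the sketch below scales it by a couple of double-precision rates; the 1:2 entry is included only as a hypothetical half-rate case for comparison, not as a claim about this chip.

```python
# Approximate single-precision throughput of the A12 IGP, as quoted above.
igp_sp_gflops = 1100.0  # ~1.1 TFLOPS

def dp_gflops(sp_gflops, dp_rate):
    """Double-precision throughput implied by a DP:SP rate such as 1/16."""
    return sp_gflops * dp_rate

# Rates to compare. The 1/2 entry is a HYPOTHETICAL half-rate case.
dp_rates = {"1:16": 1 / 16, "1:2": 1 / 2}

igp_dp_estimates = {
    label: dp_gflops(igp_sp_gflops, rate) for label, rate in dp_rates.items()
}

for label, gflops in igp_dp_estimates.items():
    print(f"DP at {label} rate: {gflops:.2f} GFLOPS")
```

At a 1:16 rate the result is well under 100 GFLOPS; something close to 500 GFLOPS would need roughly half-rate double precision.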
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
I've seen reports of the Bristol Ridge AM4 A12-9800 iGPU being capable of 500+ GFLOPS of DP floating point.

If true it would be ~1/3 the price for ~20 times better performance.

I'd love to see these reports myself.

IIRC GCN3 (Tonga/Fiji) had a 1:16 DP:SP ratio. The A12's IGP has about 1.1 SP TFLOPS, so I'd be surprised to see 500 DP GFLOPS out of it.


And here is an article where an A10-9700 was used:

http://diit.cz/clanek/bristol-ridge-double-precision

[Image: AIDA64 benchmark screenshot, Bristol Ridge A10-9700, half-rate DP (aida64_bristol_ridge_a10_9700_half_rate_dp_pcgh_pcgh.png)]


P.S. Would be very happy if someone could get ECC working with these APUs. (EDIT: This motherboard claims to support ECC with the 7th generation APUs. I'm sure there would be others that do as well)
 