AMD Ryzen (Summit Ridge) Benchmarks Thread (use new thread)

JDG1980 · Feb 16, 2017

The Stilt said:
Any specific workloads (must be available for Windows) you would like to see, besides the current ones?
Preferably open source, but that's not mandatory if the workload can otherwise be justified.

Floating Point:

3DPM V2.0b1 (Custom binary, ICL 2017)
Blackscholes (Custom binary, ICL 2017)
Blender 2.78.4 (Custom binary, MSVC 2015 + ICL 2017)
libBullet 2.85 (Custom binary, ICL 2017)
C-Ray (Custom binary, ICL 2017)
Caselab Euler3D (Public binary, ICL)
Cinebench 10 (Public binary, ICL)
Cinebench R11.5 (Public binary, ICL)
Cinebench R15 (Public binary, ICL)
Embree 2.13.0 (Public binary, ICL)
Euler3D CFD (Custom binary, ICL 2017)
GMPBench 0.2 / libGMP 6.12 (Custom binary, GCC 6.2)
Himeno (Custom binary, GCC 6.3)
Linpack 2017.014.0 (Public binary, ICL)
MCRT (Custom binary, ICL 2017)
NAMD 2.12 (Public binary, ICL)
NBody (Custom binary, ICL 2017)

Integer:

7Zip 16.04 x64 (Public binary, MSVC)
GCC 6.3 x86-64 (Public binary, GCC)
NQueen (Custom binary, iFortran)
OpenSSL 1.1.0d (Custom binary, GCC 6.2)
Stockfish 8 (Custom binary, GCC 6.3)
VampireNumbers (Custom binary, GCC 6.3)
X264 r2762 (Custom binary, GCC 5.3 + YASM)
X265 2+2 (Custom binary, GCC 6.3 + YASM)
WinRar 5.40 x64 (Public binary, MSVC?)

Some might wonder why ICL 2017 is the most common compiler used here.
That's because it is currently the fastest overall compiler (for FP) for all of the µarchs I'm testing (XV, Zen, HSW, KBL).
Naturally the vendor-dependent instruction dispatcher has been removed from all of the custom binaries. Needless to say, the newer ICLs (>= 2011) are no longer hostile towards AMD, so removing the dispatcher generally makes no difference.
The only two workloads where removing the dispatcher does make a difference are Caselab Euler3D (ICL from 2009 (?)) and Linpack. In Caselab Euler3D removing the dispatcher improves the performance on AMD CPUs by >30%, while in Linpack the dispatcher doesn't degrade the actual performance but prevents the program from running on AMD CPUs altogether.

It would be great if you could run the Dolphin CPU benchmark as well.

Doom2pro · Feb 16, 2017

JDG1980 said:
It would be great if you could run the Dolphin CPU benchmark as well.

Especially since it runs pretty rotten on AMD CPUs.

.vodka · Feb 16, 2017

I wonder where Zen would fit here.

I can't seem to find a result for Piledriver and Excavator. Probably well in the 20-30 minutes range

JDG1980 · Feb 16, 2017

.vodka said:
I wonder where Zen would fit here.

I can't seem to find a result for Piledriver and Excavator. Probably well in the 20-30 minutes range

Not quite that bad, but still well behind the Intel entries.

http://www.anandtech.com/show/10436/amd-carrizo-tested-generational-deep-dive-athlon-x4-845/4

.vodka · Feb 16, 2017

Excavator ties Sandy here? Interesting...

JDG1980 · Feb 16, 2017

.vodka said:
Excavator ties Sandy here? Interesting...

Higher RAM speed, perhaps? The X4 845 was tested with DDR3-2133, while Sandy Bridge only goes up to 1600 (at least officially). Also, Excavator is said to have some branch predictor improvements, which is going to help with emulation. If even Excavator is matching Sandy Bridge in a branch-heavy benchmark like this, it's good news for Zen.

lolfail9001 · Feb 16, 2017

.vodka said:
Excavator ties Sandy here? Interesting...

Well, considering that Dolphin presumably benefits from FMA, that could be the explanation.

inf64 · Feb 17, 2017

JDG1980 said:
Not quite that bad, but still well behind the Intel entries.

http://www.anandtech.com/show/10436/amd-carrizo-tested-generational-deep-dive-athlon-x4-845/4

If the 40% number holds in this benchmark (yes AMD said they exceeded it, but nonetheless):
3Ghz Ryzen 15.82 x 0.6 = 9.5.
Compare with Skylake's 9 and BDW's 9.8.

sm625 said:
I don't even think I can actually say why I went missing without "going missing" again. My new avatar should explain though. At any rate I'm glad people are looking into that questionable memory score.

There is another passmark baseline now, #774164. It scores roughly the same. I would post some screenshots from passmark, but apparently in North Korea, you are not even allowed to post screenshots of passmark.

Good find. Memory speed/latency definitely has an effect on some subtests. The latency is very bad when compared to Intel parts.

I'm now leaning toward the option that all scores were at the baseline 3.4GHz clock and no Turbo was involved. Yes it sounds crazy, but if there was Turbo involved then it would mean that AMD's Zen has up to 20% higher SMT gain vs. what Intel's latest core gets, which is not likely IMO. That and the crappy motherboard that was used seem to be the main clues why I think it was done @ base clock.

lolfail9001 · Feb 17, 2017

inf64 said:
If the 40% number holds in this benchmark (yes AMD said they exceeded it, but nonetheless):

It is actually 11.3: 40% faster means that it takes 1/1.4 of the time. The new number, the 55%, lands it at 10.2, which is virtually Haswell-level... like the rest of the leaks.
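For anyone who wants to check the arithmetic, here is a minimal sketch (not from the original post). The function name `runtime_after_speedup` is made up for illustration, and the only inputs are the 15.82 minute baseline and the 40% / 55% figures quoted in this thread. It assumes runtime scales as the inverse of speed.

```python
def runtime_after_speedup(baseline_minutes: float, speedup_percent: float) -> float:
    """Runtime after throughput rises by speedup_percent.

    Assumption: time is inversely proportional to speed, so a chip that is
    p% faster needs baseline / (1 + p/100) minutes, not baseline * (1 - p/100).
    """
    return baseline_minutes / (1.0 + speedup_percent / 100.0)


BASELINE_MINUTES = 15.82  # baseline time quoted in the thread

for pct in (40, 55, 100):
    minutes = runtime_after_speedup(BASELINE_MINUTES, pct)
    print(f"{pct:>3}% faster -> {minutes:.2f} min")
```

The 100% case is the useful sanity check: twice as fast means half the time (7.91 min), whereas subtracting 100% of the runtime would give zero.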

inf64 said:
Compare with Skylake's 9 and BDW's 9.8.

Looks pretty bad.

inf64 said:
Yes it sounds crazy, but if there was Turbo involved then it would mean that AMD's Zen has up to 20% higher SMT gain vs. what Intel's latest core gets, which is not likely IMO.

Making judgements on SMT scaling while only knowing performance with SMT on is kinda crazier.

.vodka · Feb 17, 2017

sm625 said:
There is another passmark baseline now, #774164. It scores roughly the same.

Thanks. It's not about the message, but how you deliver it. 😉

Google search for "AMD Ryzen #774164" returns 1D3601A2M88F3_39/36_N. Putting that string in PT's baseline search:

This was uploaded on February 13... the first baseline #771904 was uploaded on February 10... hmm.. interesting.

Really? Still using that awful DDR4 2400 17-17-17-39 in a completely different system? That's either a really unfortunate coincidence or.. could Passmark be misreading memory speeds and timings...?

Anyway... this sample running at 3.6GHz base and 3.9GHz turbo (that is also reported as disabled, again) scores the following:

CPU mark: 15334
Integer: 41306 Mops/sec
Prime: 38 million primes /sec
Compression: 25505 KBytes/sec
Physics: 753 Frames/sec
CPU ST: 1980 Mops/sec
FP: 15256 Mops/sec
SSE: 742 Million Matrices /sec
Encryption: 4005 Mbytes/sec
Sorting: 15709 Thousand Strings/sec

Memory mark: 1792
Database: 76 Kops/sec
Memory read uncached: 14151 Mbytes/sec
Memory threaded: 34367 Mbytes/sec
Memory read cached: 23806 Mbytes/sec
Memory write: 7627 Mbytes/sec
Memory latency: 76 ns

------------------------------------------------------

The #771904 baseline had these scores:

CPU mark: 15084
Integer: 39672 Mops/sec
Prime: 37 million primes /sec
Compression: 24723 KBytes/sec
Physics: 726 Frames/sec
CPU ST: 2046 Mops/sec
FP: 14807 Mops/sec
SSE: 717 Million Matrices /sec
Encryption: 3865 Mbytes/sec
Sorting: 15204 Thousand Strings/sec

Memory mark: 1855
Database: 78 Kops/sec
Memory read uncached: 14915 Mbytes/sec
Memory threaded: 34011 Mbytes/sec
Memory read cached: 28006 Mbytes/sec
Memory write: 7917 Mbytes/sec
Memory latency: 76 ns

----------------------------------------

Easier to read table:

As expected for a 200MHz base clock increase... some within margin of error. Still using that horribly slow RAM. Nothing new really. Memory latency still horrible or worse. This is using that AMD golemit motherboard that showed up somewhere else before... it's probably not final hardware like the MSI A320 motherboard used in the first baseline... could explain the worse memory scores.
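Since the comparison comes down to per-subtest deltas, here is a rough sketch (not part of the original post) that computes the percentage change from #771904 to #774164 using only the scores listed above. The names `new_scores`, `old_scores` and `percent_change` are illustrative, and note that for memory latency lower is better.

```python
# Scores copied from the two baselines listed above.
new_scores = {  # baseline #774164 (3.6GHz base)
    "CPU mark": 15334, "Integer": 41306, "Prime": 38, "Compression": 25505,
    "Physics": 753, "CPU ST": 1980, "FP": 15256, "SSE": 742,
    "Encryption": 4005, "Sorting": 15709,
    "Memory mark": 1792, "Database": 76, "Memory read uncached": 14151,
    "Memory threaded": 34367, "Memory read cached": 23806,
    "Memory write": 7627, "Memory latency": 76,
}
old_scores = {  # baseline #771904
    "CPU mark": 15084, "Integer": 39672, "Prime": 37, "Compression": 24723,
    "Physics": 726, "CPU ST": 2046, "FP": 14807, "SSE": 717,
    "Encryption": 3865, "Sorting": 15204,
    "Memory mark": 1855, "Database": 78, "Memory read uncached": 14915,
    "Memory threaded": 34011, "Memory read cached": 28006,
    "Memory write": 7917, "Memory latency": 76,
}


def percent_change(new_value: float, old_value: float) -> float:
    """Relative change of new_value versus old_value, in percent."""
    return (new_value - old_value) / old_value * 100.0


print(f"{'Subtest':<22}{'#771904':>9}{'#774164':>9}{'Change':>9}")
for name, new_value in new_scores.items():
    old_value = old_scores[name]
    delta = percent_change(new_value, old_value)
    print(f"{name:<22}{old_value:>9}{new_value:>9}{delta:>8.1f}%")
```

For reference, a 3.4GHz to 3.6GHz base clock bump on its own would be about +5.9%, so anything well below that (or negative, like CPU ST and the cached memory read) is not explained by clocks alone.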

There was a picture going around explaining how to decode those strings... where was it..

coercitiv · Feb 17, 2017

.vodka said:
There was a picture going around explaining how to decode those strings... where was it..

Let me get that.

.vodka · Feb 17, 2017

Oh, it's an F3 stepping 1ES = first engineering sample?... The first leak #771904 is a newer CPU, F4, a qualification sample (QS?) that is probably final silicon. Therefore it makes sense for this ES to be used in that AMD golemit motherboard, whereas the other baseline used a retail motherboard.

This new leak is silicon still being tweaked, and it shows: the first baseline #771904 has better memory scores and a higher single-threaded score, if I'm reading it right, despite being clocked lower and using the same awful memory.

Interesting...

cytg111 · Feb 17, 2017

lolfail9001 said:
It is actually 11.3: 40% faster means that it takes 1/1.4 of the time. The new number, the 55%, lands it at 10.2, which is virtually Haswell-level... like the rest of the leaks.

Looks pretty bad.

Making judgements on SMT scaling while only knowing performance with SMT on is kinda crazier.

common pitfall that division.. imagine a chip 100% faster.

Hitman928 · Feb 17, 2017

Ryzen would have to be ~60% faster than Carrizo in this bench to match BDW.

Does Dolphin use AVX(2)?

itsmydamnation · Feb 17, 2017

The problem is that the Intel latency numbers are flat-out wrong. There is no way I have a typical memory latency of 22ns, given my L3 latency is around 10ns.......

When I run the advanced benchmark I get 22.91ns to 70.49ns with no other detail. AIDA64 5.6 Cache and memory benchmark puts my:
3770K with 2000MHz 10-11-10 @ 57.9ns
8350 with 1833MHz ?-?-? (not rebooting my ESXi server to find out) @ 64ns.

edit: even look at Anandtech for Skylake:
http://www.anandtech.com/show/9483/intel-skylake-review-6700k-6600k-ddr4-ddr3-ipc-6th-generation/9

inf64 · Feb 17, 2017

lolfail9001 said:
It is actually 11.3: 40% faster means that it takes 1/1.4 of the time. The new number, the 55%, lands it at 10.2, which is virtually Haswell-level... like the rest of the leaks.

Looks pretty bad.

Making judgements on SMT scaling while only knowing performance with SMT on is kinda crazier.

Math 101.
Baseline is 15.82 . Assume Zen is 40% faster than 15.82.
10% of 15.82 is 1.582, 40% is 4x1.582=6.328. So Zen is 6.328 minutes faster than 15.82 minute time => 15.82 - 6.328 = 9.49
You need to brush up on your math, the very basic stuff.

cytg111 · Feb 17, 2017

inf64 said:
Math 101.
Baseline is 15.82 . Assume Zen is 40% faster than 15.82.
10% of 15.82 is 1.582, 40% is 4x1.582=6.328. So Zen is 6.328 minutes faster than 15.82 minute time => 15.82 - 6.328 = 9.49
You need to brush up on your math, the very basic stuff.

Do that math again with ryzen++ that is 100% faster.. 🙂.. I want THAT chip 🙂

tamz_msc · Feb 17, 2017

cytg111 said:
Do that math again with ryzen++ that is 100% faster.. 🙂.. I want THAT chip 🙂

I see what you did there 😉

This is a very common mistake when comparing running times - because lower time taken means better, instead of scores and FPS numbers where higher is better.

Hitman928 · Feb 17, 2017

inf64 said:
Math 101.
Baseline is 15.82 . Assume Zen is 40% faster than 15.82.
10% of 15.82 is 1.582, 40% is 4x1.582=6.328. So Zen is 6.328 minutes faster than 15.82 minute time => 15.82 - 6.328 = 9.49
You need to brush up on your math, the very basic stuff.

Do it in reverse by finding a chip 40% slower than your calculated Ryzen time and see if you get 15.82 as the answer (you won't). Cytg111 is correct. Ryzen would need to be more than 60% faster than Carrizo if it's going to get that time in this benchmark.
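The reverse check is easy to script. The sketch below is illustrative only (the helper name `required_speedup_percent` is invented here) and uses just the numbers quoted in this thread: the 15.82 minute baseline, the 15.82 x 0.6 result, and the 9.8 quoted for BDW.

```python
def required_speedup_percent(baseline_minutes: float, target_minutes: float) -> float:
    """How much faster (in %) a chip must be to cut baseline time to target time.

    Assumption: speed is the inverse of runtime, so the speed ratio is
    baseline / target.
    """
    return (baseline_minutes / target_minutes - 1.0) * 100.0


BASELINE_MINUTES = 15.82
claimed_minutes = BASELINE_MINUTES * 0.6  # the "40% less time" figure, 9.492

# Reverse check: undoing a 40% speedup means multiplying the time by 1.4.
# If 9.492 really were "40% faster", this would land back on 15.82.
round_trip = claimed_minutes * 1.4
print(f"round trip: {round_trip:.2f} min (baseline was {BASELINE_MINUTES})")

# Speedup actually needed to reach the claimed time, and to match BDW's 9.8.
for label, target in (("15.82 x 0.6", claimed_minutes), ("BDW 9.8", 9.8)):
    needed = required_speedup_percent(BASELINE_MINUTES, target)
    print(f"{label}: needs {needed:.1f}% faster")
```

Taking 40% off the runtime corresponds to a chip that is roughly two thirds faster, which is why the round trip does not return to 15.82.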

itsmydamnation · Feb 17, 2017

I didn't realize 15.82*0.6 was so hard

Hitman928 said:
Do it in reverse by finding a chip 40% slower than your calculated Ryzen time and see if you get 15.82 as the answer (you won't). Cytg111 is correct. Ryzen would need to be more than 60% faster than Carrizo if it's going to get that time in this benchmark.

People need to learn the difference between

15.82 * 0.6 and 15.82 /1.4 and when to use them.

inf64 is right: being 40% faster is completing something in 40% less time. 40% of 15.82 is 6.328.

Seriously go to calc.exe 15.82 - 40% = 9.492

inf64 · Feb 17, 2017

tamz_msc said:
I see what you did there 😉

This is a very common mistake when comparing running times - because lower time taken means better, instead of scores and FPS numbers where higher is better.

I guess I didn't take my coffee this morning xd. The runtime is the thing, you obviously cannot get 0 as an answer. A 100% faster chip (2x as fast) would do it in half the time, and a 40% faster one would do it, like lolfail and cytg said, in around 11.3. I was wrong. That is if we assume 40% is the number; it could be faster or even slower.

.vodka said:
Thanks. It's not about the message, but how you deliver it. 😉

Really? Still using that awful DDR4 2400 17-17-17-39 in a completely different system? That's either a really unfortunate coincidence or.. could Passmark be misreading memory speeds and timings...?

Anyway... this sample running at 3.6GHz base and 3.9GHz turbo (that is also reported as disabled, again) scores the following:

CPU mark: 15334
Integer: 41306 Mops/sec
Prime: 38 million primes /sec
Compression: 25505 KBytes/sec
Physics: 753 Frames/sec
CPU ST: 1980 Mops/sec
FP: 15256 Mops/sec
SSE: 742 Million Matrices /sec
Encryption: 4005 Mbytes/sec
Sorting: 15709 Thousand Strings/sec

Memory mark: 1792
Database: 76 Kops/sec
Memory read uncached: 14151 Mbytes/sec
Memory threaded: 34367 Mbytes/sec
Memory read cached: 23806 Mbytes/sec
Memory write: 7627 Mbytes/sec
Memory latency: 76 ns

------------------------------------------------------

The #771904 baseline had these scores:

CPU mark: 15084
Integer: 39672 Mops/sec
Prime: 37 million primes /sec
Compression: 24723 KBytes/sec
Physics: 726 Frames/sec
CPU ST: 2046 Mops/sec
FP: 14807 Mops/sec
SSE: 717 Million Matrices /sec
Encryption: 3865 Mbytes/sec
Sorting: 15204 Thousand Strings/sec

Memory mark: 1855
Database: 78 Kops/sec
Memory read uncached: 14915 Mbytes/sec
Memory threaded: 34011 Mbytes/sec
Memory read cached: 28006 Mbytes/sec
Memory write: 7917 Mbytes/sec
Memory latency: 76 ns

----------------------------------------

Easier to read table:

As expected for a 200MHz base clock increase... some within margin of error. Still using that horribly slow RAM. Nothing new really. Memory latency still horrible or worse. This is using that AMD golemit motherboard that showed up somewhere else before... it's probably not final hardware like the MSI A320 motherboard used in the first baseline... could explain the worse memory scores.

There was a picture going around explaining how to decode those strings... where was it..

The ES part has higher base/Turbo but manages to score lower in ST test by 3%? It's a bit odd.

cytg111 · Feb 17, 2017

inf64 said:
...I was wrong...

And that right there puts you in a position of raw power on the interwebs. Tips hat.

Gideon · Feb 17, 2017

Ryzen 1600X CPU-Z
https://www.reddit.com/r/Amd/comments/5ul5yt/ryzen_1600x_cpuz_benchmark/

EDIT:
Stock 7700K is 2254/10038 (source)
6700K @ 4GHz is 2031/8554 (same "sauce")

So not too shabby at all. In fact, as I posted in the other thread, per-core scaling on Ryzen seems to be extremely good compared to intel.

EDIT2: is that a 3.4Ghz clock on 0.374V ? That can't be correct, right? right?!

tamz_msc · Feb 17, 2017

i7 5930K at 3.5GHz is 1667/10806. (CPU-Z reference)
According to the TPU thread, one poster with an i7 6850K 4.6GHz got 2137/13526.

Color me impressed!
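To put the "per-core scaling" remark in numbers for the Intel parts quoted above, here is a small sketch (not from the original posts). The helper names are made up, and the physical core counts (4 for the 7700K and 6700K, 6 for the 5930K and 6850K) are added here and are not stated in the thread. The Ryzen 1600X score itself only appears in the linked Reddit post, so it is deliberately left out.

```python
# (single-thread score, multi-thread score, physical cores)
# Scores are the CPU-Z figures quoted in the thread; core counts are an
# assumption added for this sketch.
cpuz_scores = {
    "i7-7700K (stock)": (2254, 10038, 4),
    "i7-6700K @ 4GHz": (2031, 8554, 4),
    "i7-5930K @ 3.5GHz": (1667, 10806, 6),
    "i7-6850K @ 4.6GHz": (2137, 13526, 6),
}


def mt_over_st(single: float, multi: float) -> float:
    """Multi-thread score as a multiple of the single-thread score."""
    return multi / single


def scaling_per_core(single: float, multi: float, cores: int) -> float:
    """MT/ST ratio divided by physical cores (values above 1.0 reflect SMT gain)."""
    return mt_over_st(single, multi) / cores


for name, (single, multi, cores) in cpuz_scores.items():
    ratio = mt_over_st(single, multi)
    per_core = scaling_per_core(single, multi, cores)
    print(f"{name:<20} MT/ST = {ratio:.2f}  per core = {per_core:.2f}")
```

Plugging a Ryzen single/multi pair and its core count into the same two functions gives a like-for-like comparison, keeping in mind that differing turbo behaviour between ST and MT runs skews the ratio.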

Agent-47 · Feb 17, 2017

inf64 said:
Math 101.
Baseline is 15.82 . Assume Zen is 40% faster than 15.82.
10% of 15.82 is 1.582, 40% is 4x1.582=6.328. So Zen is 6.328 minutes faster than 15.82 minute time => 15.82 - 6.328 = 9.49
You need to brush up on your math, the very basic stuff.

Honestly both are correct as long as you say it right, i.e. KL is 40% faster compared to XV, or XV is 60% slower compared to KL.
