AMD Ryzen (Summit Ridge) Benchmarks Thread (use new thread)

inf64 · Nov 18, 2016

TheELF said:
Did you even read what I wrote? How can SMT come on top if the top is already reached?
SMT is about throughput anyway (MT as you stated) and not about speed so it doesn't even matter for finding out how fast they will be.

So you are denying that a Zen core can execute MORE instructions per cycle, per core, in MT workloads than in ST workloads? Because that is what you are saying, and it is the total opposite of what the AMD Zen architect answered @ HC.

AtenRa · Nov 18, 2016

TheELF said:
Did you even read what I wrote? How can SMT come on top if the top is already reached?
SMT is about throughput anyway (MT as you stated) and not about speed so it doesn't even matter for finding out how fast they will be.

There is no way you will saturate a wide core like ZEN without SMT.

cytg111 · Nov 18, 2016

TheELF said:
Did you even read what I wrote? How can SMT come on top if the top is already reached?
SMT is about throughput anyway (MT as you stated) and not about speed so it doesn't even matter for finding out how fast they will be.

So in your opinion, if a core was born with 100 threads, the correct way of measuring this core's IPC is with all 100 threads at full load?

TheELF · Nov 18, 2016

inf64 said:
So you are denying that a Zen core can execute MORE instructions per cycle, per core, in MT workloads than in ST workloads? Because that is what you are saying, and it is the total opposite of what the AMD Zen architect answered @ HC.

There is a fixed number of commands per core. If one thread can use ALL of them (100% in EVERY cycle), then SMT can't do anything to improve on this 100% usage.
If one thread can only use, say, 80% of the available commands, then SMT can give you a 20% improvement.
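
To put numbers on that argument, here is a minimal sketch. It assumes an idealized core where a second thread can fill every issue slot the first thread leaves idle; real cores share caches, queues, and front-end bandwidth, so actual SMT gains are lower. The function name `smt_headroom` is made up for illustration. Note that 20% idle slots works out to at most 25% more throughput relative to the single-thread baseline, because the gain is measured against the 80% already in use.

```python
def smt_headroom(single_thread_utilization: float) -> float:
    """Upper bound on the relative throughput gain from SMT.

    Idealized model (an assumption, not a measurement): the second
    thread fills every issue slot the first thread leaves idle.
    """
    if not 0.0 < single_thread_utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    idle_fraction = 1.0 - single_thread_utilization
    # Gain is relative to what one thread already achieves.
    return idle_fraction / single_thread_utilization


for used in (1.0, 0.8, 0.5):
    gain = smt_headroom(used)
    print(f"{used:.0%} of slots used by one thread -> up to {gain:.0%} more throughput")
```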

TheELF · Nov 18, 2016

AtenRa said:
There is no way you will saturate a wide core like ZEN without SMT.

That's what I was saying from the beginning...

AtenRa · Nov 18, 2016

TheELF said:
There is a fixed number of commands per core. If one thread can use ALL of them (100% in EVERY cycle), then SMT can't do anything to improve on this 100% usage.
If one thread can only use, say, 80% of the available commands, then SMT can give you a 20% improvement.

The problem is you will never get 100% from a single thread with a wide core like ZEN.

TheELF said:
That's what I was saying from the beginning...

No you haven't.

inf64 · Nov 18, 2016

Wow, sorry to say this, but I have to stop discussing this matter with you, Elf, since you simply have no clue what you are talking about. AtenRa is a way more patient man than I am :/.

master_shake_ · Nov 18, 2016

Arachnotronic said:
Bro...

http://www.anandtech.com/show/2378

wow a price drop you say doesn't change the fact it was released at 851 dollars.

http://www.anandtech.com/show/2112

jpiniero · Nov 18, 2016

On pricing, remember that the 9590 was $300 when it was first released to the DIY market (at least according to AT; it was $900 at first for boutique OEMs lol). It didn't last that long at $300, as they cut it to $220 a month or two later. I do expect AMD to have crazy pricing for Zen initially and just cut it later.

bjt2 · Nov 18, 2016

raghu78 said:
Zen SR will be competing against Kabylake 4C/4T, 4C/8T and Broadwell 8C/16T when it launches in Q1 2017. In H2 2017 Zen SR will have to contend with the monster Skylake HEDT. Even in best case scenarios with Broadwell IPC, the 4C/8T Zen will have to contend with the 10-15% higher IPC of the Kabylake Core i5, combined with roughly 20% higher OC headroom, even if Zen SR can overclock to 4-4.2 GHz, as Kabylake looks to be able to easily hit 5 GHz. So there is no way that AMD can price 4C/8T Zen even on par with the unlocked Kabylake Core i5, as the Kabylake Core i5 will pretty much dominate in the majority of desktop workloads, which use up to 4 threads. The few apps which use 8 threads will still see the Kabylake Core i5 come out pretty much ahead, given the 20% higher max OC and 10-15% higher IPC. Zen's SMT is unlikely to give above a 20% perf increase. So yeah, there is no way 4C/8T Zen can launch for any price close to USD 300.

Mean IPC in consumer code is around 1/cycle or lower. HPC FPU code can reach 2.5 (e.g. Spec FP). Almost only power viruses can go over 3.
AMD can do 4 INT PLUS 4 FP PLUS 2 MEM.
INTEL can do 4 among FP and INT (and FP are limited to 2 true FP + 2 vecint) and 3-4 MEM.
So in some, if not most, cases AMD's SMT can even beat INTEL's...

TheELF · Nov 18, 2016

AtenRa said:
No you haven't.

Sure I have.

TheELF said:
It's not 40% faster, it's 40% more IPC. That's throughput, not speed. It will only be 40% faster if you actually find software that is able to use all 10 instructions the ZEN core has available per cycle.
Which will be pretty difficult, since there aren't many CPUs out there (if there are any) with 10 instructions per core. I guess that's why they went with Blender instead of some "traditional" benchmark.

AtenRa · Nov 18, 2016

TheELF said:
Sure I have.

40% higher IPC means 40% faster Single Thread Performance at the same clocks.

You can have + SMT on top of that.
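
The relationship behind that statement can be sketched as single-thread performance = IPC x clock. This is a minimal illustration, not a prediction: the 0.90 clock ratio in the second call is a made-up value to show what happens when clocks are not equal, and `relative_performance` is a hypothetical helper name.

```python
def relative_performance(ipc_ratio: float, clock_ratio: float = 1.0) -> float:
    """Single-thread performance relative to a baseline core.

    Performance scales as instructions-per-cycle times cycles-per-second,
    so the ratios simply multiply.
    """
    return ipc_ratio * clock_ratio


# 40% higher IPC at the same clocks: 40% faster single-thread.
same_clock = relative_performance(1.40)

# Hypothetical case (assumed value): same IPC uplift, but 10% lower clock.
lower_clock = relative_performance(1.40, 0.90)

print(f"same clocks:      {same_clock:.2f}x")
print(f"10% lower clocks: {lower_clock:.2f}x")
```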

Arachnotronic · Nov 18, 2016

master_shake_ said:
wow a price drop you say doesn't change the fact it was released at 851 dollars.

http://www.anandtech.com/show/2112

The price was already where it was when Phenom launched. That's what matters.

TheELF · Nov 18, 2016

AtenRa said:
40% higher IPC means 40% faster Single Thread Performance at the same clocks.

That is what you say it means. If you asked a court, they would come to the same conclusion they came to when people asked them what a core is.

KTE · Nov 18, 2016

cytg111 · Nov 18, 2016

TheELF said:
That is what you say it means. If you asked a court, they would come to the same conclusion they came to when people asked them what a core is.

Dude.. I see your point, however I guarantee that is not what everyone else thinks of as IPC.. including AMD.

SarahKerrigan · Nov 18, 2016

TheELF said:
It's not 40% faster, it's 40% more IPC. That's throughput, not speed. It will only be 40% faster if you actually find software that is able to use all 10 instructions the ZEN core has available per cycle.
Which will be pretty difficult, since there aren't many CPUs out there (if there are any) with 10 instructions per core. I guess that's why they went with Blender instead of some "traditional" benchmark.

First off, Zen can only sustain decode of four ISA ops per cycle - not ten. Obviously the uop cache will help there, some ops will crack to multiple uops, etc, but even measuring uops, it's a maximum of six per cycle (which is still not ten.) The 40% here is realistically calculated from generic ST workloads, likely integer-heavy ones. The purpose of having so many functional units is to allow a favorable instruction mix within a given machine width, not to somehow force you to use all ten per cyc to hit maximum performance.

Second, on the assessment that 10 execution pipes (which seems to be what you're referring to) is unusually many -

Power8 is 8-wide at the frontend, 10-issue, and has 16 execution pipes. P9-SMT8 is wider.

Multiflow Trace went up to 28-wide, front to back, in a VLIW uarch.

Intel Poulson is 12-issue in back, backing up a 6-wide frontend.

moonbogg · Nov 18, 2016

Arachnotronic said:
Bogg, wanna try something interesting? Download Geekbench 4, downclock your CPU to 4.2GHz, run the test, and post a link to the results.

I want to see how SNB @ 4.2GHz does in this test and how it compares to the known XV @ 4.2GHz results that are out there.

3930k@4.2

https://browser.geekbench.com/v4/cpu/1089967

Arachnotronic · Nov 18, 2016

moonbogg said:
3930k@4.2

https://browser.geekbench.com/v4/cpu/1089967

Thank you, sir Bogg.

My Broadwell-E 6950X @ 4.2GHz (2.8GHz cache) gets 4585, so that implies that BDW IPC is about 16.4% higher than your SNB, so that's a good first "sanity check" on those results.

According to GB4, the best AMD A12-9800 on record does 2749 single core (@4.2GHz single core). Multiply that by 1.4x and you get...3848.

So, yeah, I'm thinking Zen is an AMD flavored Sandy Bridge.

EDIT: For teh lulz, I OC'd my cache to 3.6GHz and re-ran the test. Got 4600 on single thread. L3$ speed doesn't seem to have much of an impact on GB4 single thread performance.
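
The arithmetic above, written out as a rough sketch. The A12-9800 and 6950X scores, the 1.4x uplift, and the 16.4% figure are the ones quoted in this post; the Sandy Bridge score is back-calculated from that 16.4% figure rather than read from the linked result, so treat it as an assumption.

```python
# Geekbench 4 single-core scores at 4.2GHz, as quoted in the post.
A12_9800_SCORE = 2749      # best A12-9800 on record
BDW_E_SCORE = 4585         # Broadwell-E 6950X

CLAIMED_IPC_UPLIFT = 1.40  # AMD's "40% more IPC" claim, applied to the A12-9800
BDW_OVER_SNB = 1.164       # Broadwell-E lead over Sandy Bridge stated in the post

# Projected Zen score if the whole 40% carries over to this benchmark.
zen_estimate = A12_9800_SCORE * CLAIMED_IPC_UPLIFT

# Sandy Bridge score implied by the 16.4% figure (assumption: back-calculated).
implied_snb = BDW_E_SCORE / BDW_OVER_SNB

print(f"Zen estimate:        {zen_estimate:.0f}")
print(f"Implied SNB score:   {implied_snb:.0f}")
print(f"Zen estimate vs SNB: {zen_estimate / implied_snb:.1%}")
```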

moonbogg · Nov 18, 2016

Well... let's just hope we somehow borked this little experiment, because slower than Sandy with clocks around 4.2 is not going to be super exciting, but it's still a great deal at $300 for 8/16 cores/threads. I think a lot of people would jump on an 8/16 Sandy-like chip for around $300. Good enough for decently high refresh gaming and much better at multi-threaded stuff. I won't be buying it, but I can't wait for real benchies.

majord · Nov 19, 2016

Arachnotronic said:
Thank you, sir Bogg.

My Broadwell-E 6950X @ 4.2GHz (2.8GHz cache) gets 4585, so that implies that BDW IPC is about 16.4% higher than your SNB, so that's a good first "sanity check" on those results.

According to GB4, the best AMD A12-9800 on record does 2749 single core (@4.2GHz single core). Multiply that by 1.4x and you get...3848.

So, yeah, I'm thinking Zen is an AMD flavored Sandy Bridge.

EDIT: For teh lulz, I OC'd my cache to 3.6GHz and re-ran the test. Got 4600 on single thread. L3$ speed doesn't seem to have much of an impact on GB4 single thread performance.

did you look at the outliers which made up the score?

the most significant of which (from the first A12 set):

HTML5 DOM:

Bristol ridge :932
Haswell-E : 3908

= about 4.2x the score (a ~319% IPC advantage)

Then there's the inclusion of Memory performance in the test

I'm not saying all workloads should show a consistent performance delta between architectures, they never do, but you do have to be able to recognise an outlier that's so extreme it's capable of skewing an IPC comparison by 20-30%

This, and the inclusion of things like AES (which work back in Excavator's favor), are reasons why Geekbench is not very useful for architecture comparisons.
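
A rough sketch of how one extreme subtest can drag a composite ratio. The HTML5 DOM scores are the ones listed above; everything else is invented for illustration: the composite is assumed to be a plain geometric mean, and the nine other subtests and their 1.5x lead are hypothetical. Geekbench's actual section weighting is not reproduced here.

```python
import math


def geomean(values):
    """Geometric mean of a sequence of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# HTML5 DOM scores from the post: Haswell-E vs Bristol Ridge.
dom_ratio = 3908 / 932

# Hypothetical: nine other subtests where the lead is a uniform 1.5x.
other_ratios = [1.5] * 9

without_outlier = geomean(other_ratios)
with_outlier = geomean(other_ratios + [dom_ratio])
skew = with_outlier / without_outlier - 1

print(f"HTML5 DOM ratio:           {dom_ratio:.2f}x")
print(f"Composite without outlier: {without_outlier:.2f}x")
print(f"Composite with outlier:    {with_outlier:.2f}x")
print(f"Skew from one subtest:     {skew:.1%}")
```

With fewer subtests in the mean, the same outlier shifts the composite further.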

deasd · Nov 19, 2016

Geekbench includes so many memory tests and instruction-specific tests, like AIDA64, that I don't think an overall score can tell us anything.
I'd prefer something like Fritzchess or wPrime, which are pure arithmetic and branch prediction, with no compiler tricks or long instructions. And either way, both Intel and AMD have shown too little improvement in these tests for several years now, which fits the impression that both have struggled to improve performance lately.

jpiniero · Nov 19, 2016

majord said:
I'm not saying all workloads should show a consistent performance delta between architectures, they never do, but you do have to be able to recognise an outlier that's so extreme it's capable of skewing an IPC comparison by 20-30%

I thought we had this discussion already... it's likely due to the 9800's lack of L3. The 8350 for instance gets around 3400 on the HTML 5 DOM test. If anything, it's great because it highlights weaknesses of a chip that you might not obviously see with just one test.

TheELF · Nov 19, 2016

deasd said:
I'd prefer something like Fritzchess or wPrime, which are pure arithmetic and branch prediction, with no compiler tricks or long instructions.

Why? How much of the daily software the average user runs do you think works like this?

majord · Nov 19, 2016

jpiniero said:
I thought we had this discussion already... it's likely due to the 9800's lack of L3. The 8350 for instance gets around 3400 on the HTML 5 DOM test. If anything, it's great because it highlights weaknesses of a chip that you might not obviously see with just one test.

Well, I'm sorry if it's been discussed before, I wasn't aware. But regardless, it may be great for highlighting corner case issues, but it's not representative of performance in general. More importantly for the purpose of this thread, it's no use at all for comparing architectures, since you want benchmarks that are predictable and not heavily influenced by cache or memory bandwidth.
