AMD Ryzen (Summit Ridge) Benchmarks Thread (use new thread)

Status
Not open for further replies.

Abwx

Lifer
Apr 2, 2011
11,885
4,873
136
I think Phynaz meant that AMD should do what you are saying -- send a Zen chip to a popular website and let them go to town showing how it curb stomps Broadwell-E :)

They will, once it's launched. Or did any manufacturer of anything ever send an ES to a site before the product was launched? Tell us when that has happened; I'm very interested to see whether any other firm ever bowed to such unrealistic and irrational demands...

The question is quite simple: you cannot use all available ports all the time, even with such kind of favorable code.

Not sure that your statement is not self contradictory..

In Blender it is obvious that HW, for instance, doesn't execute 2 FP MUL or 1 FP MUL + 1 FP ADD for a single thread each cycle; otherwise there wouldn't be enough resources left to gain 50-60% when pushing a second thread onto the same core. The ratio suggests that the code has dependencies such that only about 1.3 FP ops per thread per cycle are executed.

This explains both HW's huge SMT gain and Zen's inability to do much better than BDW despite a more adequate FPU.

If the first thread does 1.3 FP ops each cycle, then Zen could theoretically execute 2.6 FP ops/cycle when using SMT, that is, 30% more than Haswell. Of course, this would require the mix of ops in the code to be such that a unit comprising 2 FP MUL + 2 FP ADD could provide 30% more ops/cycle than a unit comprising 2 FP MUL or 1 FP MUL + 1 FP ADD, which is unlikely. For instance, if the ops are mainly FP MULs with few FP ADDs, then the two cores will yield about the same throughput in both ST and SMT.
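The arithmetic in this post can be checked with a rough sketch. It assumes the simplest possible model, which is not stated in the thread: each thread sustains 1.3 FP ops/cycle on its own, and two threads on one core are limited only by the FPU's peak rate. The function name and constants are illustrative.

```python
# Back-of-the-envelope SMT model for the argument above.
# Assumption: a second thread adds another 1.3 FP ops/cycle unless the
# FPU's peak issue rate caps the total.

def smt_throughput(per_thread_ops: float, peak_ops: float, threads: int = 2) -> float:
    """FP ops/cycle for one core running `threads` threads, capped at peak."""
    return min(per_thread_ops * threads, peak_ops)

PER_THREAD = 1.3     # FP ops/cycle per thread, figure from the post
HASWELL_PEAK = 2.0   # 2 FP MUL, or 1 FP MUL + 1 FP ADD, per cycle
ZEN_PEAK = 4.0       # 2 FP MUL + 2 FP ADD per cycle (best-case op mix)

haswell_smt = smt_throughput(PER_THREAD, HASWELL_PEAK)
zen_smt = smt_throughput(PER_THREAD, ZEN_PEAK)

print(f"Haswell SMT gain: {haswell_smt / PER_THREAD - 1:.0%}")        # 54%
print(f"Zen vs Haswell with SMT: {zen_smt / haswell_smt - 1:.0%}")    # 30%
```

Under this model Haswell's SMT gain lands inside the 50-60% range quoted above, and the 30% figure for Zen only holds if the op mix can actually use all four pipes.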
 

cdimauro

Member
Sep 14, 2016
163
14
91
I think you've forgotten that even FP-intensive code has to execute scalar/integer instructions, as well as perform loads/stores from/to memory.

In the case of Zen, it was explained by another guy here that an FP instruction which accesses memory needs 2 uops, one that goes to the L/S unit, and another one to one FPU port (when it has the value coming from memory). So, it keeps busy two different ports.

And you have the limit of 6 uops dispatched per cycle by the Micro-op cache, even with 10 total ports available each cycle...
 

Abwx

Lifer
Apr 2, 2011
11,885
4,873
136
In the case of Zen, it was explained by another guy here that an FP instruction which accesses memory needs 2 uops, one that goes to the L/S unit, and another one to one FPU port (when it has the value coming from memory). So, it keeps busy two different ports.

And you have the limit of 6 uops dispatched per cycle by the Micro-op cache, even with 10 total ports available each cycle...

You are assuming that every dispatched block of 6 uops completes in a single cycle. That's not the case: the 6 uops dispatched per cycle will sometimes need more than one cycle to be completed by the execution units. If an extra cycle is required at some point, there will be up to 12 uops in the scheduler entries (dispatched during the normal and the extra cycle) ready to be sent to the execution units on the next cycle. So each time an extra cycle is needed, there will be more than 6 uops scheduled once it is over, hence the 10 parallel ports.
 

KTE

Senior member
May 26, 2016
478
130
76
Could someone (who is interested enough) recalculate those Zeppelin GB ST figures on the assumption that the CPU worked at 1.0GHz instead of 1.45GHz? 1.0GHz is a valid frequency state for the SKU used in that leak, and the lowest (plausible, non-PG) of the available states the CPU could have operated at.

If it still doesn't put it within ~20% of the IPC of Intel wells and lakes, then I'd say it is doing a DAR while running the benchmark. If not, AMD's lies have reached a completely new level.
My IVB 3667U at 1.1GHz scores the same as that Zen in ST.

Looking at past frequency power modes, if I was a betting man, that to me is pretty certain a test of Zen at 1GHz...

No chance I can see IVB having a >30% per clock lead over Zen, as that test is showing. Not even by AMDs poor pre-launch standards since 2006.

That would place Zen with IPC between SNB and HSW here.
Well, there's one result for a 2961Y (Haswell, 1.1 GHz), which gets 1258 ST in GB4. There are a couple of results for the 847 (Sandy Bridge, 1.1 GHz), for which the median is roughly 1122 once you throw out the really lowball scores. Ignoring AES, it's pretty competitive with that Zen Server result. So it's plausible that the ZS is running at only 1 GHz.
In these recent CPUs, turbo boost and power states can confuse everything so such benchmark analysis is generally incorrect. Example,

My IVB 17W is supposed to be at 2GHz stock, but turbos to 3.2/3.0GHz stock. It stays there in MT. That's 50% higher clocks in MT.

Then, if I turn off TB, it still sticks at 2.5GHz in ST/MT load.

Sent from HTC 10
(Opinions are own)
 

cdimauro

Member
Sep 14, 2016
163
14
91
Nice to talk about the Turbo. AFAIK, it was disabled in the Blender test, and in this case we cannot even take this result as a "good" one for performance measurement.
 

jpiniero

Lifer
Oct 1, 2010
16,818
7,258
136
The problem with the 1 Ghz idea is that the MT score is so low that even if you gave it 40% more it would still be terrible given 64 cores. I still think it's the MCM.

In these recent CPUs, turbo boost and power states can confuse everything so such benchmark analysis is generally incorrect. Example,

I was thinking the same thing, hence why I chose chips without any turbo. I'm sure the craptops still throttle though.
 
Mar 10, 2006
11,715
2,012
126
And that's exactly where the "40% IPC improvement over Excavator" should put Zen, given that the statement itself is accurate.

so I did some digging and apparently AMD promised 20% IPC boost in Steamroller over Piledriver. Was this actually accurate? From what I can tell, this wasn't the case across the board at all. Not even close.

5OT7TX3.png


We are working under the assumption that AMD's 40% number is actually legit, but they have fluffed up the numbers before...
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
so I did some digging and apparently AMD promised 20% IPC boost in Steamroller over Piledriver. Was this actually accurate? From what I can tell, this wasn't the case across the board at all. Not even close.

5OT7TX3.png


We are working under the assumption that AMD's 40% number is actually legit, but they have fluffed up the numbers before...

In the past AMD has also provided average figures for the IPC improvements. PD to SR was 10% on average and SR to XV was 5%.

kaveri_compute.jpg


AMD-Carrizo-APU_28nm-x86-5-IPC.jpg
 

Abwx

Lifer
Apr 2, 2011
11,885
4,873
136
The problem with the 1 Ghz idea is that the MT score is so low that even if you gave it 40% more it would still be terrible given 64 cores. I still think it's the MCM.
.

How do you know that it's 64C and not 32C/64T?

There's no mention of the number of threads in the submission, only 2 CPUs/64C. That could be 64C without SMT, but then I wouldn't see the purpose of testing a chip that isn't fully functional, or of switching SMT off.

Scaling gets up to 40+ or so in some tests. Under those conditions this would suggest about a 25% SMT gain if the platform is 2 x 16C/32T = 32C/64T, assuming that the software scales at 100%. If that's not the case, the SMT gain should be scaled up by the inverse of the software scaling ratio.
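The estimate above can be written out as a small sketch. It assumes the simple relation MT scaling = physical cores x software scaling efficiency x (1 + SMT gain), which is an interpretation of the post rather than anything stated in it; the function name and the 0.9 efficiency value are hypothetical.

```python
# Implied SMT gain from an observed MT/ST scaling factor.
# Assumed model: mt_scaling = cores * software_efficiency * (1 + smt_gain)

def implied_smt_gain(mt_scaling: float, physical_cores: int,
                     software_efficiency: float = 1.0) -> float:
    """Return the SMT gain (0.25 means 25%) implied by the scaling factor."""
    if not 0 < software_efficiency <= 1:
        raise ValueError("software_efficiency must be in (0, 1]")
    return mt_scaling / (physical_cores * software_efficiency) - 1

# 40x scaling on 2 x 16C/32T, software scaling perfectly across cores
print(f"{implied_smt_gain(40, 32):.0%}")        # 25%

# Same scaling if the software only reaches 90% efficiency (hypothetical value)
print(f"{implied_smt_gain(40, 32, 0.9):.0%}")   # 39%
```

If the platform were really 64 physical cores, the same 40x figure would imply scaling well below 100% and says nothing about SMT.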

Besides, this Zen platform apparently has much lower RAM bandwidth than the Bristol Ridge ES; assuming the latter uses 2 x 2400MHz, the Zen platform is running 1600MHz RAM on two channels.

http://browser.primatelabs.com/geekbench3/compare/6032158?baseline=8076878
 

Glo.

Diamond Member
Apr 25, 2015
5,930
4,991
136
My IVB 3667U at 1.1GHz scores the same as that Zen in ST.

Looking at past frequency power modes, if I was a betting man, that to me is pretty certain a test of Zen at 1GHz...

No chance I can see IVB having a >30% per clock lead over Zen, as that test is showing. Not even by AMDs poor pre-launch standards since 2006.

That would place Zen with IPC between SNB and HSW here.

In these recent CPUs, turbo boost and power states can confuse everything so such benchmark analysis is generally incorrect. Example,

My IVB 17W is supposed to be at 2GHz stock, but turbos to 3.2/3.0GHz stock. It stays there in MT. That's 50% higher clocks in MT.

Then, if I turn off TB, it still sticks at 2.5GHz in ST/MT load.

Sent from HTC 10
(Opinions are own)
Between Sandy Bridge and Haswell is... Ivy Bridge Architecture.

That is actually not bad at all. The last bit unknown would be the core clocks of the CPUs.
 

mikk

Diamond Member
May 15, 2012
4,299
2,383
136
Between Sandy Bridge and Haswell is... Ivy Bridge Architecture.

That is actually not bad at all. The last bit unknown would be the core clocks of the CPUs.


This would be extremely poor. Did you read properly? A much lower clocked Ivy Bridge with half the L3 Cache as fast as Zen @ ST. This is so poor that something can't be right.
 

Glo.

Diamond Member
Apr 25, 2015
5,930
4,991
136
This would be extremely poor. Did you read properly? A much lower clocked Ivy Bridge with half the L3 Cache as fast as Zen @ ST. This is so poor that something can't be right.
I was talking about IPC. If it really is on the level of Ivy Bridge then it is pretty good. My mid-2012 2.3 GHz MacBook Pro had an Ivy Bridge CPU and it scored around 3000 pts in GB.

This is very important for any APU.
 

lolfail9001

Golden Member
Sep 9, 2016
1,056
353
96
How do you know that it's 64C and not 32C/64T?

There's no mention of the number of threads in the submission, only 2 CPUs/64C. That could be 64C without SMT, but then I wouldn't see the purpose of testing a chip that isn't fully functional, or of switching SMT off.

Scaling gets up to 40+ or so in some tests. Under those conditions this would suggest about a 25% SMT gain if the platform is 2 x 16C/32T = 32C/64T, assuming that the software scales at 100%. If that's not the case, the SMT gain should be scaled up by the inverse of the software scaling ratio.

Besides, this Zen platform apparently has much lower RAM bandwidth than the Bristol Ridge ES; assuming the latter uses 2 x 2400MHz, the Zen platform is running 1600MHz RAM on two channels.

http://browser.primatelabs.com/geekbench3/compare/6032158?baseline=8076878
From the fact that it lists the caches with a 32x multiplier.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
This would be extremely poor. Did you read properly? A much lower clocked Ivy Bridge with half the L3 Cache as fast as Zen @ ST. This is so poor that something can't be right.

Have you even looked at what kind of IPC Excavator has? Sandy / Ivy Bridge kind of IPC on average is exactly what you should get when you increase it by 40%.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
I think it is paramount for AMD to have an IPC halfway between IB and Haswell (~5% faster than IB and ~5% slower than Haswell).

If we look at AT generational IPC comparison we can see that this would put Zen at ~10-12% lower IPC than Skylake (in non AVX256/FMA code). They would also need to bump the base and turbo to around 3.2/3.3 and 3.7Ghz in order to be competitive with 8C Broadwell SKUs. Not an impossible scenario but a lot of things have to come together for all that to happen.
 

mikk

Diamond Member
May 15, 2012
4,299
2,383
136
I was talking about IPC. If it really is on the level of Ivy Bridge then it is pretty good.


You have to read properly. You said between Sandy and Haswell, which is not the case when Ivy Bridge @ 1.1 GHz = Zen @ 1.44 GHz.

And that is an Ivy Bridge with 4 MB L3.
 

lolfail9001

Golden Member
Sep 9, 2016
1,056
353
96
You have to read properly. You said between Sandy and Haswell, which is not the case when Ivy Bridge @ 1.1 GHz = Zen @ 1.44 GHz.

And that is an Ivy Bridge with 4 MB L3.
We are making a brave assumption that that sample could be running at 1Ghz.
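How much that assumption matters can be shown with a quick sketch. It takes the claim in the thread at face value, namely that the Ivy Bridge at 1.1 GHz and the Zen sample post the same ST score, and treats per-clock performance as score divided by clock. The function name is illustrative, and the model ignores turbo, throttling and memory effects.

```python
# Per-clock lead of chip A over chip B when both post the SAME score.
# With equal scores, score/GHz ratios reduce to the inverse clock ratio.

def per_clock_lead(ghz_a: float, ghz_b: float) -> float:
    """Return A's per-clock advantage over B (0.31 means 31%)."""
    return ghz_b / ghz_a - 1

IVB_GHZ = 1.1  # Ivy Bridge clock quoted in the thread

ivb_lead_if_144 = per_clock_lead(IVB_GHZ, 1.44)  # Zen at the reported clock
ivb_lead_if_100 = per_clock_lead(IVB_GHZ, 1.0)   # Zen at the assumed 1 GHz state

print(f"IVB per-clock lead, Zen @ 1.44 GHz: {ivb_lead_if_144:+.0%}")  # +31%
print(f"IVB per-clock lead, Zen @ 1.00 GHz: {ivb_lead_if_100:+.0%}")  # -9%
```

The whole disagreement comes down to which clock the sample actually ran at: the same score reads as a 31% deficit or a small per-clock lead for Zen.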
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,143
136
If we look at AT generational IPC comparison we can see that this would put Zen at ~10-12% lower IPC than Skylake (in non AVX256/FMA code).

Except you picked the absolute worst case scenario for Skylake IPC gain out of all reviews. According to Hardware.fr it's 18.25% faster than IB per clock in applications and 21.15% in games. Based on PCLab it's 18.8%/22.7% faster in applications/games.

zaawansowane.png


gry.png
 

bjt2

Senior member
Sep 11, 2016
784
180
86
So, do you plan to see Zen running better with ST code, since with MT it doesn't seem to shine?
No, the opposite... Since MT seems to go well (Blender test), and since the SMT gain should be greater than Intel's due to many more ports, we can deduce that ST should be inferior to Intel's, because in MT, with probably superior SMT, AMD is about on par...

EDIT: I am talking about IPC only... If AMD, for instance, can clock higher in ST due to lower FO4, AMD can be on par with or even beat Intel even in ST...
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
If you calculate the difference from the AT review you will find that Skylake is 18% faster (1.112 x 1.033 x 1.027 ~= 1.18) than IB, thus confirming both the Hardware.fr and PCLab findings for the app IPC boost. Pairing Haswell with super fast DDR3 RAM would basically match the boost Skylake has from super fast DDR4 in games.
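The compounding is easy to verify. The sketch below multiplies the three per-generation factors quoted above, and then, as an assumption taken from the earlier "~5% faster than IB" estimate in this thread, checks what that would mean for Zen relative to Skylake. The variable names are illustrative.

```python
from math import prod

# Per-generation IPC factors quoted from the AT review in the post above.
steps = [1.112, 1.033, 1.027]

# Generational gains compound multiplicatively, they do not add.
skylake_over_ivb = prod(steps)
print(f"Skylake vs IB: {skylake_over_ivb:.3f}")   # 1.180

# Assumption from earlier in the thread: Zen lands ~5% above Ivy Bridge.
zen_over_ivb = 1.05
zen_deficit = 1 - zen_over_ivb / skylake_over_ivb
print(f"Zen below Skylake: {zen_deficit:.1%}")    # 11.0%
```

So the 18% figure and the earlier "~10-12% lower than Skylake" estimate are consistent with each other, given those inputs.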
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,143
136
Pairing Haswell with super fast DDR3 RAM would basically match the boost Skylake has from super fast DDR4 in games.

Skylake with the exact same DDR3 kit is still 16.8/16.5% faster than IB @ applications/games in Hardware.fr. Regarding PCLab, they paired the DDR3 systems with a very capable 2133 9-9-10-24 1N kit, and chose DDR4-2666 16-17-17-36 2N for Skylake (a far cry from 'super fast DDR4'), so I don't think there's any significant advantage here (if at all) - your 10-12% number is off. That is, if Zen can match Ivy Bridge IPC anyway.
 

cdimauro

Member
Sep 14, 2016
163
14
91
No, the opposite... Since MT seems to go well (Blender test), and since the SMT gain should be greater than Intel's due to many more ports, we can deduce that ST should be inferior to Intel's, because in MT, with probably superior SMT, AMD is about on par...

EDIT: I am talking about IPC only... If AMD, for instance, can clock higher in ST due to lower FO4, AMD can be on par with or even beat Intel even in ST...
Having more ports doesn't mean that you get better performance, and the Blender test clearly shows that (with Zen also having double the FP computing capability): 2% more is a negligible result.

P.S. Talking of IPC, as well.
 