AMD Ryzen (Summit Ridge) Benchmarks Thread (use new thread)

Abwx · Oct 1, 2016

The Stilt said:
Could someone (who is interested enough) recalculate those Zeppelin GB ST figures with the assumption that the CPU worked at 1.0GHz instead of 1.45GHz? 1.0GHz is a valid frequency state for the SKU used in that leak and the lowest (plausible, non PG) of the available states the CPU could have operated at.

FP is what is most immune to RAM limitations, so looking at the individual FP ST scores, the chip is about 2.7x slower than a Bristol Ridge XV at 4GHz. This implies that at 1.44GHz Zen would be as fast as a 1.48GHz Bristol Ridge XV, which does not make sense at all, so the only plausible explanation is that the chip is working in the vicinity of 1GHz.
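
The arithmetic behind that estimate can be sketched as follows. This is only a rough check, assuming the FP score scales linearly with clock; the +40% per-clock uplift is AMD's stated target over Excavator rather than a measured figure, and the function names are made up for illustration.

```python
def equivalent_clock_ghz(reference_clock_ghz, slowdown):
    """Clock at which the reference CPU would match the slower score,
    assuming the score scales linearly with frequency."""
    return reference_clock_ghz / slowdown


def implied_clock_ghz(equiv_clock_ghz, per_clock_uplift):
    """Clock the new CPU would need for that score if it does
    (1 + per_clock_uplift) times the work per cycle (an assumption)."""
    return equiv_clock_ghz / (1.0 + per_clock_uplift)


# Bristol Ridge XV at 4GHz scores about 2.7x higher in the FP ST tests.
xv_equiv = equivalent_clock_ghz(4.0, 2.7)
# Assumed: Zen does 40% more work per clock than XV.
zen_clock = implied_clock_ghz(xv_equiv, 0.40)

print(f"XV-equivalent clock: {xv_equiv:.2f} GHz")
print(f"Implied Zen clock at +40% per clock: {zen_clock:.2f} GHz")
```

Under those assumptions the leaked score corresponds to an XV at about 1.48GHz, or a Zen running at roughly 1.06GHz.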

Arachnotronic · Oct 1, 2016

DrMrLordX said:
I agree with that, and I still think there's something seriously wrong going on with the Zen ES system being used to run GB4 in these leaks.

The hell it isn't. Intel leaked a bunch of consumer-level Conroe ES chips into the hands of community overclockers and got millions in free advertising as they wowed the crowds with how bad-ass those chips were. On launch they were all that and a bag of chips.

If 8c/16t Summit Ridge is matching similarly-clocked Broadwell in FP-heavy tasks like Blender, at the very least AMD could wipe out all the "oh no a Zen ES sucked at Geekbench it's terrible AMD resign now" type posts we're seeing expressly because AMD won't let anyone in the community play with one. All we get are scraps taken out of context from someone who may be blatantly crippling the performance of their test platform with who-knows-what kind of background tasks.

Maybe the boards just aren't "there" yet for proper Summit Ridge operation, I don't know. But the secrecy around Zen is really hurting AMD at this point.

XV is a lot stronger in SuperPi than SR or PD though. A lot stronger.

If the boards aren't there yet, how did AMD do the Blender demo?

Anyway, typically if you're sitting on something good and your current products aren't that good/competitive anyway, there is no harm in showing the prowess of your new chip.

That's why I'm so skeptical of that Blender demo -- if it can consistently edge out BDW-E in a lot of workloads, then why not hurt Intel's sales a bit by doing a reviewers workshop to show off the Zen ES?

Anyway, I would personally be surprised if Zen were measurably worse than Sandy Bridge in terms of perf/MHz, but then again a lot of us had high hopes for Barcelona and the 'Dozer lineage, too.

lolfail9001 · Oct 1, 2016

The most RAM-immune test in GB is the AES test, and it's evidently ~60% faster per clock than XV.

So, AMD delivered on promises (up to 40%), why are y'all still disappointed?

DrMrLordX · Oct 1, 2016

Arachnotronic said:
If the boards aren't there yet, how did AMD do the Blender demo?

I was thinking the same thing. The only plausible reasoning is that AMD has some test systems in-house using custom reference hardware that is not intended for public release at all. It may be that board partners are dragging their feet or just don't want their stuff to get out either.

I don't remember how it was that the Conroe ES guys did it back in the day, but didn't some of them run Conroe on some existing LGA775 systems with hacked BIOSes? I just don't remember. With AM4 that isn't gonna happen.

Dresdenboy · Oct 1, 2016

Arachnotronic said:
If the boards aren't there yet, how did AMD do the Blender demo?

Anyway, typically if you're sitting on something good and your current products aren't that good/competitive anyway, there is no harm in showing the prowess of your new chip.

Blender was on SR/AM4, not a 2S server board.

But it seems AMD still got sales with CZ (weren't revs/ASPs up there?) and will probably not want to turn the BRs (cheap 28nm) into inventory and then write them off like Llano.

The Stilt · Oct 1, 2016

The AM4 CRB "Myrtle" has been out for over a year AFAIK. Both "Diesel" (2P SP3 server) and "Myrtle" were seen in the Blender video. Those aren't all of them, though.

Arachnotronic · Oct 1, 2016

Dresdenboy said:
Blender was on SR/AM4, not a 2S server board.

But it seems AMD still got sales with CZ (weren't revs/ASPs up there?) and will probably not want to turn the BRs (cheap 28nm) into inventory and then write them off like Llano.

Carrizo SoCs do not serve HEDT, a market that AMD is largely out of. The parts that AMD does supply are typically sold as part of OEM systems to customers who don't know/care what CPU they're buying (and overwhelmingly buy Intel anyway).

There is no downside to showing off more Zen CPU benchmarks if the results are similar to the Blender demo. There is plenty of downside to public perception if it turns out to be a cherry picked one off demo that's not at all representative of true performance across a broad range of applications.

Phynaz · Oct 1, 2016

Arachnotronic said:
Carrizo SoCs do not serve HEDT, a market that AMD is largely out of. The parts that AMD does supply are typically sold as part of OEM systems to customers who don't know/care what CPU they're buying (and overwhelmingly buy Intel anyway).

There is no downside to showing off more Zen CPU benchmarks if the results are similar to the Blender demo. There is plenty of downside to public perception if it turns out to be a cherry picked one off demo that's not at all representative of true performance across a broad range of applications.

I've been saying it for months, if you've got something good coming you crow about it every chance you get. The lack of information says AMD doesn't have anything great to say.

The Stilt · Oct 1, 2016

An alternative explanation could be that the IPC itself is there, but truly competitive performance is not. For example, if the maximum frequencies are at the moment still in the <= 3.2GHz region, AMD might be hoping for process maturation to bring additional improvements, and holding further statements / releases until they truly know what kind of clocks they can achieve. At least some Zeppelin SKUs will be facing Kaby Lake, which will operate at up to 4.5GHz at stock. So even if they had perfectly competitive IPC with BDW, SKL, KBL but their design for any reason could not hit above 3.2GHz or so... It would explain a lot. Anyway, if we stick with the official statements AMD has made so far (i.e. 40% IPC improvement over Excavator), Zen won't be threatening the IPC of any Intel designs newer than Ivy Bridge anyway.
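
To put a number on the clock argument, here is a minimal sketch that models single-thread performance as IPC x frequency. The 3.2GHz and 4.5GHz figures are the ones from the post; the linear model and the function names are simplifying assumptions for illustration only.

```python
def relative_performance(ipc_ratio, clock_ghz, rival_clock_ghz):
    """Single-thread performance relative to a rival, modelled as IPC x frequency."""
    return ipc_ratio * clock_ghz / rival_clock_ghz


def ipc_ratio_needed(clock_ghz, rival_clock_ghz):
    """IPC advantage required to reach parity despite a clock deficit."""
    return rival_clock_ghz / clock_ghz


# Equal IPC, 3.2GHz part against a 4.5GHz Kaby Lake part.
at_equal_ipc = relative_performance(1.0, 3.2, 4.5)
needed = ipc_ratio_needed(3.2, 4.5)

print(f"Relative ST performance at equal IPC: {at_equal_ipc:.0%}")
print(f"IPC ratio needed for parity: {needed:.2f}x")
```

With identical IPC the 3.2GHz part lands at about 71% of the 4.5GHz part, and it would need roughly 1.41x the IPC to draw level.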

Abwx · Oct 1, 2016

Arachnotronic said:
That's why I'm so skeptical of that Blender demo -- if it can consistently edge out BDW-E in a lot of workloads, then why not hurt Intel's sales a bit by doing a reviewers workshop to show off the Zen ES?

Launch is still too far off to disclose anything other than what they have disclosed, and even then, I find that they have been quite informative when it comes to Zen and how they achieved the targeted performance.

As for Zen outperforming BDW in Blender, isn't that somewhat expected when looking at the published diagrams?

For instance, Zen can do 2 FP MUL + 2 FP ADD in a cycle, whereas so far Intel's HW is restricted to 2 FP MUL or 1 FP MUL + 1 FP ADD. I don't know if they improved this with later generations, but what is sure is that Zen has some arguments when it comes to FP; it's no wonder that AMD explicitly pointed to FP performance.

cdimauro · Oct 2, 2016

lolfail9001 said:
The most RAM-immune test in GB is the AES test, and it's evidently ~60% faster per clock than XV.

So, AMD delivered on promises (up to 40%), why are y'all still disappointed?

Because this single test relies on specific AES instructions, so it only shows how well the CPU performs on... guess what... the AES en/decryption task.

Which is clearly NOT representative of tons of other scenarios / kind of code.

Abwx said:
As for Zen outperforming BDW in Blender, isn't that somewhat expected when looking at the published diagrams?

For instance, Zen can do 2 FP MUL + 2 FP ADD in a cycle, whereas so far Intel's HW is restricted to 2 FP MUL or 1 FP MUL + 1 FP ADD. I don't know if they improved this with later generations, but what is sure is that Zen has some arguments when it comes to FP; it's no wonder that AMD explicitly pointed to FP performance.

Well, if you consider that Zen has 4 128-bit FPU units whereas Intel CPUs have only 2 of them for this kind of code, I can say that no, it isn't what was expected. In theory Zen has TWICE the FPU processing power with 128-bit code, and with Blender it showed only a 2% advantage...

That's not even counting the other resources (4 complex decoders, bigger uop cache, double L1-I cache, 4 integer units, etc.) that Zen brings to the table compared to the competitors, which should put it in an even better position. You can make a nice table and compare them.
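
As a rough sketch of the "twice the FPU power" claim, peak per-cycle throughput can be counted as pipes x lanes. The pipe counts are the ones claimed in this thread, not verified figures, and the calculation ignores FMA and the fact that Zen's pipes are split into 2 MUL + 2 ADD, so the peak would only be reached with a balanced instruction mix.

```python
def peak_fp_ops_per_cycle(pipes, vector_bits, element_bits=32):
    """Peak FP operations per cycle: one vector op per pipe, no FMA counted."""
    lanes = vector_bits // element_bits
    return pipes * lanes


# Pipe counts as claimed in the thread for 128-bit code (assumptions).
zen_peak = peak_fp_ops_per_cycle(pipes=4, vector_bits=128)
intel_peak = peak_fp_ops_per_cycle(pipes=2, vector_bits=128)

theoretical_ratio = zen_peak / intel_peak
observed_ratio = 1.02  # the ~2% Blender advantage mentioned above

print(f"Zen peak: {zen_peak} SP ops/cycle, Intel peak: {intel_peak} SP ops/cycle")
print(f"Theoretical ratio: {theoretical_ratio:.1f}x, observed: {observed_ratio:.2f}x")
print(f"Share of the theoretical advantage realized: {observed_ratio / theoretical_ratio:.0%}")
```

On those numbers the demo realized about half of the theoretical 2x ratio, which is the gap being questioned here.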

bjt2 · Oct 2, 2016

The Stilt said:
An alternative explanation could be that the IPC itself is there, but truly competitive performance is not. For example, if the maximum frequencies are at the moment still in the <= 3.2GHz region, AMD might be hoping for process maturation to bring additional improvements, and holding further statements / releases until they truly know what kind of clocks they can achieve. At least some Zeppelin SKUs will be facing Kaby Lake, which will operate at up to 4.5GHz at stock. So even if they had perfectly competitive IPC with BDW, SKL, KBL but their design for any reason could not hit above 3.2GHz or so... It would explain a lot. Anyway, if we stick with the official statements AMD has made so far (i.e. 40% IPC improvement over Excavator), Zen won't be threatening the IPC of any Intel designs newer than Ivy Bridge anyway.

I don't buy the <=3.2GHz thing.
Zen has a 19-stage integer pipeline. Bulldozer's wasn't disclosed, but it is somewhere between 15 and 20 stages, so Zen's FO4 should be in the same ballpark as Bulldozer's.
Polaris is more or less a shrink of the old architecture, and it gains 10-20% clock going from 28nm bulk to 14nm FF.
Bristol Ridge has a 3.8-4.2GHz quad core + GPU in 65W...
So...
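
Spelled out, the back-of-the-envelope reasoning looks like the sketch below. It is a deliberately naive extrapolation, not a prediction: it simply applies the 10-20% Polaris clock gain to the Bristol Ridge clocks quoted above and ignores that Zen is a different design on a different process flavour.

```python
def scaled_clock_range(base_low_ghz, base_high_ghz, gain_low, gain_high):
    """Apply a process-related clock gain range to a base clock range."""
    return base_low_ghz * (1.0 + gain_low), base_high_ghz * (1.0 + gain_high)


# Bristol Ridge 3.8-4.2GHz, Polaris-style 10-20% gain from 28nm bulk to 14nm FF.
low, high = scaled_clock_range(3.8, 4.2, 0.10, 0.20)

print(f"Naive 14nm FF clock range: {low:.2f}-{high:.2f} GHz")
```

Even the low end of that naive range, about 4.2GHz, sits well above 3.2GHz.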

cdimauro said:
Because this single test relies on specific AES instructions, so it only shows how well the CPU performs on... guess what... the AES en/decryption task.

Which is clearly NOT representative of tons of other scenarios / kind of code.

Well, if you consider that Zen has 4 128-bit FPU units whereas Intel CPUs have only 2 of them for this kind of code, I can say that no, it isn't what was expected. In theory Zen has TWICE the FPU processing power with 128-bit code, and with Blender it showed only a 2% advantage...

That's not even counting the other resources (4 complex decoders, bigger uop cache, double L1-I cache, 4 integer units, etc.) that Zen brings to the table compared to the competitors, which should put it in an even better position. You can make a nice table and compare them.

The latest Intel architectures have 4x256 memory units, versus 3x128 for Zen. In complex FPU calculations Zen should win (but only on 128-bit or non-FMAC code). In easy FPU calculations, limited by RAM bandwidth, Intel should win.

I regularly use Matlab, which has an auto-parallelization feature. It uses it for complex calculations, like evaluating a transcendental function for every element of a matrix, but it doesn't use it for simple calculations like matrix summation, because that is limited by RAM bandwidth and not by FPU resources...
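
The compute-bound versus bandwidth-bound distinction can be illustrated with a simple roofline-style estimate. This is a sketch only: the machine figures (50 GFLOP/s peak, 30 GB/s of memory bandwidth) and the ~50 flops per transcendental evaluation are made-up numbers chosen for illustration, not measurements of any CPU discussed here.

```python
def attainable_gflops(flops_per_byte, peak_gflops, bandwidth_gbs):
    """Roofline estimate: throughput is capped by the FPU or by memory traffic."""
    return min(peak_gflops, flops_per_byte * bandwidth_gbs)


def is_compute_bound(flops_per_byte, peak_gflops, bandwidth_gbs):
    """True when memory can feed the FPU faster than it can compute."""
    return flops_per_byte * bandwidth_gbs >= peak_gflops


# Hypothetical machine, numbers chosen only for illustration.
PEAK_GFLOPS = 50.0
BANDWIDTH_GBS = 30.0

# C = A + B on doubles: 1 flop per element, 24 bytes moved (2 loads + 1 store).
sum_intensity = 1 / 24
# Transcendental function per element: assumed ~50 flops, 16 bytes moved (1 load + 1 store).
func_intensity = 50 / 16

for name, intensity in [("matrix sum", sum_intensity), ("transcendental", func_intensity)]:
    rate = attainable_gflops(intensity, PEAK_GFLOPS, BANDWIDTH_GBS)
    bound = "compute" if is_compute_bound(intensity, PEAK_GFLOPS, BANDWIDTH_GBS) else "bandwidth"
    print(f"{name}: {rate:.2f} GFLOP/s attainable, {bound}-bound")
```

With these assumed numbers the matrix sum is stuck at the memory limit no matter how many FP pipes or threads are thrown at it, while the transcendental case is limited by the FPU and benefits from extra compute.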

cdimauro · Oct 2, 2016

AFAIK latest Intel's microarchitectures can do 2x256 bits loads and 1x256 bits store (per cycle).

EDIT: due to this, with 128-bit code basically you are wasting half L1D bandwidth.
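
A quick sketch of that bandwidth arithmetic, assuming each port handles at most one access per cycle and that two 128-bit accesses cannot share a port in the same cycle. The port counts are the ones stated above; the function name is made up for illustration.

```python
def l1d_bytes_per_cycle(load_ports, store_ports, access_bits):
    """Peak L1D traffic per cycle when every port moves one access of the given width."""
    return (load_ports + store_ports) * access_bits // 8


# 2 loads + 1 store per cycle, as stated above.
with_256bit = l1d_bytes_per_cycle(2, 1, 256)
with_128bit = l1d_bytes_per_cycle(2, 1, 128)

print(f"256-bit code: {with_256bit} bytes/cycle")
print(f"128-bit code: {with_128bit} bytes/cycle ({with_128bit / with_256bit:.0%} of peak)")
```

That gives 96 bytes per cycle with 256-bit accesses against 48 bytes per cycle with 128-bit ones.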

bjt2 · Oct 2, 2016

cdimauro said:
AFAIK latest Intel's microarchitectures can do 2x256 bits loads and 1x256 bits store (per cycle).

EDIT: due to this, with 128-bit code basically you are wasting half L1D bandwidth.

But why does it have 4 memory ports/pipelines?
EDIT: moreover, why, although Zen has more int and FP ports, does Intel have higher IPC (it seems that Blender is only an exception)?

StrangerGuy · Oct 2, 2016

Phynaz said:
I've been saying it for months, if you've got something good coming you crow about it every chance you get. The lack of information says AMD doesn't have anything great to say.

I think you got it backwards, the amount of noise AMD is making about Zen is undermining their claims. Intel simply sent Anand actual chips to do previews while Apple doesn't say anything about the SoC performance until their release event. Both companies simply let their hardware do the actual talking.

lolfail9001 · Oct 2, 2016

StrangerGuy said:
I think you got it backwards, the amount of noise AMD is making about Zen is undermining their claims. Intel simply sent Anand actual chips to do previews while Apple doesn't say anything about the SoC performance until their release event. Both companies simply let their hardware do the actual talking.

You guys are both correct. AMD makes noise and provides little useful information (sorry, but a block layout is barely useful for the end user) at the same time.

Abwx · Oct 2, 2016

cdimauro said:
AFAIK latest Intel's microarchitectures can do 2x256 bits loads and 1x256 bits store (per cycle).

EDIT: due to this, with 128-bit code basically you are wasting half L1D bandwidth.

2 X 256b load only for AVX, for regular SSE2 it will be 2 x 128b loads..

cdimauro said:
Well, if you consider that Zen has 4 128-bit FPU units whereas Intel CPUs have only 2 of them for this kind of code, I can say that no, it isn't what was expected. In theory Zen has TWICE the FPU processing power with 128-bit code, and with Blender it showed only a 2% advantage...

2x the FPU resources is useless if there's not enough instruction parallelism and there are dependencies on top; besides, the 2% advantage is what was shown by AMD. I wouldn't expect them to display too big an advantage, as it would be counterproductive to do so now; intelligence would recommend selecting a file and settings such that you win without disclosing the real extent of the performance.

cdimauro · Oct 2, 2016

bjt2 said:
But why does it have 4 memory ports/pipelines?

One is used only for address calculation, if I remember correctly.

EDIT: moreover, why, although Zen has more int and FP ports, does Intel have higher IPC (it seems that Blender is only an exception)?

This is the one million dollars question.

I don't know. They have very different microarchitectures.

Abwx said:
2 X 256b load only for AVX, for regular SSE2 it will be 2 x 128b loads..

That's more or less what I stated, except that also AVX 128-bit code has the same SSE2 limits.

2x the FPU resources is useless if there's not enough instruction parallelism and there are dependencies on top,

"Unfortunately" this isn't the case with Blender: dependencies aren't that much and it's easier to extract more instruction parallelism, because the executed code is more "linear" and "homogeneous".

Read: it's completely different from, let's say, an emulator, where you have the exact opposite scenario.

besides, the 2% advantage is what was shown by AMD. I wouldn't expect them to display too big an advantage, as it would be counterproductive to do so now; intelligence would recommend selecting a file and settings such that you win without disclosing the real extent of the performance.

Who knows the real reasons, but it's quite strange that they still don't report any other benchmarks, especially because this gap can easily be filled by leaked benchmarks, which can give the product a bad reputation.

bjt2 · Oct 2, 2016

Abwx said:
2 X 256b load only for AVX, for regular SSE2 it will be 2 x 128b loads..

2x the FPU resources is useless if there's not enough instruction parallelism and there are dependencies on top; besides, the 2% advantage is what was shown by AMD. I wouldn't expect them to display too big an advantage, as it would be counterproductive to do so now; intelligence would recommend selecting a file and settings such that you win without disclosing the real extent of the performance.

2x FPU resources are good for SMT: if you have an intensive parallel FPU workload and you factor in the int+fp ports, AMD should have an advantage. 4 int + 4 fp versus 4 int/fp should do little for 1 thread, but very much for two threads... I will not be surprised if Zen wins in MT and continues to struggle in ST with Blender, Cinebench, etc...

cdimauro · Oct 2, 2016

Isn't Blender a MT application?

Abwx · Oct 2, 2016

cdimauro said:
"Unfortunately" this isn't the case with Blender: dependencies aren't that much and it's easier to extract more instruction parallelism, because the executed code is more "linear" and "homogeneous".

That's a random statement contradicted by real numbers. If instruction parallelism in Blender were that good, a single thread would fill the pipeline and saturate all the execution resources, and you wouldn't see a 50-60% gain with SMT on Intel CPUs despite their having only two SSE2 FP pipelines. So much for the "linearity and homogeneity" of the code...
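
The inference from SMT gain to single-thread utilization can be sketched like this. It gives an upper bound only, assuming two threads together cannot exceed the core's peak throughput and ignoring any contention between the threads; the function name is hypothetical.

```python
def max_single_thread_utilization(smt_gain):
    """Upper bound on the share of peak core throughput used by one thread,
    given that a second SMT thread raises total throughput by smt_gain."""
    return 1.0 / (1.0 + smt_gain)


# The 50-60% SMT gain mentioned above for Blender on Intel CPUs.
for gain in (0.5, 0.6):
    bound = max_single_thread_utilization(gain)
    print(f"SMT gain {gain:.0%}: one thread uses at most {bound:.1%} of peak")
```

A 50-60% SMT gain therefore means a single thread was leaving at least a third of the core's achievable throughput unused.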

cdimauro · Oct 2, 2016

The question is quite simple: you cannot use all available ports all the time, even with such kind of favorable code.

bjt2 · Oct 2, 2016

cdimauro said:
Isn't Blender a MT application?

Yes, but ST performance is important because there is software that is ST...
Anyway, if one processor shines in ST and one in MT, there will always be two factions...

cdimauro · Oct 2, 2016

So, do you plan to see Zen running better with ST code, since with MT it doesn't seem to shine?

Arachnotronic · Oct 2, 2016

StrangerGuy said:
I think you got it backwards, the amount of noise AMD is making about Zen is undermining their claims. Intel simply sent Anand actual chips to do previews while Apple doesn't say anything about the SoC performance until their release event. Both companies simply let their hardware do the actual talking.

I think Phynaz meant that AMD should do what you are saying -- send a Zen chip to a popular website and let them go to town showing how it curb stomps Broadwell-E

If you follow the link to SA forums that Dresdenboy provided, apparently AMD is staying mum on performance of Zen and not providing benchmarks to its partners.

http://semiaccurate.com/forums/showpost.php?p=273707&postcount=3858

This is not how a company with a game changing chip acts.
