New Zen microarchitecture details

Page 53

Abwx

Lifer
Apr 2, 2011
10,947
3,457
136
Except you also keep bringing up Povray.

I linked two other renderers (assuming CB uses neither of those) and both show that the FX8350 is faster than IB.

You should rather question why some people keep bringing up CB R15 in light of those comparisons. In short, if a test does not show Intel in a good light it is to be discarded, but if it's AMD that is penalised then it's all good, apparently.

Besides, Cinema 4D is not software that is hugely used, contrary to 3DS Max, but since it's "better" than PoVray, no wonder that it's much used by a certain public for cross-brand comparison. True, PoV is not ICC compiled and does not slaughter AMD.


flaming/baiting is not allowed here.
Markfw900
 
Last edited by a moderator:

jhu

Lifer
Oct 10, 1999
11,918
9
81
Besides, Cinema 4D is not software that is hugely used, contrary to 3DS Max, but since it's "better" than PoVray, no wonder that it's much used by a certain public for cross-brand comparison. True, PoV is not ICC compiled and does not slaughter AMD.

Again, have you used any of these programs? Have you even used a compiler? I have. There is no appreciable difference in performance between Povray compiled with ICC and GCC.
 

nismotigerwvu

Golden Member
May 13, 2004
1,568
33
91
For as long as I remember, cache latency has been very bad for AMD CPUs. It's like 30 years and always bad. I really hope they deliver on this bragging here, because whatever tricks they have, if latency is still lagging it's going to be very tough against Intel's portfolio. Intel is cruising in second gear. So it needs to perform and do it cheap and lean.

It has really only been a construction core issue in terms of latency.

Going back to K7 and P6 they were roughly on par with each other, with Intel having some advantages and AMD having others, but the average data access was pretty consistent between them (http://www.realworldtech.com/willamette-update/4/)

Phenom I/II were on par with the Intel offerings of the time per cycle and operated at higher clockrates so they actually had an advantage here (http://www.anandtech.com/show/2702/5)
 

Abwx

Lifer
Apr 2, 2011
10,947
3,457
136
Again, have you used any of these programs? Have you even used a compiler? I have. There is no appreciable difference in performance between Povray compiled with ICC and GCC.

Let's say that there's no longer a difference, and only as of recently...

POVRAY36.png



http://www.extremetech.com/computin...ners-over-amd-athlon-benchmarking-shenanigans
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,551
14,510
136
Guys, the last several posts are NOT about Zen, get back on topic, or I will lock this thread.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
That is what I want to talk about too. Marketing was clear on the point that the IPC measurements were about core versus core, not processor versus processor. At most, maybe the core with L2 included.

Here I wanted to be more specific about the resulting performance (freq_ratio×IPC_ratio). But as The Stilt said, the statement he heard was about IPC.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
It has really only been a construction core issue in terms of latency.

Going back to K7 and P6 they were roughly on par with each other, with Intel having some advantages and AMD having others, but the average data access was pretty consistent between them (http://www.realworldtech.com/willamette-update/4/)

Phenom I/II were on par with the Intel offerings of the time per cycle and operated at higher clockrates so they actually had an advantage here (http://www.anandtech.com/show/2702/5)
I read the numbers differently. Even for Phenom II, which was improved: "Would you rather wait 15 cycles for an L2 of 512KB than 15 cycles for 2 times 6MB on the Core 2?"
Take K7: the tacked-on L2 was damn slow vs. the on-die L2 of the P3. Well, the K7 was stomping all over the P3 through brute force, but the cache was slooooow. Even the XP didn't come up to speed.
I always assumed it was due not only to the arch but also to Intel mastering process better in many ways?
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
A small and quick contender might also place some punches.

But it would also be interesting, what amount of R&D costs (less fab dev) has been used for K7 and P4.
If we look at the Bobcat core and the India team collecting the IP for Zacate and making it ready for production with 70 people, it shows that smaller and simpler designs are not that resource demanding.

I think part of BD's failure was that the design complexity exploded in the implementation phase. It only looked good on the drawing board.

If I recall, you think Zen is smallish for a big core. I think that can turn out to be vital for dev cost. I can't remember your estimates. But what are your current estimates vs. the Core line?
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
I looked up the figures AMD has posted for Steamroller (over PD) and Excavator (over SR), and the quoted figures / the phrases used seem to vary depending on whom the information is intended for. Even when we are talking about the exact same time period :sneaky:

In the NDA version of "AMD 2014 A-SERIES INTRODUCTION" sheet(1), dated to January 2014 AMD says: "Up to 20% IPC uplift (average ~10%) on up to 4 Steamroller CPU Cores."

In the public version of the same sheet(2) they simply say "Up to 20% IPC uplift on up to 4 Steamroller CPU Cores."

In the NDA version of "Tech Day 2015" slide(3) AMD says 4-15% higher IPC over Steamroller. In the ISSCC 2015 Excavator presentation(4) the figure is said to be 5%.

Since Lisa Su didn't specify Excavator by the name, it is pretty apparent that the "greater than 40% IPC uplift" expected by AMD is not over Excavator, but over Piledriver or Steamroller designs.

"Our Zen based CPU development is on track to achieve greater than 40% IPC uplift from our previous generation and we're on schedule to sample later this year.

01/19/2016 - Lisa Su (Earnings Conference Call)

When Excavator is specified by the name, the quoted figures are lower every time:

40% more instructions per clock*.

*Based on internal AMD estimates for “Zen” x86 CPU core compared to “Excavator” x86 CPU core.

05/06/2015 - Mark Papermaster (2015 Financial Analyst Day - Foundation For The Future PPT/PDF)


Both Piledriver and Steamroller are available for desktop and server segments, while Excavator is not available in either segment (so far mobile only). Despite the name, the Steamroller based Opterons were never real server CPUs, unlike Bulldozer & Piledriver (Orochi). Just like the Orochi die, the Zeppelin die is intended purely for servers and desktops. Is it possible that AMD just "accidentally" happened to plot both the Zeppelin and Orochi dies on the same (unlabeled) chart in the "investor presentation sheet" dated May 2016? I think not :sneaky: Orochi can mean either the o.g. Bulldozer (Orochi Rev. B) or Piledriver (Orochi Rev. C), however Piledriver is the only Orochi based design still shipping (i.e. the current generation core).

"Greater than 40% IPC uplift"
can obviously mean anything higher than 40%, however I think practically we are talking about anything between 40.1 - 49.9%. I would also imagine that if the expected IPC uplift was e.g. 51.1% AMD would use "greater than 50% IPC uplift" term instead of making an understatement to hold back the expectations.

According to AMD Steamroller is 10% faster than Piledriver on average, and Excavator is 5% on average faster than Steamroller. That means that on average Excavator should be 15.5% faster than Piledriver. That's more or less correct.

If the uplift is measured against Piledriver (Orochi): Zen has 21.3% - 29.8% higher IPC than Excavator on average

If the uplift is measured against Steamroller: Zen has 33.4% - 42.8% higher IPC than Excavator on average

If AMD hadn't included Orochi in the investor presentation sheet, I would be certain that they are comparing it to Steamroller. Excavator in this context is IMO out of the equation, since it has not been released for desktop (yet) and will never be available on server platforms. Meanwhile the Zeppelin die as well as the Orochi die are both intended purely for servers and desktops.

EDIT: The IPC figures are calculated from AMD's own figures (for generational improvement), with 40.1% (left) and 49.9% (right) improvements.
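The ranges above can be re-derived with a few lines of Python. This is only a sketch of the arithmetic in the post: the function and constant names are made up for illustration, and the 10% (Steamroller over Piledriver) and 5% (Excavator over Steamroller) inputs are the AMD averages quoted above, not measurements.

```python
# Hypothetical helper; the generational figures are the AMD averages quoted in the post.
SR_OVER_PD = 1.10  # Steamroller over Piledriver, average per AMD
XV_OVER_SR = 1.05  # Excavator over Steamroller, average per AMD


def uplift_over_excavator(zen_uplift_pct, baseline):
    """Convert a Zen IPC uplift quoted against `baseline` into an uplift over Excavator (in %)."""
    zen = 1.0 + zen_uplift_pct / 100.0
    if baseline == "piledriver":
        # Excavator relative to Piledriver: 1.10 * 1.05 = 1.155 (15.5% faster)
        excavator = SR_OVER_PD * XV_OVER_SR
    elif baseline == "steamroller":
        excavator = XV_OVER_SR
    else:
        raise ValueError(f"unknown baseline: {baseline}")
    return (zen / excavator - 1.0) * 100.0


for baseline in ("piledriver", "steamroller"):
    low = uplift_over_excavator(40.1, baseline)   # "greater than 40%", lower bound
    high = uplift_over_excavator(49.9, baseline)  # just under 50%, upper bound
    print(f"vs {baseline}: Zen is {low:.1f}% - {high:.1f}% over Excavator")
```

With these inputs the script reproduces the 21.3% - 29.8% and 33.4% - 42.8% ranges given above.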

(1) http://i.imgur.com/BQZDIgj.jpg
(2) http://www.slideshare.net/AMD/amd-2...accelerated-processing-units-codenamed-kaveri (page 7)
(3) http://i.imgur.com/BYp4Zz2.png
(4) http://www.amd.com/en-us/who-we-are/corporate-information/events/isscc (page 13)
 
Last edited:

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
I expect Zen cores to have similar IPC to Ivy Bridge, on average. Faster in certain workloads, but the average should be quite well matched.

Seeing how much additional effort AMD has taken with regard to Zen power and clock management, and how strict the VRM requirements (based on the existing boards) are, I don't feel that Zen will be able to scale too well in any aspect.

I agree on the IPC estimate - I've been saying for a while we'll probably see something near SB/IB levels. And I think that will be good enough to win back some market share if the price points, core counts, and TDP are solid.

What I'm not quite clear on is why you're drawing pessimistic interpretations from the efforts into "power and clock management" and the strictness of VRM requirements. Couldn't that just be AMD trying to get the best perf/watt in the face of a fab process that isn't quite as good as Intel's? Why does it necessarily mean there will be lower headroom? Is there something I'm overlooking here?
 
Mar 10, 2006
11,715
2,012
126
The Stilt

Notice the context around the "greater than 40% IPC uplift over our previous generation." She is talking about the pure Zen CPU which succeeds the pure Piledriver CPU that was sold into servers/desktops.

Our second growth pillar is in the $15 billion plus datacenter and infrastructure markets, driven by our FirePro GPU's and next-generation server CPUs. Our Zen based CPU development is on track to achieve greater than 40% IPC uplift from our previous generation and we're on schedule to sample later this year.

We have secured several key design wins with global OEMs for our Zen based server CPU and believe we can rapidly reestablish our presence in the datacenter when we bring our new products to market in 2017.

She's clearly talking about the server market and Steamroller never made it into servers. The "greater than 40%" uplift is, IMO, in reference to Piledriver.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,686
1,221
136
She's clearly talking about the server market and Steamroller never made it into servers. The "greater than 40%" uplift is, IMO, in reference to Piledriver.
That ignores products cancelled through lack of effort.
KM = Komodo, Ten core, 3x64-bit DDR3, 3x8 PCIe Lanes. (AMD leaked slides implied multiple phy interpretation aka 5x16 cHT to 5x16 PCIe) // Komodo's shrink to 28nm was called Viperfish.
KM-Bx(Piledriver-Production) = FM2 Judo Platform(Dual-channel), C2012 Sepang Platform(Triple-channel), G2012 Terramar Platform(Quad-channel)
VF-Ax(Steamroller) = FM2+ Judo-L Platform(Dual-channel), C2012r1 Macau Platform(Triple-channel), G2012r1 Dublin Platform(Quad-channel)
- Trailblazing KM debug and solving problems proactively
- For outstanding contributions to the power management COE on Komodo and Trinity projects
- Responsible for full chip verification closure for AMD's Komodo processor
GK = Gecko, Sixteen core, 2x64-bit DDR3/DDR4, 2x20 cHT-PCIe Links, 1x8 PCIe Link, 2x16 cHT Links.
GK-Ax(Steamroller), xx-Ax(Excavator-20nm/14nm)((TD-Ax? Trident?)) = AM4, G4C
The above doesn't get quotes as I can't find it on the webs. Even though there are mentions of a 40-50 MB L3 cache for some unspecific AMD CPU in between 2014 and 2016. If you look up 15h 40h-4Fh, you'll get a nice image through google.

Forward Quote:
Yet what is new is that the slide describes 2x performance against Orochi, but it still lacks detail.
The slide is meant to make you infer 2x. When it could in fact just be 30% or 40% over Orochi revision C.
 
Last edited:

deasd

Senior member
Dec 31, 2013
517
746
136
Some people in this forum are reading too much into the advertising, even if it's official. I wonder why nobody reads so much into Intel or NVIDIA advertising material. Funny.

Yet what is new is that the slide describes 2x performance against Orochi, but it still lacks detail.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Some people in this forum are reading too much into the advertising, even if it's official. I wonder why nobody reads so much into Intel or NVIDIA advertising material. Funny.

Yet what is new is that the slide describes 2x performance against Orochi, but it still lacks detail.

Ever heard the story about the boy who cried wolf?

Unlike Intel or nVidia, AMD has made several overly optimistic or even outright false statements regarding the characteristics or the performance of their products.

Also, some of their recent claims are extremely bold, and in order for them to come true, a whole page worth of "notes" (conditions) is required to cover their behinds from lawsuits.

nVidia screwed up with GTX 970 and that's why I will be paying extra attention to the information and the claimed specs of every single harvested SKU they release from now on. Just as an example.
 

rtsurfer

Senior member
Oct 14, 2013
733
15
76
Ever heard the story about the boy who cried wolf?

Unlike Intel or nVidia, AMD has made several overly optimistic or even outright false statements regarding the characteristics or the performance of their products.

Also, some of their recent claims are extremely bold, and in order for them to come true, a whole page worth of "notes" (conditions) is required to cover their behinds from lawsuits.

nVidia screwed up with GTX 970 and that's why I will be paying extra attention to the information and the claimed specs of every single harvested SKU they release from now on. Just as an example.

Yup, unlike AMD, nVidia is always honest.

All those awesome statements Jen-Hsun makes during the launch of every new GPU are true.
No saying that the 1080 is 2x faster than the Titan X, then slowly sliding in the "only in VR scenarios" part. That kind of stuff never happens.

They were completely honest about the pricing of the 1080 too.

Am I right..?




 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
The slide is meant to make you infer 2x. When it could in fact just be 30% or 40% over Orochi revision C.
Using a linear scale, the growth factor of the two bars lies in (1.0, 2.0]. If they use a log scale, then... hey, fun!

I wonder why nobody asked which Orochi model they refer to (lowest or highest clocks, same TDP, same # of cores).

Here's another example of applying factors to check their plausibility:
Code:
OR = 1 (maybe FX-8350, FX-8150, FX-8370, or just FX-8370e (95W))
SR = 2 (not that difficult vs. a 8370e)
Scaling factors:
SMT   = 1.25 (if this is a dedicated MT test with lots of cache friendly FP code)
noCMT = 1.25 (1/0.8)
IPC   = 1.4  (PD+40%)
Factors combined = 2.19
So if 2.0 should be reached instead, the clock needs to be ~10% lower.
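The factor arithmetic above can be checked with a short Python sketch. The dictionary keys and function names are made up for illustration, and the factor values are the estimates from the post, not measured data. Note that the exact clock reduction needed comes out at roughly 8.6%, which the post rounds to ~10%.

```python
# Scaling factors as assumed in the post above (estimates, not measurements).
FACTORS = {
    "SMT": 1.25,       # SMT gain in a dedicated MT test with cache-friendly FP code
    "noCMT": 1 / 0.8,  # removing the assumed CMT sharing penalty (1/0.8 = 1.25)
    "IPC": 1.4,        # Piledriver + 40%
}


def combined_factor(factors):
    """Multiply the individual scaling factors together."""
    result = 1.0
    for value in factors.values():
        result *= value
    return result


def required_clock_ratio(target, factors):
    """Clock ratio (new / old) at which the combined speedup equals `target`."""
    return target / combined_factor(factors)


combined = combined_factor(FACTORS)
clock = required_clock_ratio(2.0, FACTORS)
print(f"Factors combined = {combined:.2f}")
print(f"Clock ratio needed for 2.0x = {clock:.3f} ({(1 - clock) * 100:.1f}% lower)")
```

Changing any single factor (for example a smaller SMT gain) shows quickly how much clock headroom the 2x claim would leave or require.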
 
Last edited:

Magic Carpet

Diamond Member
Oct 2, 2011
3,477
231
106
So, AM4 is dual-channel only? It doesn't feel like it's the performance solution.

Which desktop/consumer socket will have 4/8 channels then?

Still PCI-E 3.0? Hmm. I don't see much value in any of that unless the price is really low. Overhyped. Seems targeted mostly for servers.

99921384415136105713.png
 
Last edited:

Magic Carpet

Diamond Member
Oct 2, 2011
3,477
231
106
It is a design for servers first of all. Desktop and mobile is purely secondary at best.
They might as well keep those slides to themselves or use internal channels to share that information with their potential clients (Google, Facebook, etc.). I feel misled.

The majority of end users are not interested in server products. I guess, that market is dead. Oh well.
 
Last edited:

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
So, AM4 is dual-channel only? It doesn't feel like it's the performance solution.

Which desktop/consumer socket will have 4/8 channels then?

Still PCI-E 3.0? Hmm. I don't see much value in any of that unless the price is really low. Overhyped. Seems targeted mostly for servers.

99921384415136105713.png
For the average consumer what matters is total cost vs. perf: MB, RAM plus CPU. Memory lanes cost, an expensive chipset costs, and you have to direct cost to where you get the most perf per $. If you can get double the cores at IB-like perf for the same cost as a Skylake i5, I think it's a pretty good offer with the new DX12 games coming up.

I don't know what AMD defines as the high-end desktop market, but if they want to sell in numbers that matter, it's not X99-class cost they should be aiming at. That's server market cost in my world, and the product needs far more validation for that market besides just more memory.
 
Last edited:

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Which desktop/consumer socket will have 4/8 channels then?


99921384415136105713.png

AM4 is the only DT / consumer platform for Zen based products. It is dual channel only. The three server platforms will support quad or octa channel, depending on the segment.

The information in this picture only applies to the top tier server products (in terms of memory channels and cores). So you do the math on how it is put together :sneaky:
 

Magic Carpet

Diamond Member
Oct 2, 2011
3,477
231
106
AM4 is the only DT / consumer platform for Zen based products. It is dual channel only. The three server platforms will support quad or octa channel, depending on the segment.

The information in this picture only applies to the top tier server products (in terms of memory channels and cores). So you do the math on how it is put together :sneaky:
Yeah, thanks for clarifying. Well, I am still upset. The regular consumer platform doesn't look that appealing to me (everything equal, Intel drivers > AMD drivers; a big thing for me) :/

We've had dual-channel for years. Might as well buy an FX 8300 combo for $180 now. The performance upgrade isn't even going to be 2x, I bet. AM4 also seems like a dead end to me. So sad. And it's not even out yet. What are they thinking?
 
Last edited: