New Zen microarchitecture details


krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
It is a design for servers first of all. Desktop and mobile are purely secondary at best.
They say so. If it weren't for cloud and the huge consolidation around Facebook, Google, and Amazon, IMO they wouldn't stand a chance entering that market. Seriously, Intel spends more R&D on packaging than AMD's entire CPU R&D budget and has huge technical sales support. That counts for something unless you can do it yourself like the monster players.
But perhaps Zen is leaner and smaller than we think and therefore also a good fit for laptops and consoles. They didn't bring a big fat FPU, so it seems to me there are more possibilities. It's no K7 here.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Yeah, thanks for clarifying. Well, I am still upset. The regular consumer platform doesn't look that appealing to me (everything equal, Intel drivers > AMD drivers; a big thing for me) :/

We've had dual-channel for years. Might as well buy an FX 8300 combo for $180 now. The performance upgrade isn't even going to be 2x, I bet. AM4 also seems like a dead end to me. So sad. And it's not even out yet. What are they thinking?

No need to be upset about the memory. The bandwidth will be more than sufficient for a CPU :)
 

Magic Carpet

Diamond Member
Oct 2, 2011
3,477
231
106
No need to be upset about the memory. The bandwidth will be more than sufficient for a CPU :)
Well, I expected a huge leap in everything, possibly to rival the X99 platform. Now it seems it might just improve on the performance-per-watt front compared to their older tech. Oh well. Will it at least match a 4C/8T Kaby Lake?

You guys seem to be knowledgeable enough, discussing the ins and outs of the upcoming architecture.
 

guskline

Diamond Member
Apr 17, 2006
5,338
476
126
Ever heard the story about the boy who cried wolf?

Unlike Intel or nVidia, AMD has made several overly optimistic or even outright false statements regarding the characteristics or the performance of their products.

Also, some of their recent claims are extremely bold, and in order for them to come true, a whole page worth of "notes" (conditions) is required to cover their behinds from lawsuits.

nVidia screwed up with GTX 970 and that's why I will be paying extra attention to the information and the claimed specs of every single harvested SKU they release from now on. Just as an example.

Excellent post. May I add that I am extremely impressed with Dr. Su and her professional demeanor (I wonder if she had some words "in private" with Joe Macri about his over-the-top "overclocker's dream" comments about Fiji).

I just looked at the top-end AMD desktop processor, the 9590: it is $225, runs hot, and uses an old platform.

Zen, especially the top-end one, won't be cheap, but I suspect it will be a worthy and relevant successor.

I thought the Zen brand was meant to cover the spectrum of AMD CPU needs from servers to laptops. The divergence will be APU vs. desktop/server.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
More IPC claim craziness:
Feb 2016 pres: still 40% uplift, with footnote:
Based on internal AMD estimates for “Zen” x86 CPU core compared to “Excavator” x86 CPU core.

Now there is this at Computerbase as an answer to the reddit/current IR slide:
[Image: 2-630.3336223149.png]

(I instantly saw the different font sizes and placing)
http://www.computerbase.de/2016-05/amd-prozessor-bristol-ridge-zen/
 

majord

Senior member
Jul 26, 2015
433
523
136
Looked at the Cinebench R15 chart AMD provided in their PDF.

From the notes you can see that they used FX-7500 (Kaveri), FX-8800P (Carrizo) and FX-9800P ("Bristol Ridge") in the comparison. FX-7500 scores 179 in R15 MT (pclab.pl) while FX-8800P scores 234 (my result). For that part the chart is absolutely correct (Kaveri = 100%, Carrizo = 130%, "Bristol Ridge" = 150%). However, between Carrizo and "Bristol Ridge" things get interesting. According to the chart, "Bristol Ridge" should score 15.4% higher than Carrizo, i.e. 270 points in the Cinebench R15 MT test. The chip would need to run at ~3200MHz in order to score 270 points...

FX-9800P has the same TDP as FX-8800P, 15W / 25W (base / boost), however it doesn't support a cTDP > TDP configuration. Since Cinebench R15 takes less than 200 seconds to complete in MT mode, both chips will have the 25W STAPM limit available throughout the test. With a 25W power budget Carrizo is able to sustain a <2700MHz frequency in Cinebench R15. FX-9800P has higher base and boost frequencies than FX-8800P, however the maximum frequencies don't matter when the effective frequency is limited by the power budget. So where does Bristol Ridge find the additional power worth 500MHz, since the silicon itself is identical? FX-9800P also runs at higher voltages than FX-8800P does.

"50% performance-per-watt generational uplift" would also suggest that AMD claims that Bristol Ridge (7th generation) has 50% higher performance per watt than Carrizo (6th generation), which sounds pretty bold to me. Should be pretty hard to achieve, unless Carrizo has some serious errata which is yet to reveal itself...
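The chart arithmetic in the quoted analysis can be checked with a few lines of Python. This is only a sketch: the scores and chart percentages are the ones quoted above, the variable names are made up for illustration, and the clock estimate assumes the Cinebench score scales linearly with sustained frequency, which is a simplification.

```python
# Cinebench R15 MT scores quoted in the post.
kaveri_score = 179    # FX-7500 (pclab.pl result)
carrizo_score = 234   # FX-8800P (poster's own result)

# Relative positions on AMD's chart, with Kaveri = 100%.
chart = {"Kaveri": 100, "Carrizo": 130, "Bristol Ridge": 150}

# Measured Carrizo/Kaveri ratio, to compare against the chart's 130%.
carrizo_vs_kaveri = carrizo_score / kaveri_score

# Uplift the chart implies for Bristol Ridge over Carrizo.
br_vs_carrizo = chart["Bristol Ridge"] / chart["Carrizo"]

# Score Bristol Ridge would need in order to match the chart.
implied_br_score = carrizo_score * br_vs_carrizo

# ASSUMPTION: score scales linearly with sustained clock.
# 2700 MHz is the upper bound the post gives for Carrizo at 25 W.
carrizo_sustained_mhz = 2700
implied_br_mhz = carrizo_sustained_mhz * br_vs_carrizo

print(f"Carrizo vs Kaveri (measured): {carrizo_vs_kaveri:.3f}")
print(f"Bristol Ridge vs Carrizo (chart): {br_vs_carrizo:.3f}")
print(f"Implied Bristol Ridge score: {implied_br_score:.0f}")
print(f"Implied sustained clock: {implied_br_mhz:.0f} MHz")
```

Under the linear assumption the required clock comes out a bit above 3100 MHz, so the ~3200MHz figure in the post is higher than this simple estimate.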

Consider the different RAM config, probably worth 1%, plus 1 or 2% of BS, and you're down to, say, a 12-13% higher frequency required to match the claims closely enough. That leaves you closer to 3GHz (or 300MHz higher).

Given it's been a year since Carrizo was released, I don't think it's too far-fetched to think they can squeeze an extra 300MHz out of it, with further refined power management on the new platform plus a respin.

Back in the good old SOI days, we used to see these sort of improvements every 6mths from just a Metal Rev! :sneaky:


Anyway, I'm more curious about the lack of mention of BR on AM4. It's vaguely mentioned under the PC section, but that's it.

Where is AM4? :/
 
Mar 10, 2006
11,715
2,012
126

deasd

Senior member
Dec 31, 2013
513
724
136
ROFL, seems they modified the slide quietly. No change of expectation if this slide is final, nothing to see here, move ahead.
 
Mar 10, 2006
11,715
2,012
126
ROFL, seems they modified the slide quietly. No change of expectation if this slide is final, nothing to see here, move ahead.

It's exactly what some of us were saying though. AMD was claiming "over 40% compared to previous generation" but now when the comparison is explicitly made relative to XV, it's the same 40% number from a while ago.
 

inf64

Diamond Member
Mar 11, 2011
3,685
3,957
136
Few things to notice regarding that slide:
In the first version we had platform names matched with "Significant performance leap expected". So Orochi 4M/8T (1st-gen Bulldozer, presumably FX-8150 at 3.6/4.2GHz) vs. Summit Ridge at an unknown clock (presumably 8C/16T with a low-3GHz base and mid-3GHz boost).

In the second version we have "Significant performance leap expected - 40% IPC vs EX" figure matched with core codenames. So Excavator core vs Zen core. The scaling is a bit changed but still does not correlate with 40% figure. It looks more like close to 2x improvement.

If Zen is ~40% better on average than the EX core in ST tasks (IPC), then it should be ~55-60% better than PD. When you run a good 8T benchmark/workload on an FX-8350 you get around 80% scaling due to the sharing penalty, while on Zen you should get a ~20% boost due to SMT (on an 8C/16T part). So "in theory" an 8C/16T Zen at the same clock as the 8350 should score ~2.2x higher. Since PD has around 10% better IPC than BD, it seems that Orochi would be ~2.4-2.5x slower. Since Orochi is a ~3.8-4GHz chip in a lot of MT workloads, a ~3GHz 8C/16T Zen should be ~2x faster according to the above "math" :D.
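The back-of-the-envelope estimate above can be written out as a small Python sketch. Everything here is an assumption taken from the post's rough figures (IPC gains, 80% CMT scaling, 20% SMT boost, clocks); the function name and the simple multiplicative model are made up for illustration and are not a real performance model.

```python
def relative_mt_throughput(ipc, mt_factor, clock_ghz, cores=8):
    """Toy model: multithreaded throughput as a plain product of factors."""
    return ipc * mt_factor * clock_ghz * cores

# IPC is expressed relative to Piledriver (PD = 1.0).
PD_IPC = 1.0
BD_IPC = PD_IPC / 1.10   # post: PD has ~10% better IPC than BD
ZEN_IPC = 1.55           # post: Zen ~55-60% over PD (low end used here)

CMT_SCALING = 0.8        # post: ~80% scaling with 8 threads on FX-8350
SMT_BOOST = 1.2          # post: ~20% boost from SMT on an 8C/16T part

# Zen vs Piledriver at the same clock (4.0 GHz picked arbitrarily).
zen_vs_pd = (relative_mt_throughput(ZEN_IPC, SMT_BOOST, 4.0)
             / relative_mt_throughput(PD_IPC, CMT_SCALING, 4.0))

# ~3 GHz Zen vs Orochi (Bulldozer) at ~3.9 GHz, the midpoint of 3.8-4 GHz.
zen3_vs_orochi = (relative_mt_throughput(ZEN_IPC, SMT_BOOST, 3.0)
                  / relative_mt_throughput(BD_IPC, CMT_SCALING, 3.9))

print(f"Zen vs PD, same clock: {zen_vs_pd:.2f}x")
print(f"3.0 GHz Zen vs 3.9 GHz Orochi: {zen3_vs_orochi:.2f}x")
```

With these inputs the same-clock product lands a little above the ~2.2x quoted in the post, while the clock-adjusted comparison against Orochi comes out close to the ~2x the post arrives at.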

I still don't know how AMD got 40% on that "corrected" chart since it looks more like 80% better :S. They need to hire some better Excel experts :D
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
I think it's funny we have a 1300+ post thread where the only conclusion is "wait for available silicon".

Personally I was tired of waiting and already bought something else.
 

nismotigerwvu

Golden Member
May 13, 2004
1,568
33
91
I read the number differently. Even for Phenom II, which was improved: "Would you rather wait 15 cycles for an L2 of 512KB, or 15 cycles for 2 times 6MB on the Core 2?"
Take K7: the tacked-on L2 was damn slow vs. the P3's on-die L2. Well, the K7 was stomping all over the P3 through brute force, but the cache was slooooow. Even the XP didn't come up to speed.
I always assumed it was due not only to the architecture but also to Intel mastering process better in many ways?

Did you look at the articles posted? Phenom II didn't change the latency at all, still 3 cycles for L1 and 15 cycles for L2 which coincidentally was the same for Yorkfield as well. Sure, Nehalem provided an improved L2 but it also needed another cycle for L1 (offset by the jump in clockrate though).

On the Coppermine Thunderbird comparison they are literally neck and neck both in terms of average access in cycles and time. Sure each side had their own particular advantages (Coppermine was faster to L2 on a cycle by cycle basis, but Thunderbird had a smarter cache and had lower miss rates to offset this).

I'll cut some slack since we're talking about the finer details of 15+ year old tech, but only the Athlon "classic" had off-chip cache, and from Thunderbird on it was full-speed on-chip. Regardless, the data suggests that the cache issues are really only a construction core issue and not a "30 year" issue for AMD. Cache density, on the other hand, has been an issue since the K8 years, where AMD has demonstrably needed more transistors and die area for equal amounts.
 

SAAA

Senior member
May 14, 2014
541
126
116
...I still don't know how AMD got 40% on that "corrected" chart since it looks more like 80% better :S. They need to hire some better Excel experts :D

Well that 20% is a virtual Excavator octa vs original Bulldozer at the same speed (?) so the remaining 80% is all due to SMT and IPC uplift minus or plus any clockspeed change:

1.2 (Excavator) * 1.4 (IPC) * 1.2 (SMT) * 1 (same clocks)= ~2 (Zen)

But for example SMT could give 30% more and then you have a little regression in clockspeed or vice-versa.

Btw I totally ignored CMT regression that should be very little on Excavator anyway.
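The multiplication above, and the SMT-versus-clock trade-off mentioned after it, can be sketched in Python. The factor values are the post's own rough guesses; the function and parameter names are hypothetical and used only for illustration.

```python
def zen_vs_bulldozer(xv_gain=1.2, ipc_gain=1.4, smt_gain=1.2, clock_ratio=1.0):
    """Multiply the post's factors: Excavator over Bulldozer,
    Zen IPC over Excavator, SMT uplift, and relative clock speed."""
    return xv_gain * ipc_gain * smt_gain * clock_ratio

# The post's baseline case: same clocks, 20% from SMT.
baseline = zen_vs_bulldozer()

# If SMT gave 30% instead, how much clock could be given up
# while still ending at the same overall result?
clock_ratio_needed = baseline / zen_vs_bulldozer(smt_gain=1.3)

print(f"Baseline Zen vs Bulldozer: {baseline:.3f}x")
print(f"Clock ratio needed with 30% SMT: {clock_ratio_needed:.3f}")
```

The baseline works out to roughly 2x, and a 30% SMT gain would leave room for a clock regression of about 8% at the same overall result.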

On Bulldozer it was almost an average of module/cores amount for single vs multithread results... that also makes sense and would explain why some scores are great, it acts more like 8 cores on integer code, then some are bad even against i5s.
This fact alone still makes me wish that AMD released just once a newer FX with 4 modules, just to keep up with slowly but steadily improving i7s. Or things like this happen:

I think it's funny we have a 1300+ post thread where the only conclusion is "wait for available silicon".

Personally I was tired of waiting and already bought something else.
 

sirmo

Golden Member
Oct 10, 2011
1,012
384
136
I still don't know how AMD got 40% on that "corrected" chart since it looks more like 80% better :S. They need to hire some better Excel experts :D
They are marketing slides. Also, the chart on the right doesn't have a scale, and the scale is very important in a graph. The performance bar looks like it's not starting at 0 "performance". If they say 40% over Excavator, then the chart is irrelevant.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Ever heard the story about the boy who cried wolf?

Unlike Intel or nVidia, AMD has made several overly optimistic or even outright false statements regarding the characteristics or the performance of their products.

And then after you write this they make a silent change to an already published slide deck. You must be feeling deja-vu at this point.
 

sirmo

Golden Member
Oct 10, 2011
1,012
384
136
And then after you write this they make a silent change to an already published slide deck. You must be feeling deja-vu at this point.
The change might be to address the Bristol Ridge improvement. I certainly didn't expect it over Carrizo. And perhaps neither did AMD.

AVFS is a pretty exciting tech, and my guess is this is mainly where they are getting their gains from in Carrizo and Bristol Ridge.

Also as someone above mentioned, I believe Lisa Su is the least hype stirring CEO AMD has ever had. I am quite impressed with her demeanor on investor calls.. we'll see.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
Did you look at the articles posted? Phenom II didn't change the latency at all, still 3 cycles for L1 and 15 cycles for L2 which coincidentally was the same for Yorkfield as well. Sure, Nehalem provided an improved L2 but it also needed another cycle for L1 (offset by the jump in clockrate though).

On the Coppermine Thunderbird comparison they are literally neck and neck both in terms of average access in cycles and time. Sure each side had their own particular advantages (Coppermine was faster to L2 on a cycle by cycle basis, but Thunderbird had a smarter cache and had lower miss rates to offset this).

I'll cut some slack since we're talking about the finer details of 15+ year old tech, but only the Athlon "classic" had off-chip cache, and from Thunderbird on it was full-speed on-chip. Regardless, the data suggests that the cache issues are really only a construction core issue and not a "30 year" issue for AMD. Cache density, on the other hand, has been an issue since the K8 years, where AMD has demonstrably needed more transistors and die area for equal amounts.
Yes.

"A trip down memory lane will cost you 107 ns on an original Phenom processor, 100 ns on an Athlon X2, and now only 95 ns on a Phenom II. "

"Compared to the Athlon X2’s 20 cycle L2, Phenom II looks pretty good, but now look at Penryn. Penryn’s 15 cycle L2 is the same speed as Phenom’s L2, but it’s 2-6x larger. "

And as usual not quite there:
"The original Phenom had the right idea, it needed a larger L3. Core i7 perfected that idea, and Phenom II took a step towards that."

Phenom II was behind both Penryn, which was comparable, and the Core cache design that was built for quad-core. That's how I read the article.

Of course, the article also said that AMD would go for a fast L2 in BD. It ended up at a catastrophic 18 cycles, as I recall, and 65 for the L3. That's not 16 years ago. Eventually they even skipped the L3. That's one way to solve the issues :)
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
I think it's funny we have a 1300+ post thread where the only conclusion is "wait for available silicon".

Personally I was tired of waiting and already bought something else.
I think, many here are discussing for the fun, not only to make buying decisions (which most wouldn't do before seeing benchmarks anyway). Like with sports, nobody says: Wait till after the season or finals and you'll get to know the best team.

Edit:
Just noticed that XV now is at 2.3 grid steps instead of 2. If they stand for 0.5, then this means 1.15. But Zen doesn't exactly land at the 4th line. So are they plotting some benchmark results or just roughly eyeballing the bars?
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
I think, many here are discussing for the fun, not only to make buying decisions (which most wouldn't do before seeing benchmarks anyway). Like with sports, nobody says: Wait till after the season or finals and you'll get to know the best team.

Edit:
Just noticed that XV now is at 2.3 grid steps instead of 2. If they stand for 0.5, then this means 1.15. But Zen doesn't exactly land at the 4th line. So are they plotting some benchmark results or just roughly eyeballing the bars?
Ask the random guy on upwork who made it.
Now half the internet is discussing the hell out of it.

Seriously i think this discussion of marketing graphs is way overdone.
 

SAAA

Senior member
May 14, 2014
541
126
116
Ask the random guy on upwork who made it.
Now half the internet is discussing the hell out of it.

Seriously i think this discussion of marketing graphs is way overdone.

Not totally overdone: on the SemiAccurate forums they noticed a possible Zen die shot in a modified slide:

[Image: 3-1080.266527377.png]


Lower left, there was an apu shot but now it doesn't look like anything known. It's quite large with possibly 8 cores/similar structures → first look at Zen?! (seeing zero die wasted on iGPU is so... sexy :wub:)
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,743
3,075
136
Well, it looks like it matches some of the info floating around on LinkedIn, like distributed memory controllers (1 per 4-core block, top left and bottom right).
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
The wafer reminds me of Bulldozer/Piledriver.

It's a PR slide; they are not taking pictures of a brand new wafer to show it. They just pull up the standard marketing graphics material and add it.

Even their GPU dies are what, old VLIW GPU in slides.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,743
3,075
136
The wafer reminds me of Bulldozer/Piledriver.

It's a PR slide; they are not taking pictures of a brand new wafer to show it. They just pull up the standard marketing graphics material and add it.

Even their GPU dies are what, old VLIW GPU in slides.

it looks nothing like orochi.......
 

senseamp

Lifer
Feb 5, 2006
35,785
6,187
126
I hope Zen is on par with Intel, or at least close enough to wage a price war. CPUs are commodity chips, and it would be good to see them priced accordingly.