[Techpowerup] AMD "Zen" CPU Prototypes Tested, "Meet all Expectations"


Where do you think this will land performance wise

  • Intel i7 Haswell-E 8 CORE

  • Intel i7 Skylake

  • Intel i5 Skylake

  • Just another Bulldozer attempt



Azuma Hazuki

Golden Member
Jun 18, 2012
1,532
866
131
Seems like the Zen hype train is suddenly grinding to a screeching halt...sad, huh? I wonder if NostaSeronx's predictions of 22 and 14nm Harvester/Crane cores are true...and if they'll end up being FASTER than Zen.

Ever since Orochi (BD ver1) I've been saying that AMD's problem is caching and interconnects. The cores are powerful but always hungry. Remember Hammer, where HT was a squidload faster than Intel's FSB, and what a difference that made? AMD needs to do that again. It's not going to, and its caches are probably gonna suck just as much as ever, but that's what it needs. :(
 

cytg111

Lifer
Mar 17, 2008
26,734
16,017
136
This is nuts, some admin should lock it.
No new information on Zen will come of this bickering, that's for sure.
 

cytg111

Lifer
Mar 17, 2008
26,734
16,017
136
Let me be clear. If AMD dies, VIA quits (or even Intel stops VIA due to the Chinese influence) and Rockchip and Spreadtrum will abort that Frankenstein permanently.

Only one positive from that scenario IMO and that is that Intel would be 'free' to implement something like IA64 across the board.
"Only when there is one, there can be none"
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,689
2,584
136
I do. For every moronic loon that comes in here, making retarded predictions of Zen having Skylake or higher IPC, there should be someone else laughing at and making fun of them, as Nosta was doing here.

You should take the 40% IPC at face value, (for some benchmark). IPC isn't like clock speed or power, there are no surprises. You design the core for a certain point and you can test your design on RTL simulations that provide the exact same IPC as the real thing. Everybody knows the exact cycle timings that their core has for a lot of real benchmarks years before the CPU goes into production.

The big question is, at the IPC AMD intends to hit, what kind of clock speed can they achieve? How much power does their design burn for some amount of work done? Because that is something that the simulations don't show well.
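A minimal sketch of why clock speed is the open question, assuming the usual approximation that single-thread throughput scales as IPC times clock. The 1.40x IPC ratio is the figure discussed in the thread; the three clock ratios are made-up illustrations, not leaked or measured values.

```python
def relative_performance(ipc_ratio: float, clock_ratio: float) -> float:
    """Throughput relative to a baseline core: IPC ratio times clock ratio."""
    return ipc_ratio * clock_ratio


# 1.40 is the IPC gain quoted in the thread; the clock ratios below are
# hypothetical outcomes chosen only to show the sensitivity.
for clock_ratio in (1.00, 0.85, 0.70):
    perf = relative_performance(1.40, clock_ratio)
    print(f"clock x{clock_ratio:.2f} -> performance x{perf:.2f}")
```

Under these assumptions a 30% clock deficit is enough to cancel the whole IPC gain, which is why a known IPC figure says little about final performance.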
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
My simulation puts it at less than Ivy Bridge. Scores won't be 1% but more around 30% lower in AVX2, and 2x lower in AVX512. You seem to have missed the "Native" movement. AVX512 will be used a lot more than you think. 10nm FinFETs => more chips => lower cost. The equivalent Intel product will be two full shrinks ahead. Cannonlake 8-core SoC = Zen 8-core CPU, Lisa Su: "AMD is not a low price company anymore." The SMT performance hit will be around 30+% to a maximum 50%. We are looking at a 4x pJ differential between 20nm FinFETs @ SamFoundries and 10nm FinFETs @ IntelFoundries. A 25w Cannonlake 8-core SoC can be compared to a 95w Zen 8-core CPU and beating it.

Ugh, so much unsubstantiated nonsense.

Have you even seen the Zen unit instruction assignments?

http://looncraz.net/ZenAssignments.html

It actually has some significant benefits when compared to Haswell (but also some deficits, naturally).

http://looncraz.net/HswAssignments.htm

Zen can do twice as many shifts and rotates and LEAs. It should match it in movs and branching and just about everything else of consequence.

On the FPU side, Zen can do divisions at once, whereas Haswell can only do one. It can do 50% more fadds as well.

In addition, it looks to be able to concurrently issue AGU, ALU, and FPU instructions freely, unlike Intel's shared port design (though this is a minor benefit for most workloads, it is potentially quite useful for context-switch-free SMT, assuming they went that route).

Intel has had very little luck reducing power usage below Sandy Bridge's power level. There are too many fixed costs in that regard. Cannonlake will probably be a minor improvement.

I doubt Zen will beat Haswell; I don't think AMD has much of a chance of matching Intel's cache system (which is where Intel's main advantages are derived).
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
I do. For every moronic loon that comes in here, making retarded predictions of Zen having Skylake or higher IPC, there should be someone else laughing at and making fun of them, as Nosta was doing here.

Please don't make fun of loons. :thumbsup:

Zen will certainly not have Skylake IPC. Haswell is the target, and is exactly where it will land if it has exactly 40% IPC improvement. Zen will probably not hit 5GHz, though.
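A rough sketch of the arithmetic behind "exactly where it will land", assuming per-clock performance is expressed relative to Haswell = 1.0. The function name and both baseline readings are illustrative assumptions; the result depends on whether "40%" means Haswell is 40% faster than Excavator or Excavator is 40% slower than Haswell.

```python
def zen_relative_to_haswell(xv_relative_to_haswell: float,
                            zen_gain_over_xv: float = 0.40) -> float:
    """Zen's per-clock performance relative to Haswell, given Excavator's."""
    return xv_relative_to_haswell * (1.0 + zen_gain_over_xv)


# Reading 1 (assumption): Haswell is 40% faster than XV, so XV = 1 / 1.4.
parity_case = zen_relative_to_haswell(1.0 / 1.4)

# Reading 2 (assumption): XV is 40% slower than Haswell, so XV = 0.6.
shortfall_case = zen_relative_to_haswell(0.6)

print(f"Haswell 40% faster than XV: Zen = {parity_case:.2f}x Haswell")
print(f"XV 40% slower than Haswell: Zen = {shortfall_case:.2f}x Haswell")
```

Only the first reading puts a 40% gain exactly at Haswell; the second leaves Zen about 16% short per clock.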
 

TechGod123

Member
Oct 30, 2015
94
1
0
I do. For every moronic loon that comes in here, making retarded predictions of Zen having Skylake or higher IPC, there should be someone else laughing at and making fun of them, as Nosta was doing here.

No one is expecting Skylake. We expect Haswell.
 

cytg111

Lifer
Mar 17, 2008
26,734
16,017
136
No one is expecting Skylake. We expect Haswell.

what the .. the delta from haswell to skylake, - you guys believe that
1. 40%
2. a "comical ali" - kind of statement, everything is good, we're winning the war.
equals "haswell ipc" but not "skylake ipc"? I truly don't get it.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
Of course it would be very bad, if the mostly CPU veterans who worked on Zen didn't find a way to create something cool after being given full freedom in designing it and after years of evolution of their internal tools and work processes and grown experience. :)

As if things depended exclusively on these veterans... First AMD management would have to correctly feed the market drivers to the development team to start the product, the same management that wrecked the company in the last couple of years, then these veterans would have to lead teams that needed to be funded and staffed in the middle of the biggest cash crunch in AMD's history, and then the subpar foundry will have to deliver a production node on time and on specs, something unheard of in the entire Globalfoundries history.

I wouldn't think it's bad at all if they fail again in developing a high end processor. These guys are really working under extreme duress and under extremely tight funding condition, if they fail it will be no surprise, and no shame at all.
 

DrMrLordX

Lifer
Apr 27, 2000
23,183
13,270
136
I do. For every moronic loon that comes in here, making retarded predictions of Zen having Skylake or higher IPC, there should be someone else laughing at and making fun of them, as Nosta was doing here.

What loons, and what "retarded" predictions? Nosta isn't making fun of anybody, he's running interference to get potential buyers off Zen 'cause he has a vendetta against the guy who hired Keller and was apparently instrumental in EoLing the Construction line of cores. Nosta is also publicly hinting that AMD is infringing on Apple patents by releasing Zen.

The only estimates of Zen's performance to which I've paid any attention have been:

1). Looncraz's own exhaustive examination of instruction latency (admittedly, I don't grok all of it but he's done his homework)
2). AMD's claims of 40% increased IPC over XV
3). The Stilt's fixed clockspeed benchmark results in fp workloads

Now you either believe AMD's claims or you don't. I'm cautiously optimistic. Nobody seems to understand how good AMD's current design (XV) is, given that it features low clockspeeds, only comes in 2M configurations, and is hobbled by throttling/poor treatment from most/all OEMs. A 4m/8t XV, right now, would kick some serious butt in 8-thread scenarios, assuming it could reach even Kaveri-like clockspeeds, though again, it's not clear what socket would do it justice, nor is it clear what process AMD would have used to release such a chip. It's also unclear what AMD would have done to cope with the L3 size/speed problems that bedeviled Piledriver and Bulldozer. L3-less XV would be about useless for MP scenarios, so you'd have a server-class chip restricted to 1P servers, workstations, and desktop duty. Not good.

Things could easily go wrong with Zen, such as bugs from rushing certain aspects of the design, or yield problems from the process leading to lots of flaws/low clockspeeds, or poor voltage scaling past some critical clockspeed (3.4-4.0 GHz) even with good yields, or . . . whatever. It's too early to tell. Nobody knows what it's GOING to be, but for you or anyone else to come in here, read what people are saying, and then point fingers at people and call them names even after they have openly admitted that things might not be all rosy for Zen is poor form.

Yea, kind of ironic that those making the biggest criticisms of other's "simulations", satirical or not, are the first to promote/accept the AMD simulation that showed 40% ipc gain.

What simulation? Nobody has released any simulations. And who is promoting or accepting anything? We have people looking at 40% over XV and saying "yeah, that's Sandy Bridge level right there", like they don't even know (or care) how fast XV really is, even in fp code which is classically the weakness of Construction cores. If there's anything that makes zero sense, it's people who are just being downers 'cuz Bulldozer was overhyped.

All we have, beyond some design hints, is an assurance from AMD of a 40% IPC gain. We have some XV numbers, so do the math. Either AMD is right or not. A lot is riding on that prediction, though in AMD's defense, they under-estimated Zen's IPC increase over SR. If AMD is over-estimating the IPC gain, or if they're casually ignoring clockspeed/voltage scaling problems, then Zen may flop.

Seems like the Zen hype train is suddenly grinding to a screeching halt...sad, huh? I wonder if NostaSeronx's predictions of 22 and 14nm Harvester/Crane cores are true...and if they'll end up being FASTER than Zen.

Again, please don't take Nosta seriously. How many times has he just been flat-out wrong? Has anyone even tracked the accuracy of his statements?

Mock AMD for their Bulldozer fiasco all you wish, but if anyone has a worse track record than AMD, it's Nosta. It was sort of cute/annoying when he was babbling on about AMD's latest design as if it would somehow threaten Intel, but now he's turning on the company over which he obsesses and feeding negative public sentiment that has no basis in fact.

Ever since Orochi (BD ver1) I've been saying that AMD's problem is caching and interconnects. The cores are powerful but always hungry. Remember Hammer, where HT was a squidload faster than Intel's FSB, and what a difference that made? AMD needs to do that again. It's not going to, and its caches are probably gonna suck just as much as ever, but that's what it needs. :(

Zen has completely redesigned caches. They're going from exclusive to inclusive. It's a major part of what differentiates Zen from XV and its predecessors. I agree that AMD's cache performance has been poor since forever, and ever since Intel went with IMCs on Nehalem, AMD's IMC performance has lagged behind as well. If Stilt is right and Bristol Ridge's IMC tops out at DDR4-2400, then that is an issue that seriously needs correction as well. Summit Ridge and Raven Ridge need to do better than that. DDR4 launched some time ago with DDR4-3200 modules on the open market, and DDR4-4000+ is out there in the wild right now. Not to speak of the fact that most of AMD's IMCs underperform running the same memory standard as Intel memory controllers.

I don't think HT is a problem right now, though. Maybe for MP configurations it is . . .
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,815
1,294
136
I swept through the internet looking at really really old documentation. The original Bulldozer[K9];

Cores:
4 Integer ALUs[2 Branch/Popcnt + 2 IMUL/IDIV]
3 Integer AGUs[2 Read/1 Write]
Sorted 80-entry PRF

Floating point :
4 VALUs[2 ADD/MUL + 2 ADD/MUL/DIV/SQRT]
Sorted 144-entry PRF

45nm FDSOI
21 pipeline stages
25 FO4

Sorted PRF => Banked PRFs // Sorted PRFs are really easy to put register renaming on. So, there is one feature AMD dropped when downscaling the core.
 

myocardia

Diamond Member
Jun 21, 2003
9,291
30
91
No one is expecting Skylake. We expect Haswell.

Really? Then care to explain this person who has done it multiple times, in just the past few pages? BTW, if Haswell IPC were an easy thing to have, AMD would have had it 10 years ago, as would have Intel.;)

Secondly, a 4c/8t Zen should require only a clockspeed of ~3.8 GHz to match the 4.8 GHz 6700k in R10.

and adding 40% to his Cinebench R10 numbers would put Zen ahead of Skylake per clock (assuming the same number of cores).
 

TechGod123

Member
Oct 30, 2015
94
1
0
Really? Then care to explain this person who has done it multiple times, in just the past few pages? BTW, if Haswell IPC were an easy thing to have, AMD would have had it 10 years ago, as would have Intel.;)

You've completely misunderstood me but okay.
 
Aug 11, 2008
10,451
642
126
Really? Then care to explain this person who has done it multiple times, in just the past few pages? BTW, if Haswell IPC were an easy thing to have, AMD would have had it 10 years ago, as would have Intel.;)

Yea, I am just amazed at those who say, oh, AMD can't miss Haswell IPC, because we know so much about CPUs now, and go into page after page of theoretical explanations about how it can't miss. Guess Intel must not have been privy to this information, since it took them (with many times the resources of AMD) several generations to reach Haswell IPC. But of course, AMD somehow magically will do it in one quick fell swoop.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Yea, I am just amazed at those who say, oh, AMD can't miss Haswell IPC, because we know so much about CPUs now, and go into page after page of theoretical explanations about how it can't miss. Guess Intel must not have been privy to this information, since it took them (with many times the resources of AMD) several generations to reach Haswell IPC. But of course, AMD somehow magically will do it in one quick fell swoop.

You have to understand where you begin to know where you should expect to end.

Each Excavator core, using just two ALUs and two AGUs, has higher performance than Core 2 (Penryn). AMD could not accomplish this feat with 3 ALUs and 3 AGUs in the Phenom II CPUs.

Now, take the advancements AMD has made to their cores. From the ALU and AGU improvements, branch prediction improvements, FastPath, and decode improvements, etc.... take all those things and realize that Excavator is only 40% slower than Haswell per core, per clock, even with all of the disadvantages afforded to it by having half the available computational resources - ignoring floating point for now. Now, double Excavator's execution resources to match those of Haswell.

In a perfect world, Zen would be 10~20% faster than Haswell since AMD is already squeezing relatively more performance from fewer resources. In reality, scaling with additional hardware resources doesn't work this way. Even if you have no other negative consequences from the addition of the resources, there are only so many instructions which can run at the same time. Adding one ALU would add 25% more performance in most cases, if properly utilized. Adding another will bring another 10%, with the random 20% boost in certain specific instruction mixes, and is more useful if it can do LEA and/or mov (which Zen's can).
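A minimal sketch of the diminishing returns described above, assuming the two quoted gains (about 25% for the first added ALU, about 10% for the second) compound on a 2-ALU Excavator-style baseline. Compounding rather than adding them, and the helper name, are assumptions made for illustration.

```python
def cumulative_speedup(step_gains):
    """Compound fractional gains, e.g. [0.25, 0.10] -> [1.25, 1.25 * 1.10]."""
    total = 1.0
    steps = []
    for gain in step_gains:
        total *= 1.0 + gain
        steps.append(total)
    return steps


# Figures from the post: ~25% for the third ALU, ~10% for the fourth.
# The 2-ALU baseline and the compounding model are assumptions.
alu_gains = [0.25, 0.10]
for alu_count, speedup in zip((3, 4), cumulative_speedup(alu_gains)):
    print(f"{alu_count} ALUs: x{speedup:.3f} vs. the 2-ALU baseline")
```

On those figures, doubling the ALU count yields roughly 1.375x rather than 2x, which is in the same neighborhood as the 40% claim being debated.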

From here, performance is dictated by many variables, all of which must come together perfectly. You need to be able to feed each execution unit and get the data through the core as quickly as possible. This is a complicated task. The cache system needs to be up to the task, the decoders, scheduler(s), the TLB(s), etc. all need to be up to the task.

Finally, a great deal of code has jumps, calls, etc., which means you then have to read another area of memory to begin execution. So it becomes very difficult to read in enough instructions at once that will all be executed. So you decode 12 uops, execute 8, and toss the other 4 since you have to read a new memory location (which you will hopefully have cached). AMD has shown that their front-end is actually quite strong. They used a single Zen-like front-end on Bulldozer and Piledriver to great effect, powering four AGUs, four ALUs, and a FlexFPU. It can handle Zen's 4+2+4 with little more than FastPath optimizations. It can also handle two threads concurrently very well, something which I suspect AMD will have leaned upon for their SMT implementation.

From what little we know, AMD has done the most central of all these changes (double ALUs, extra FPU pipes), and has stated that they have done the second most critical component to performance that they don't already have solved in existing products: cache latency. Their stated figure of 40% IPC is certainly achievable, and everything we've officially seen from Zen indicates that they are going the EASIEST route to achieve this goal.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
You have to understand where you begin to know where you should expect to end.

People here forget about that and others choose to ignore it.

This is why some people here find it extremely hard to believe that AMD can increase IPC by 40% over Excavator: they forget how different the Zen architecture is from Bulldozer.

Others choose to ignore that just because.........
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
Their stated figure of 40% IPC is certainly achievable, and everything we've officially seen from Zen indicates that they are going the EASIEST route to achieve this goal...

... and that is where Zen+ comes in.
All fits.
 

DrMrLordX

Lifer
Apr 27, 2000
23,183
13,270
136
Really? Then care to explain this person who has done it multiple times, in just the past few pages? BTW, if Haswell IPC were an easy thing to have, AMD would have had it 10 years ago, as would have Intel.;)

I'll explain myself thank you very much.

Yea, I am just amazed at those who say, oh, AMD can't miss Haswell IPC, because we know so much about CPUs now, and go into page after page of theoretical explanations about how it can't miss. Guess Intel must not have been privy to this information, since it took them (with many times the resources of AMD) several generations to reach Haswell IPC. But of course, AMD somehow magically will do it in one quick fell swoop.

Seriously? AMD already has Haswell IPC. What are you going on about?

Here's a 4.8 GHz Haswell running Cinebench R10:

intel_i74970k-38.jpg


And here's that 3.4 GHz Carrizo running the same thing:

052237f1_CZ-CB10-3.4G-Static.png


So 4c/8t Haswell @ 4.8 GHz manages a score of 36559. If AMD could make a 4m/8t XV @ 4.8 GHz - which they CAN NOT, and which is the primary reason why Intel is kicking their ass - they would score 37118. Throw stuff like AVX2 into the mix, and Haswell pulls ahead of XV running xOP code. For example, a 4.5 GHz 4790k can clear y-cruncher 500m in about 56s, while a 4.8 GHz 4m/8t XV would probably take ~100s to do 512m. So, advantage Haswell.

XV could already do the job, if they actually had the ability to make a market-ready chip that made sense for a large enough customer base for people to want to pay for it. They can't and won't bother with trying to adapt XV to 22nm anything or 14nm LPP. Zen is 40% faster, so why bother?

Of course, if Zen ISN'T 40% faster, then it'll be a fiasco. It'll also be quite a problem if clockspeeds are too low. But if you want Haswell IPC, AMD has it right now. They just don't have Haswell levels of performance, and Bristol Ridge sure isn't going to get us there either.
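A small sketch of the per-clock comparison implied above, using only the two Cinebench R10 figures quoted in this post. It assumes scores scale linearly with clock speed, and the 37118 figure is the post's own projection for a hypothetical 4m/8t part, not a measurement. Both numbers are 8-thread totals, so this says nothing about single-thread performance per core.

```python
def points_per_ghz(score: float, clock_ghz: float) -> float:
    """Normalize a multithreaded benchmark score by clock speed."""
    return score / clock_ghz


# Scores as quoted in the post; the XV figure is a projection to 4.8 GHz
# for a hypothetical 4m/8t chip, assuming linear clock scaling.
haswell_4c8t = points_per_ghz(36559, 4.8)
xv_4m8t_projected = points_per_ghz(37118, 4.8)

ratio = xv_4m8t_projected / haswell_4c8t
print(f"Haswell 4c/8t:      {haswell_4c8t:.0f} points/GHz")
print(f"XV 4m/8t projected: {xv_4m8t_projected:.0f} points/GHz")
print(f"XV / Haswell:       {ratio:.3f}")
```

On these numbers the projected 8-thread throughput per clock is within about 2% of Haswell, which is the sense in which the post speaks of "Haswell IPC".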
 

Xpage

Senior member
Jun 22, 2005
459
15
81
Yea, I am just amazed at those who say, oh, AMD can't miss Haswell IPC, because we know so much about CPUs now, and go into page after page of theoretical explanations about how it can't miss. Guess Intel must not have been privy to this information, since it took them (with many times the resources of AMD) several generations to reach Haswell IPC. But of course, AMD somehow magically will do it in one quick fell swoop.

reverse engineering SB should help immensely, they have had years to do so
 

coercitiv

Diamond Member
Jan 24, 2014
7,464
17,829
136
And here's that 3.4 GHz Carrizo running the same thing [snipped image]

So 4c/8t Haswell @ 4.8 GHz manages a score of 36559. If AMD could make a 4m/8t XV @ 4.8 GHz - which they CAN NOT, and which is the primary reason why Intel is kicking their ass - they would score 37118. Throw stuff like AVX2 into the mix, and Haswell pulls ahead of XV running xOP code. For example, a 4.5 GHz 4790k can clear y-cruncher 500m in about 56s, while a 4.8 GHz 4m/8t XV would probably take ~100s to do 512m. So, advantage Haswell.
Here's CB10 on 4c/8t Haswell running at 3.4Ghz.

WhwzINr.png
 

NTMBK

Lifer
Nov 14, 2011
10,520
6,037
136
What are YOU talking about? XV isn't even close to Haswell in ST perf/clock, as coercitiv showed.

He's comparing an AMD module to an Intel core... kind of a dubious comparison, as it ignores the massive single thread performance deficit.