New Zen microarchitecture details


AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
Personally, as long as we don't have personal attacks, flame bait, and trolling, I don't have any problem with people expressing their beliefs about ZEN performance, price and power consumption.

This is a democratic forum after all, and we can all express our opinions (within forum rules) ;)
 
Aug 11, 2008
10,451
642
126
The leaks come from sources of very good quality. If Fottemberg's assumptions about Zen clock speeds prove correct, Zen will be a real Intel killer.

If a frog had wings, he could fly. Two huge assumptions here, and in every Zen thread apparently, even one started by an admitted April Fool's prank.

One is that AMD will match or come close to intel performance. The second is that *if* that performance is even close to the hype, that AMD will sell it at prices much lower than intel. I think the second assumption is even less likely than the first, especially initially, when Zen will be in short supply.
 

Maxima1

Diamond Member
Jan 15, 2013
3,515
756
146
Does it really make sense that AMD will be able to match/BEAT Intel across the board as some are claiming in light of the conditions outlined?

Yes. Keller is a god. XD
 

DrMrLordX

Lifer
Apr 27, 2000
21,620
10,829
136
Why do you believe ZEN will be in short supply at launch? Genuine question, no bait intended.

AMD has said that 2016 will see limited release with shipping "for revenue" in 2017. Do recall that some have chosen to interpret that as Zen not shipping in 2016 at all . . . not that I agree with them, mind you.
 

CentroX

Senior member
Apr 3, 2016
351
152
116
AMD has said that 2016 will see limited release with shipping "for revenue" in 2017. Do recall that some have chosen to interpret that as Zen not shipping in 2016 at all . . . not that I agree with them, mind you.

Probably like the Fury X, which had very limited supply for a few months.
 

MajinCry

Platinum Member
Jul 28, 2015
2,495
571
136
AMD's current CPUs (and older) have a huge draw call deficit in comparison to intel's equivalents. While number-crunching performance may be the same, the backwards-compatible chipsets (legacy support, no modern performance techniques, etc) slay AMD in draw calls with traditional APIs (pre-Mantle).

To quote Boris Vorontsov:

But it depends on many things; try to find tests run with an AMD CPU to compare which is faster.
Vishera has exactly the same bottleneck (it's the same old AMD with a higher frequency); it almost doesn't scale with CPU frequency on AMD past 3 GHz. The problem is not in the CPU itself but in the platform, CPU->motherboard->PCI-E: the command flow is much slower than on Intel. Google "draw calls bottleneck"; it's a huge pain in the ass for developers (most real-time strategy games suffer from it).

http://enbseries.enbdev.com/forum/viewtopic.php?f=2&t=1724&start=170#p24153


Marcurios
Sometime in 2012 we ran a test here on the forum, and all AMD CPUs got 6-8.5 fps, while Intel got 18-30 (23 on average). I don't remember if a CPU architecturally similar to the FX-8320 was in there. About laptops I don't know; it depends mostly on the motherboard. In 2009 my producer tested a heavy scene (in draw calls, I mean) on a laptop with a gf520 video card and some Intel CPU (I don't remember which); it was faster than my Athlon X3 2.9 GHz with a gf9600. But that's for a DX9 game; DX11 has improvements for this, but the draw calls (DIPs) problem is still at the top of the issues in games.

http://www.enbseries.enbdev.com/forum/viewtopic.php?f=2&t=4666&start=150#p66273

So there's definitely much AMD can do to increase gaming performance several times over, whilst only increasing general performance by ~40% over Excavator.

Whilst we are getting game engines with D3D12 and Vulkan, decent performance for older games would be most welcome; Skyrim sure as hell ain't going to disappear.
 

Doom2pro

Senior member
Apr 2, 2016
587
619
106
Alright then. I'll give you objective reasons why I decided to not wait for Zen.

Why are you comparing Zen's 40% IPC increase to Piledriver? The 40% increase is over Excavator, it's process independent, and AMD has been saying lately it's more likely 45% now.

Unless AMD is lying out their teeth, Zen should be WAY less than 10% behind Skylake...
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
Why are you comparing Zen's 40% IPC increase to Piledriver? The 40% increase is over Excavator, it's process independent, and AMD has been saying lately it's more likely 45% now.

Unless AMD is lying out their teeth, Zen should be WAY more than 10% behind Skylake...
As a rule of thumb, always cut AMD's projections in half. It'll be more like 20% at best.
 

Doom2pro

Senior member
Apr 2, 2016
587
619
106
Dang, I feel bad for RussianSensation. Guy makes a good purchase, explains his purchase, and he gets ripped on by fans of a certain company.

Sad.

He's getting ripped because he's making inaccurate comparisons involving an architecture that hasn't even been released yet, comparing its IPC boost to the wrong architecture, and ignoring that this IPC boost is process independent, which leaves out all the benefits of a shrink to 14nm. Then he's using this already flawed logic to boldly assume Zen won't be able to compete.

He asked for it and he's getting it.

Funny.
 

MajinCry

Platinum Member
Jul 28, 2015
2,495
571
136
Why are you comparing Zen's 40% IPC increase to Piledriver? The 40% increase is over Excavator, it's process independent, and AMD has been saying lately it's more likely 45% now.

Unless AMD is lying out their teeth, Zen should be WAY more than 10% behind Skylake...

That's a bit much. Ain't the CPU situation as follows? :

Piledriver == Phenom II @ equal clocks
Phenom II @ 3.4ghz == Nehalem @ 2.7ghz (700mhz deficit)

Sandybridge = Nehalem + 40% performance
Sandybridge @ 3ghz == Nehalem @ 4.2ghz

Steamroller = Piledriver * 1.05 performance
Excavator = Steamroller * 1.05 performance
Excavator @ 3.4ghz == Piledriver @ 3.7ghz
Excavator == Nehalem @ 3.0ghz (400mhz deficit)

That places Zen squarely at Sandybridge performance, at the same clocks. Coupled with Zen having a newer node, we're going to have less power consumption, higher clocks and more cores.

I'd wager that Zen is going to be around Ivybridge performance, all things considered. And if AMD has rectified the draw call deficit, Zen's going to be pretty fancy.
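
A quick way to sanity-check that chain is to multiply the per-clock ratios through. This is only a rough sketch: every ratio below is one of the forum estimates from this thread, not a measurement, and the variable names are made up for illustration.

```python
# Per-clock performance relative to Nehalem (Nehalem = 1.0).
# All ratios are estimates quoted in the thread, not benchmark results.
nehalem = 1.0
phenom_ii = nehalem * 2.7 / 3.4   # Phenom II @ 3.4 GHz == Nehalem @ 2.7 GHz
piledriver = phenom_ii            # Piledriver == Phenom II at equal clocks
steamroller = piledriver * 1.05   # Steamroller = Piledriver * 1.05
excavator = steamroller * 1.05    # Excavator = Steamroller * 1.05
zen = excavator * 1.40            # AMD's claimed +40% over Excavator

# Sandy Bridge per clock under the figures that appear in this thread:
# +40% over Nehalem, or only +10-15% over Nehalem.
sandy_bridge_estimates = {"+40%": 1.40, "+15%": 1.15, "+10%": 1.10}

print(f"Excavator @ 3.4 GHz ~ Nehalem @ {excavator * 3.4:.2f} GHz")
print(f"Zen per clock vs Nehalem: {zen:.3f}")
for label, sandy_bridge in sandy_bridge_estimates.items():
    print(f"Zen vs Sandy Bridge ({label} over Nehalem): {zen / sandy_bridge:.2f}x")
```

With these inputs Zen lands at roughly 1.23x Nehalem per clock. That is about 12% short of Sandy Bridge if the +40% figure is read as a per-clock number, and roughly 7-11% ahead if Sandy Bridge is only 10-15% faster per clock.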
 
Aug 11, 2008
10,451
642
126
Why do you believe ZEN will be in short supply at launch? Genuine question, no bait intended.

New process, unknown yields, new architecture, some chips diverted to servers. Pretty much every new product launch recently has had low availability initially. I would also assume if the product is as good as projected, they would want to launch as soon as possible, even if supply is not good.

Edit: and as another poster said, even AMD said availability would be limited at first and ramping up in 2017.
 

CentroX

Senior member
Apr 3, 2016
351
152
116
That's a bit much. Ain't the CPU situation as follows? :

Piledriver == Phenom II @ equal clocks
Phenom II @ 3.4ghz == Nehalem @ 2.7ghz (700mhz deficit)

Sandybridge = Nehalem + 40% performance
Sandybridge @ 3ghz == Nehalem @ 4.2ghz

Steamroller = Piledriver * 1.05 performance
Excavator = Steamroller * 1.05 performance
Excavator @ 3.4ghz == Piledriver @ 3.7ghz
Excavator == Nehalem @ 3.0ghz (400mhz deficit)

That places Zen squarely at Sandybridge performance, at the same clocks. Coupled with Zen having a newer node, we're going to have less power consumption, higher clocks and more cores.

I'd wager that Zen is going to be around Ivybridge performance, all things considered. And if AMD has rectified the draw call deficit, Zen's going to be pretty fancy.

It's funny because AMD was the driving force of fixing the draw call deficit with Mantle.
 

MajinCry

Platinum Member
Jul 28, 2015
2,495
571
136
It's funny because AMD was the driving force of fixing draw call deficit with mantle.

Eh...Even Intel has a hard time with draw calls. You can only throw so much horsepower at bloated, and horribly inefficient code.

I'm lookin' forward to when LOD is rendered useless, as far as CPU performance is concerned. We'll finally be able to have grand vistas, chock full of detail, and we ain't going to see a slideshow due to the CPU being overwhelmed.

Nothing wrong with console-level API efficiency. Hell, the Gamecube was faster at drawing objects than it was at culling them. With D3D9-11, it's the reverse; drawing objects is bloody slow.
 

MajinCry

Platinum Member
Jul 28, 2015
2,495
571
136
Don't think so; SB has no more than 10-15% higher IPC than Nehalem.

Admittedly, it was back in the early days of Sandybridge, when I subsequently upgraded to a Phenom II, that I read about the performance claims 'n' whatnot.

I remember being told by lads over on Tomshardware that Sandybridge is ~40% faster than Nehalem, but that could either be me mis-remembering, or being lied to.

Either way, even at the worst estimate, Zen's going to be pretty good. I just hope the draw calls get worked on, otherwise I'll just resort to skiving somebody's ancient 8 core Sandybridge xeon, when I've saved enough coppers.
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
New process, unknown yields, new architecture, some chips diverted to servers. Pretty much every new product launch recently has had low availability initially. I would also assume if the product is as good as projected, they would want to launch as soon as possible, even if supply is not good.

Edit: and as another poster said, even AMD said availability would be limited at first and ramping up in 2017.

Agreed, but I believe the ZEN die will not be as big as BD's was. I'm expecting no more than 200mm2. And since they will already have been using 14nm LPP for 6 or more months, manufacturing 120-232mm2 GPU dies on the same process at the same fabs they will use for ZEN CPUs, they will have a nice head start on yields.

They also have to launch ZEN in such a way as to make a big impact vs its Intel counterparts. So I don't believe lower availability in 2016 will affect its price.
Don't get me wrong, I'm expecting top models to carry a premium; I'm just not expecting the same premium Intel is asking.
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
Admittedly, it was back in the early days of Sandybridge, when I subsequently upgraded to a Phenom II, that I read about the performance claims 'n' whatnot.

I remember being told by lads over on Tomshardware that Sandybridge is ~40% faster than Nehalem, but that could either be me mis-remembering, or being lied to.

Well, SB is 40% faster (in overall performance) than Nehalem, but only because of higher clocks, both base and turbo. If you put both at the same 3.4GHz, the performance difference between the Core i7 920 and the Core i7 2600K is no more than 10-15%.
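
To make the split between clocks and per-clock gains concrete, here is a small sketch. It assumes total speedup is simply per-clock gain times clock ratio, which ignores memory and turbo behaviour, and the function name is just for illustration. The base clocks in the last line are stock figures I'm adding as an assumption; they are not from the post.

```python
def implied_clock_ratio(total_speedup, per_clock_speedup):
    """Clock ratio needed so that per_clock_speedup * ratio == total_speedup."""
    return total_speedup / per_clock_speedup

# The post's numbers: ~40% overall, 10-15% of it from per-clock gains.
for per_clock in (1.10, 1.15):
    ratio = implied_clock_ratio(1.40, per_clock)
    print(f"per-clock x{per_clock:.2f} -> clocks must be x{ratio:.2f}")

# Assumed stock base clocks (not from the post): i7 920 = 2.66 GHz, i7 2600K = 3.4 GHz.
print(f"base clock ratio: x{3.4 / 2.66:.2f}")
```

The implied clock ratio comes out in the same ballpark as the ratio of the base clocks, which is consistent with most of the 40% coming from frequency.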
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Cough... sorry, I can't teach everyone the common knowledge of system architecture and programming.
Gaming is gaming, IPC is IPC. The former is used to test WHOLE platform performance (including GPU, system bus, memory, I/O); the latter is just a unit of measurement that lets a programmer measure instruction latency. How can you test so-called IPC (not to mention you don't know which instructions you are testing) in a graphics/memory/I/O-stressing scenario? What tools did you use??
It seems so many people still can't figure this out... but I won't worry, I'll just wait and see how things turn out.

IPC is a way of expressing the results of a benchmark, not a theoretical or microarchitectural parameter. It is performance normalized per clock speed for some program running on a fixed/specified instruction set.

We usually don't know the actual IPC of a benchmark because we don't know the number of instructions it executed. But if we run the same program on two different CPUs (assuming the runtime is loading the CPU 100% w/o stalls from other things) and a fixed/known frequency is used, one can measure relative IPC. Saying X performs with Y% faster IPC in some bench is perfectly valid.
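
As a rough sketch of that relative measurement: if both CPUs retire the same instruction stream, the instruction count cancels and only runtime and frequency are needed. The function name and the sample numbers below are hypothetical, and the clocks are assumed to be pinned with the benchmark fully CPU bound.

```python
def relative_ipc(time_a_s, freq_a_ghz, time_b_s, freq_b_ghz):
    """IPC of CPU A divided by IPC of CPU B for the same retired instructions.

    IPC = instructions / cycles and cycles = time * frequency, so with an
    identical instruction count the ratio reduces to cycles_b / cycles_a.
    """
    cycles_a = time_a_s * freq_a_ghz * 1e9
    cycles_b = time_b_s * freq_b_ghz * 1e9
    return cycles_b / cycles_a

# Hypothetical run: A finishes in 10 s at 3.0 GHz, B in 12 s at 3.5 GHz.
ratio = relative_ipc(10.0, 3.0, 12.0, 3.5)
print(f"A has {100 * (ratio - 1):.0f}% higher IPC than B in this bench")
```

Note that the unknown instruction count never appears, which is why only a relative figure comes out of this.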
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
IMO, this is going to be Sandy Bridge vs 'Dozer all over again. However I predict that this time, Zen will have good IPC but it will clock significantly worse than Skylake/Kabylake/etc.

Zen is clearly targeted at the server market where high core counts + more modest frequencies are the norm; think Broadwell-EP.

I think those of you expecting Zen to be able to hit Haswell+ IPC and the crazy clocks that Haswell can hit will be disappointed. We'll see, though. Looking forward to the flames from the usual suspects of course.

I don't think Zen will be an Apple-tier core right off the bat, but there's something to be said for trading peak single threaded performance for efficiency.

If you're running something like a Xeon E5-2699 v4 at heavy load almost all the time then you'd probably have benefited from a uarch that was optimized to hit a max of ~2.5GHz instead of ~4.5GHz. I'm sure Zen will try to push this angle. Much unlike the CON cores, which were the polar opposite: they pulled out all the stops to design a > 5GHz uarch, then put it in servers that were far below that target...

If Zen has a good efficiency tradeoff it could also be reasonable in low power applications like tablets that would never run at 3.5+GHz even with turbo.

Too bad AMD's execution in tablets has been very poor.
 

Nothingness

Platinum Member
Jul 3, 2013
2,402
733
136
IPC is a way of expressing the results of a benchmark, not a theoretical or microarchitectural parameter. It is performance normalized per clock speed for some program running on a fixed/specified instruction set.

We usually don't know the actual IPC of a benchmark because we don't know the number of instructions it executed. But if we run the same program on two different CPUs (assuming the runtime is loading the CPU 100% w/o stalls from other things) and a fixed/known frequency is used, one can measure relative IPC. Saying X performs with Y% faster IPC in some bench is perfectly valid.
Even that isn't necessarily valid: if the benchmark takes different code paths depending on various things the number of instructions will vary (this somehow goes along with your remark about same ISA). Also some of the JS benchmarks show huge variations in IPC depending on calibration.

Another funny thing about IPC (unrelated to what you say) is that poorly optimized programs tend to have higher IPC :biggrin:
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
Well here we count IPC as the Single Thread performance of the CPU at the same clock.

That is more in line with CPI but IPC is what was used all along, so we stick with that.
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
Intel has dedicated massive, highly skilled teams to building upon its very high IPC/high frequency micro-architecture for about a decade since the launch of Conroe. Each new architecture has brought steady improvements in performance-per-clock while clocks have remained largely flat and performance/watt has moved up.

This is no small feat, no matter how much people want to belittle the "modest" perf/clock gains Intel delivers with each new generation.

Here we have AMD, which has substantially fewer resources than Intel and has suffered significant brain-drain over the last decade. They are doing a brand-new "from scratch" architecture, with far fewer engineers and, frankly, I doubt that the "best of the best" have stuck around as AMD has imploded (just look at LinkedIn and you will see what I'm talking about). They've moved to the likes of Apple, Qualcomm, and even Intel.

This new architecture is expected to bring significant performance-per-clock enhancements (+40% over XV if AMD is to be believed) and will be built on Samsung's mobile-centric 14LPP manufacturing process.

Does it really make sense that AMD will be able to match/BEAT Intel across the board as some are claiming in light of the conditions outlined? I think something will be "traded off" here. And I think the trade-off will be clock speed, assuming that the IPC gains are true.

For the server market (which is what Zen is targeting), this isn't a bad trade-off, the Xeon chips from Intel don't exactly come at super high clock speeds either. But for enthusiast desktops, this trade off may make the Zen HEDT chips less desirable than their Intel counterparts.

This post is just a justification for skepticism, not real arguments. Keller was one of the people who designed the A6 and the Macroscalar architecture of the Apple A-Series. I think that if he got enough freedom from AMD, he was able to design a competitive architecture.

This does not mean that I disagree with you. I just think that at this point both possibilities, that Zen will be good or that it will be bad, are equally likely. If you ask me for my personal opinion, it will be good enough to gain some traction, but I do not think that it would make me switch to AMD CPUs. But who knows?
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Even that isn't necessarily valid: if the benchmark takes different code paths depending on various things the number of instructions will vary (this somehow goes along with your remark about same ISA). Also some of the JS benchmarks show huge variations in IPC depending on calibration.

Good point. I meant to say it in the post but I didn't make it clear at all; when I said same program I meant the stuff that's executed dynamically. So same control flow decisions, same working set input etc. Completely deterministic and identical execution.

Also when I say instructions executed I mean instructions retired, so not including mis-speculation. And of course instructions at the defined and common architectural level, so not counting uops or fused instructions or whatever.

Another funny thing about IPC (unrelated to what you say) is that poorly optimized programs tend to have higher IPC :biggrin:

Well, on the one hand vectorizing tends to reduce IPC while increasing performance. On the other hand, making your code more cache, register, and prefetch friendly and have more predictable branches tends to increase IPC while increasing performance.
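
A toy illustration of the vectorizing case, using time = instructions / (IPC * frequency). All the numbers are invented for the example; real instruction counts and IPC would have to come from hardware counters.

```python
def runtime_s(instructions, ipc, freq_ghz):
    """Runtime for a retired instruction count at a given average IPC and clock."""
    return instructions / (ipc * freq_ghz * 1e9)

FREQ_GHZ = 3.0  # same clock for both builds (hypothetical)

# Hypothetical scalar build: many simple instructions, high IPC.
scalar_time = runtime_s(instructions=4e9, ipc=2.0, freq_ghz=FREQ_GHZ)

# Hypothetical vectorized build: 4x fewer instructions, but each does more
# work and the average IPC drops.
vector_time = runtime_s(instructions=1e9, ipc=1.0, freq_ghz=FREQ_GHZ)

print(f"scalar: {scalar_time:.3f} s at IPC 2.0")
print(f"vector: {vector_time:.3f} s at IPC 1.0")
print(f"speedup from vectorizing: {scalar_time / vector_time:.1f}x")
```

In this made-up case the IPC halves while the program still runs twice as fast, which is why IPC only compares meaningfully when the instruction stream is the same.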

Well here we count IPC as the Single Thread performance of the CPU at the same clock.

That is more in line with CPI but IPC is what was used all along, so we stick with that.

IPC and CPI are the reciprocals of each other, I don't know why they'd mean fundamentally different things. There's been a trend to use IPC to mean perf/MHz even when the instructions are different but I don't really like this and try to avoid it.