[Deutsche Bank Conference] AMD's New x86 Core is Zen, Will Launch With K12


NTMBK

Lifer
Nov 14, 2011
Why did they integrate the memory controller with their CPUs ahead of Intel but seem to drag their feet pulling the GPU and chipset features into the CPU?

On this point- AMD did have plans to improve integration, but they got scrapped. http://www.xbitlabs.com/news/cpu/di...r_Corona_Platform_in_Favour_of_New_Chips.html I suspect that this was the beginning of the end for the Bulldozer family- they had already realised that BD was going to be a failure in servers, and were trying to cut their costs by scrapping the new platform and putting resources into other projects.
 

Idontcare

Elite Member
Oct 10, 1999
Knowing more about what actually occurred to result in Bulldozer being what it was would probably help us guess where K12 is headed. If they do ditch CMT I'd think that would mean we see AMD join the SMT club.

I like your optimism regarding the prospects of a deep dive into lessons learned from the development of Bulldozer... but let's be realistic here.

We don't even have detailed, documented lessons learned for some of mankind's most notable successes (the Apollo program and moon landings, the building of the Egyptian pyramids, etc.), when the successive program managers had a resource-rich environment in which the luxury of such deep-dive documentation projects could have been afforded, and yet none were ever embarked upon.

So what are the odds that a resource-deprived environment like AMD's R&D division post-Bulldozer would elect to embark upon a deep-dive analysis of what went wrong with Bulldozer? I'm guessing those odds are bordering on zero.
 

krumme

Diamond Member
Oct 9, 2009
Embedded/semi-custom. I think the issue is not whether AMD can get volumes or not, but whether they can generate enough cash inflow in a time frame tight enough to support a bleeding-edge pipeline. And I think the answer to this question is no.

The cash inflow is still going to come from the traditional markets. I see 3 factors playing a role here - and my priority would be:

1. Intel pricing - meaning how much market share they want AMD to have

2. The Mubadala/GF agreement - meaning how Mubadala values AMD vs GF; that probably has a lot more to do with how they look at their future portfolio vs the political realities and demands

3. The difference in quality of AMD's future products compared to the current portfolio

Are there others?

As for 1 - I don't think Intel will change their historic strategy here, meaning they will make sure AMD stays weak. And they have the cash for it. It's peanuts vs the dollar bills they are wrapping around Atom in the mobile market.

As for 2 - I don't have the slightest idea. Any guess?

As for 3 - until K12 and derivatives, it can only get worse for AMD. So the next 2 years are bad here for sure. And it's questionable how much K12 can improve that. An opportunity arises if Intel is still hurting in the mobile market and needs more profit, meaning fewer dollars to squeeze AMD with.

AMD has a lot of potential tech, but it's apparently difficult to make a profit from it. There is a high risk. Most of it is out of their own hands, as I see it.
 

krumme

Diamond Member
Oct 9, 2009
It's a great slide, really, but unfortunately it is one to which AMD's shareholders respond "Awesome! So when will you actually make some money on this whole 'the future is fusion' phenomenon? Cause all we see is that Intel is making all the profits along the way..."

I agree. It reminds me of the "true quad core" argument.
The revenue needs to materialize within the next 2 years, or they need to dump that sort of innovative work. HSA is hurting both CPU and GPU perf/mm2/watt. I don't understand why they do it so aggressively when they are so small.
 

krumme

Diamond Member
Oct 9, 2009
I don't know the nuances of the company's financials, but to say that Bulldozer [or its successor] was a failure, to me, says that it doesn't work properly and/or never made a profit. I mean, it wasn't as good as its competitors, but neither were ARM CPUs and the like. Just because they aren't as fast as Intel's best doesn't, I think, make them failures.

If you look at the diagram of BD, as e.g. first posted by SA, as Paul DeMone said: this is meant as an integer-pushing machine at high frequency. The obvious intention for the arch was the server market, as he said.

Well, BD just didn't push the integers, and the high-frequency design never materialized. It just doesn't work for the server market it was intended for. For me, that means it simply failed. That must be the definition of a failure of a product: it couldn't succeed in the goals you had for it.

Well, that happens all the time, all over. It's part of business for anyone competent who takes risks to earn profit. Risk and profit are tied closely together. When you miss: just suck it up and move on.
 
Aug 11, 2008
Fusion makes perfect sense in theory. They are trying to leverage the only area where they have an advantage over Intel, and that is iGPU performance. Unfortunately for AMD, graphics performance is held back by bandwidth limitations, and their process being behind makes it difficult to add DRAM like Iris Pro. The problem with HSA is that the software is not there, and it is very hard to drive software development when you have such a small share of the market. Even look at AVX2: before the bug was found, and after a year on the market, very few applications were developed/modified to use it. And this is with the market share and resources of Intel behind it. It seems it would be much harder for AMD and Fusion. Still, with their process problems and lack of resources, it seems like HSA is the best chance to take back a big share of the market. It will be interesting to see if HSA is still a big emphasis with the K12.

I was a great fan of the purchase of ATI and Fusion initially, but one wonders now how things would have developed if the money spent on the acquisition had been used to stay out of debt and work on CPU development and other projects. But done is done, and it doesn't really matter.
 

krumme

Diamond Member
Oct 9, 2009
I hope some university is pursuing a case study of Bulldozer's development. There must be some great lessons around why they stuck with that cache structure. Why did they choose MHz over IPC even while the P4 was burning up? Why didn't they go whole hog on their POWER/SPARC vision and add SMT to their CMT design for maximum threads? What time, if any, did they spend determining exactly what sort of instruction patterns are used in deployed software that Intel's Core 2+ CPUs handle much better? Were there patent barriers involved in the notable stagnation of CPU performance on said deployed software? Why did they integrate the memory controller with their CPUs ahead of Intel but seem to drag their feet pulling the GPU and chipset features into the CPU?

Knowing more about what actually occurred to result in Bulldozer being what it was would probably help us guess where K12 is headed. If they do ditch CMT I'd think that would mean we see AMD join the SMT club.

From the outside it looks like BD is a very complex design - agree?
It's simply beyond my understanding how a management team can take on such a task when prior designs were partly brought in from outside and the new arch was so complicated.
To me, perhaps they lost their sense of reality and overestimated how much they could do. A normal human reaction - typical in a male culture.
 

Abwx

Lifer
Apr 2, 2011
If you look at the diagram of BD, as e.g. first posted by SA, as Paul DeMone said: this is meant as an integer-pushing machine at high frequency. The obvious intention for the arch was the server market, as he said.

Well, BD just didn't push the integers, and the high-frequency design never materialized. It just doesn't work for the server market it was intended for. For me, that means it simply failed. That must be the definition of a failure of a product: it couldn't succeed in the goals you had for it.

Well, that happens all the time, all over. It's part of business for anyone competent who takes risks to earn profit. Risk and profit are tied closely together. When you miss: just suck it up and move on.

BD lacks a node shrink to be fully competitive in the server market, that and eventually the SR cores; all the rest is starting from the consequences, the financials, to guess at the causes.

As said, they are restricted by the available processes. Looking at the numbers, a 32nm Vishera is competitive perf/watt-wise in MT with a 22nm Intel 4C/4T but is lacking against a 4C/8T; the delta between the two would be largely covered by a node shrink and even noticeably surpassed if Steamroller cores were used instead.
 

krumme

Diamond Member
Oct 9, 2009
BD lacks a node shrink to be fully competitive in the server market, that and eventually the SR cores; all the rest is starting from the consequences, the financials, to guess at the causes.

As said, they are restricted by the available processes. Looking at the numbers, a 32nm Vishera is competitive perf/watt-wise in MT with a 22nm Intel 4C/4T but is lacking against a 4C/8T; the delta between the two would be largely covered by a node shrink and even noticeably surpassed if Steamroller cores were used instead.

There is a reason we don't see PD on e.g. 20nm. We see the consoles using a TSMC process. So Mubadala can be reasoned with. If they got proof that a 20nm PD was good business in the very high-profit server market, I am sure we would have seen it.

There is a reason BD always lacks a node shrink. And it's tied to the product. Either too much complexity to make it profitable to move to a new node, or simply because it will not be competitive either way. Whatever the reason, it's because it's a failure.

And I maintain it's a failure within its own goals - not even counting Intel or processes. The high frequency is missing, and so is the massive integer pushing we can expect from such an arch.
 

DrMrLordX

Lifer
Apr 27, 2000
Fusion makes perfect sense in theory. They are trying to leverage the only area where they have an advantage over Intel, and that is iGPU performance. Unfortunately for AMD, graphics performance is held back by bandwidth limitations, and their process being behind makes it difficult to add DRAM like Iris Pro. The problem with HSA is that the software is not there, and it is very hard to drive software development when you have such a small share of the market.

Their best hope was (and still is) open source. There are plenty of popular open source software packages out there that they can recode to take advantage of HSA. Those programs may represent a market niche, but owning a niche lock, stock, and barrel is sure better than nothing.

So, the real question is, if Rory put out a memo to hire on some developers and start doing open source project forks geared towards HSA/OpenCL 2.0 optimization, could they do it? Does AMD even have the software tools available in-house to make things work at least on a limited selection of hardware, such as the Asus A88X-Pro using that beta driver?
 

krumme

Diamond Member
Oct 9, 2009
Their best hope was (and still is) open source. There are plenty of popular open source software packages out there that they can recode to take advantage of HSA. Those programs may represent a market niche, but owning a niche lock, stock, and barrel is sure better than nothing.

So, the real question is, if Rory put out a memo to hire on some developers and start doing open source project forks geared towards HSA/OpenCL 2.0 optimization, could they do it? Does AMD even have the software tools available in-house to make things work at least on a limited selection of hardware, such as the Asus A88X-Pro using that beta driver?

Lock, stock, and barrel is super fine and viable - as we see with the consoles. But what niche market are we talking about with HSA and open source?
And what is the approximate projected revenue vs. cost for fatter CPU/GPU parts?
 

DrMrLordX

Lifer
Apr 27, 2000
Lock, stock, and barrel is super fine and viable - as we see with the consoles. But what niche market are we talking about with HSA and open source?
And what is the approximate projected revenue vs. cost for fatter CPU/GPU parts?

Okay, let's say they port Blender to HSA/OpenCL2.0, and the 7850k beats a 4790k handily in Blender benchmarks given those optimizations.

Then someone says, "okay, I'll run it on that 4790k with <insertNvidiaGPUhere> using CUDA". So AMD drops in a cost-equivalent GCN video card with the 7850k and still wins.

(yes, wishful thinking)

Given that hypothetical chain of events, one could then say that AMD owns the Blender userbase. Being able to beat a $300+ Intel chip with a $180 Kaveri on something as non-trivial as a widely-used 3D rendering/animation package would be a stunner. Nobody really seems to care that Kaveri dominates LibreOffice spreadsheet calculation benchmarks or certain LuxMark benches. But Blender would be huge.

That kind of a win would force every group/company that publishes a 3D rendering package to rethink HSA.

Unfortunately, there would be some major problems with such a scenario even if AMD were to do something like that with one or more open source software projects. If a bunch of firms approached AMD and said, "Hey we really love what you did with Blender. Can you give us the support we need to do the same thing with our application?", AMD might be unable to deliver that support.

Also, given the current state of affairs, it might be that the only way such optimized software could take advantage of Kaveri would be for the end-user to use the A88x-Pro with a beta driver. I'm sure not everyone wants to be forced to use the same motherboard. That would certainly be a headache for anyone who has already bought into Kaveri without getting that particular board.
 

AtenRa

Lifer
Feb 2, 2009
There is a reason we don't see PD on e.g. 20nm. We see the consoles using a TSMC process. So Mubadala can be reasoned with. If they got proof that a 20nm PD was good business in the very high-profit server market, I am sure we would have seen it.

There is a reason BD always lacks a node shrink. And it's tied to the product. Either too much complexity to make it profitable to move to a new node, or simply because it will not be competitive either way. Whatever the reason, it's because it's a failure.

And I maintain it's a failure within its own goals - not even counting Intel or processes. The high frequency is missing, and so is the massive integer pushing we can expect from such an arch.

I shouldn't post those two slides now, but just to let you see how competitive 28nm SR can be against 22nm FF Haswell in performance.

Floating Point
[benchmark slide]

Integer
[benchmark slide]


The A10-7700K base frequency is only 3.4GHz; the Core i3 4330 is at 3.5GHz. Running those two applications, the A10-7700K never goes above 3.5GHz with all four cores.
It has been said so many times before: the BD architecture was created for throughput, and it performs very nicely at that. Piledriver and SR have a lot of throughput performance; what they lack is a better process.
A 4-module 8-core SR FX CPU at 20nm could be very competitive against 22nm Haswell, even in power usage. Unfortunately the 20nm high-performance process was not ready when it should have been for that product.
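The per-clock framing in this comparison (a 3.4GHz A10-7700K vs a 3.5GHz i3 4330) can be made explicit by normalizing a throughput score by the sustained clock. A minimal sketch - the scores below are hypothetical placeholders, not the numbers from the slides:

```python
# Compare per-clock throughput by dividing a multithreaded benchmark
# score by the sustained clock (in GHz) each chip actually ran at.
def per_clock(score: float, ghz: float) -> float:
    return score / ghz

# Hypothetical scores: chip A sustains 3.5 GHz, chip B sustains 3.4 GHz.
a = per_clock(700.0, 3.5)   # 200.0 points per GHz
b = per_clock(690.0, 3.4)   # ~202.9 points per GHz
print(a, b)  # the lower-clocked chip can still lead per clock
```

This is only the architecture-level comparison; it deliberately ignores power and die size, which is exactly where the process argument comes back in.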
 

krumme

Diamond Member
Oct 9, 2009
I shouldn't post those two slides now, but just to let you see how competitive 28nm SR can be against 22nm FF Haswell in performance.

Floating Point
[benchmark slide]

Integer
[benchmark slide]


The A10-7700K base frequency is only 3.4GHz; the Core i3 4330 is at 3.5GHz. Running those two applications, the A10-7700K never goes above 3.5GHz with all four cores.
It has been said so many times before: the BD architecture was created for throughput, and it performs very nicely at that. Piledriver and SR have a lot of throughput performance; what they lack is a better process.
A 4-module 8-core SR FX CPU at 20nm could be very competitive against 22nm Haswell, even in power usage. Unfortunately the 20nm high-performance process was not ready when it should have been for that product.

Those graphs show perfectly why BD is a failure.

In a server context, integer throughput is what matters most. Therefore BD is a module with - as in this example - 4 integer units vs the Intel unit with 2 cores. It's effectively a quad core in server context - and should be expected to perform as such, because the die size matches the 4 cores - and it was designed to be a server part. Well, it doesn't. Not by a long stretch - as it shows even when the front end is given an extensive overhaul.

And btw it shows why Carrizo (whatever the reason is for that stupid product), even with a super beefed-up dual FPU, doesn't matter. Perhaps it even makes things worse for die size/power vs the competing solutions.

Yes, K12 can't come soon enough. Keller did a damn impressive job at Apple, which btw nobody but Charlie predicted years back - and results that came unexpected for most of us. Let's hope he can bring some much-needed competition. 2016 is coming fast.
 

krumme

Diamond Member
Oct 9, 2009
Okay, let's say they port Blender to HSA/OpenCL2.0, and the 7850k beats a 4790k handily in Blender benchmarks given those optimizations.

Then someone says, "okay, I'll run it on that 4790k with <insertNvidiaGPUhere> using CUDA". So AMD drops in a cost-equivalent GCN video card with the 7850k and still wins.

(yes, wishful thinking)

Given that hypothetical chain of events, one could then say that AMD owns the Blender userbase. Being able to beat a $300+ Intel chip with a $180 Kaveri on something as non-trivial as a widely-used 3D rendering/animation package would be a stunner. Nobody really seems to care that Kaveri dominates LibreOffice spreadsheet calculation benchmarks or certain LuxMark benches. But Blender would be huge.

That kind of a win would force every group/company that publishes a 3D rendering package to rethink HSA.

Unfortunately, there would be some major problems with such a scenario even if AMD were to do something like that with one or more open source software projects. If a bunch of firms approached AMD and said, "Hey we really love what you did with Blender. Can you give us the support we need to do the same thing with our application?", AMD might be unable to deliver that support.

Also, given the current state of affairs, it might be that the only way such optimized software could take advantage of Kaveri would be for the end-user to use the A88x-Pro with a beta driver. I'm sure not everyone wants to be forced to use the same motherboard. That would certainly be a headache for anyone who has already bought into Kaveri without getting that particular board.

I don't doubt that HSA for 3D rendering is so superior it's a game-changer for the market if it's used. It will just chew a CPU to pieces.

I just doubt Blender is a big enough market with a profit that matters. And if, say, Blender adopts HSA, what is the chance professional solutions will adopt HSA? It's different segments, and you don't move from a professional product to Blender because of HSA.

And then - if HSA is adopted - what is the benefit for AMD vs years and years of starving to get there? Of course there is solid benefit if they get there - but damn, it's a huge risk for the entire company.
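The "chew a CPU to pieces" claim can be bounded with Amdahl's law: however fast the GPU path is, the overall render speedup is capped by whatever fraction stays serial on the CPU. A minimal sketch - the offload fractions and the 10x factor are made-up illustrative assumptions, not measured Blender or Kaveri numbers:

```python
# Amdahl's law: overall speedup when a fraction p of the work is
# accelerated by a factor s, and the remaining (1 - p) stays serial.
def amdahl_speedup(p: float, s: float) -> float:
    return 1.0 / ((1.0 - p) + p / s)

# Hypothetical: 90% of a render offloads to the iGPU at 10x speed...
print(amdahl_speedup(0.90, 10.0))   # ~5.26x overall
# ...but if only 60% offloads, the ceiling drops sharply.
print(amdahl_speedup(0.60, 10.0))   # ~2.17x overall
```

Under these toy numbers, shrinking the offloadable fraction from 90% to 60% cuts the ceiling from roughly 5.3x to 2.2x - the dramatic wins only appear when nearly the whole workload maps to the GPU, which is part of why the business case is so risky.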
 

AtenRa

Lifer
Feb 2, 2009
Those graphs show perfectly why BD is a failure.

In a server context, integer throughput is what matters most. Therefore BD is a module with - as in this example - 4 integer units vs the Intel unit with 2 cores. It's effectively a quad core in server context - and should be expected to perform as such, because the die size matches the 4 cores - and it was designed to be a server part. Well, it doesn't. Not by a long stretch - as it shows even when the front end is given an extensive overhaul.

One BD module is almost the same die size as Intel's big core on the same process. I have just shown you that at the same clock the SR module has higher integer and float throughput than a Haswell core. If we had a 20nm SR module it would be almost the same size, it would have higher throughput, and close to the same power consumption vs a 22nm Haswell core.

So the architecture is not a failure, but the end product (CPU) needs a competitive process to directly compete against 22nm and future Intel CPUs on all fronts.
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
One BD module is almost the same die size as Intel's big core on the same process.

Where do you get that from? That's contrary to my expectations.

TSMC 20nm is far more comparable to Intel 14nm if we talk density (obviously not perf or cost). I know 90% in these forums think Intel process technology is as dense as TSMC's, far higher performing, and just as cheap. Well, I don't assume that kind of dreaming nonsense.

I assume we look at the overall die size including L3?
 

monstercameron

Diamond Member
Feb 12, 2013
Fusion makes perfect sense in theory. They are trying to leverage the only area where they have an advantage over Intel, and that is iGPU performance. Unfortunately for AMD, graphics performance is held back by bandwidth limitations, and their process being behind makes it difficult to add DRAM like Iris Pro. The problem with HSA is that the software is not there, and it is very hard to drive software development when you have such a small share of the market. Even look at AVX2: before the bug was found, and after a year on the market, very few applications were developed/modified to use it. And this is with the market share and resources of Intel behind it. It seems it would be much harder for AMD and Fusion. Still, with their process problems and lack of resources, it seems like HSA is the best chance to take back a big share of the market. It will be interesting to see if HSA is still a big emphasis with the K12.

I was a great fan of the purchase of ATI and Fusion initially, but one wonders now how things would have developed if the money spent on the acquisition had been used to stay out of debt and work on CPU development and other projects. But done is done, and it doesn't really matter.

It is literally around the corner - read up on Java Sumatra and C++ AMP. Fusion is the future; that is why even Intel is doing a hUMA-style interface and OCL 2.0 in their upcoming products.
 

AtenRa

Lifer
Feb 2, 2009
Where do you get that from? That's contrary to my expectations.

TSMC 20nm is far more comparable to Intel 14nm if we talk density (obviously not perf or cost). I know 90% in these forums think Intel process technology is as dense as TSMC's, far higher performing, and just as cheap. Well, I don't assume that kind of dreaming nonsense.

I assume we look at the overall die size including L3?


BD module at 32nm SOI without 2MB L2 cache = ~18mm2.
SB core at 32nm without 2MB L3 cache = ~18.4mm2.

[die shot images]

Haswell core size without 2MB L3 cache at 22nm = ~14.5mm2.

[die size comparison image]
 

Phynaz

Lifer
Mar 13, 2006
One BD module is almost the same die size as Intel's big core on the same process. I have just shown you that at the same clock the SR module has higher integer and float throughput than a Haswell core. If we had a 20nm SR module it would be almost the same size, it would have higher throughput, and close to the same power consumption vs a 22nm Haswell core.

So the architecture is not a failure, but the end product (CPU) needs a competitive process to directly compete against 22nm and future Intel CPUs on all fronts.

Quit with the fairy tales. You have no idea what an AMD CPU manufactured on an Intel process would perform like. Nobody does.

If what you were saying is true, then AMD's current 32nm cores should perform the same and consume the same power as an Intel 32nm core. Guess what, they are lower performing and consume more power than Intel's last generation CPU.
 

AtenRa

Lifer
Feb 2, 2009
Quit with the fairy tales. You have no idea what an AMD CPU manufactured on an Intel process would perform like. Nobody does.

If what you were saying is true, then AMD's current 32nm cores should perform the same and consume the same power as an Intel 32nm core. Guess what, they are lower performing and consume more power than Intel's last generation CPU.

Actually 32nm BD CPUs have higher throughput than 32nm Intel CPUs.

A few samples, take a look at FX8350 vs Core i7 3820 and 2600K

http://www.guru3d.com/articles_pages/amd_fx_8350_processor_review,1.html
[benchmark charts]

Even in Cinebench:

[Cinebench chart]

Faster than Core i7 3820, higher power usage, understandable.

[power consumption chart]
 

mrmt

Diamond Member
Aug 18, 2012
If what you were saying is true, then AMD's current 32nm cores should perform the same and consume the same power as an Intel 32nm core. Guess what, they are lower performing and consume more power than Intel's last generation CPU.

The only rational reason for someone to defend Bulldozer after AMD itself declared it an unmitigated failure is if someone has to make a living selling Bulldozer chips.

Not that a failed product can't be a good deal for consumers, but it is nevertheless a failure.
 

Vesku

Diamond Member
Aug 25, 2005
I like your optimism regarding the prospects of a deep dive into lessons learned from the development of Bulldozer... but let's be realistic here.

We don't even have detailed, documented lessons learned for some of mankind's most notable successes (the Apollo program and moon landings, the building of the Egyptian pyramids, etc.), when the successive program managers had a resource-rich environment in which the luxury of such deep-dive documentation projects could have been afforded, and yet none were ever embarked upon.

So what are the odds that a resource-deprived environment like AMD's R&D division post-Bulldozer would elect to embark upon a deep-dive analysis of what went wrong with Bulldozer? I'm guessing those odds are bordering on zero.

Which is why I said "University" but your observation rings true. I don't think even things like the successful NASA space program have been fully 'lessons learned' examined in appropriate detail by Universities or otherwise.
 

monstercameron

Diamond Member
Feb 12, 2013
The only rational reason for someone to defend Bulldozer after AMD itself declared it an unmitigated failure is if someone has to make a living selling Bulldozer chips.

Not that a failed product can't be a good deal for consumers, but it is nevertheless a failure.

Because failure is relative - everything has its pros and cons.