AMD to manufacture console APU at GLF


PPB

Golden Member
Jul 5, 2013
1,118
168
106
CMT was designed for Fusion as a streamlined approach: an FPU will always be needed for not-so-parallel, latency-dependent FP workloads. For the rest of the FP workloads the GPU will be there.

That's why I think Fusion is so late; HSA should have been in the very first BD-core APU (Trinity) to begin with. Considering BD was already massively late, AMD missed their Fusion goal by something like 4 years.
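To make the distinction concrete, here's a minimal sketch of the two kinds of FP work (plain Python/NumPy, my own illustration, not anything from AMD's HSA stack):

Code:
# Illustrative sketch only (plain NumPy, no HSA/OpenCL runtime involved):
# the two kinds of FP work discussed above.
import numpy as np

def latency_bound_fp(x0: float, n: int) -> float:
    """Loop-carried recurrence: each step needs the previous result,
    so it lives or dies by the CPU FPU's latency, not by core count."""
    x = x0
    for _ in range(n):
        x = 0.5 * (x + 2.0 / x)  # Newton iteration converging to sqrt(2)
    return x

def throughput_bound_fp(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """A million independent multiply-adds: exactly the shape of work a
    Fusion/HSA-style runtime could hand to the on-die GPU instead."""
    return a * b + 1.0

print(latency_bound_fp(1.0, 20))                 # serial, latency-dependent
print(throughput_bound_fp(np.ones(1 << 20),
                          np.full(1 << 20, 3.0))[:3])  # data-parallel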
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
CMT was designed for Fusion as a streamlined approach: an FPU will always be needed for not-so-parallel, latency-dependent FP workloads. For the rest of the FP workloads the GPU will be there.

That's why I think Fusion is so late; HSA should have been in the very first BD-core APU (Trinity) to begin with. Considering BD was already massively late, AMD missed their Fusion goal by something like 4 years.

Are you saying the GPU was to handle all SSE/AVX/FMA loads, both INT and FP? The problem with the GPU is, to make it as useful as the CPU's FPU, you have to sacrifice performance.

CMT was a "moar cores" gamble that went terribly wrong. I doubt it had anything to do with Fusion.

Even Puma isn't listed as HSA. So HSA already looks dead before it ever even took off.

Also in "Fusion" terms, they got beaten timewise there too. The entire ATI buy added yet another failure to its long list. And losing the fabs is another one, something that ultimately sealed AMD's fate as a VIA contender. And to make things worse, they now have to depend on GloFo for both process node and volume.
 
Last edited:

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
I think NTMBK and AtenRa are correct in that switching back to an improved K10 core in 2009 wouldn't have given AMD the time to completely overhaul the architecture and create a new base to build on. So by 2009, sadly, neither decision was optimal. The quick port of the BD uArch to 32nm resulted in disappointment (in part because GF's 32nm process wasn't mature enough).

Beating AMD into the ground over a decision made almost a decade ago doesn't serve much purpose, except as a tech business case study for MBAs.

In 2014, w/SR, AMD will likely have the IPC it needed in 2011 - that's how far behind they are. Too bad AMD was unable to fund a HP node for Kaveri, but that's the way capitalism works - it isn't kind to a series of poor decisions. There are good reasons why Ruiz and Dirk are gone - and why the company is still saddled with cleaning up, as best it can, after those two nearly gutted the company (along with an all too complicit BOD).

At this point, the BD uArch is clearly moving towards beefier CPUs (w/EX) and greater effective decode/issue width (w/SR). That is, AMD's signature x86 core is becoming much more like a standard multi-core design, and less like the initial CMT design that was BD.

What remains are AMD's precarious financial position and their ongoing problems with their main fab partner, GlobalFoundries. One can only hope that Samsung's team can help GF improve their 28nm node and accelerate the ramp-up on 20nm.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
I think NTMBK and AtenRa are correct in that switching back to an improved K10 core in 2009 wouldn't have given AMD the time to completely overhaul the architecture and create a new base to build on. So by 2009, sadly, neither decision was optimal. The quick port of the BD uArch to 32nm resulted in disappointment (in part because GF's 32nm process wasn't mature enough).

Neither decision was optimal, but scrapping Bulldozer would have allowed a new chip based on Husky by 2013, or a restructuring toward other markets on much better terms than they have now. OTOH, going full steam ahead with Bulldozer crashed them in every single market where they fielded the thing.
 

PPB

Golden Member
Jul 5, 2013
1,118
168
106
Are you saying the GPU was to handle all SSE/AVX/FMA loads, both INT and FP? The problem with the GPU is, to make it as useful as the CPU's FPU, you have to sacrifice performance.

CMT was a "moar cores" gamble that went terribly wrong. I doubt it had anything to do with Fusion.

Even Puma isn't listed as HSA. So HSA already looks dead before it ever even took off.

Also in "Fusion" terms, they got beaten timewise there too. The entire ATI buy added yet another failure to its long list. And losing the fabs is another one, something that ultimately sealed AMD's fate as a VIA contender. And to make things worse, they now have to depend on GloFo for both process node and volume.

For the rest of the FP workloads the GPU will be there.


If you wanna try to troll someone, you should at least learn to read first.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
If you wanna try to troll someone, you should at least learn to read first.

Where is the SSE/AVX/FMA executed in AMD's designs? How much can even be processed by the GPU without severe performance penalties? A few hundred apps, if OpenCL and CUDA are any indicator?
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
Are you saying the GPU was to handle all SSE/AVX/FMA loads, both INT and FP? The problem with the GPU is, to make it as useful as the CPU's FPU, you have to sacrifice performance.

The problem with the GPU doing FP is that it's only good for high-bandwidth applications, of which there are not many aside from games. Also, APUs only have dual-channel memory, and inefficient dual-channel at that, which cripples that capability anyway.
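Quick back-of-the-envelope sketch of why that bandwidth ceiling matters (my own illustrative figures for a dual-channel DDR3 APU, not measured data):

Code:
# Rough roofline-style arithmetic; the two inputs are assumptions, not specs.
mem_bw_gbs = 30.0    # assumed dual-channel DDR3 bandwidth, GB/s
gpu_gflops = 700.0   # assumed peak single-precision GPU rate, GFLOP/s

# FLOPs needed per byte of memory traffic before the GPU, rather than the
# memory bus, becomes the limiter.
break_even = gpu_gflops / mem_bw_gbs
print("need ~%.0f FLOPs per byte to stay compute-bound" % break_even)

# A single-precision streaming kernel like y = a*x + y moves 12 bytes
# (read x, read y, write y) for 2 FLOPs, i.e. ~0.17 FLOPs/byte - far below
# that line, so it is pinned to the memory bus no matter how fast the GPU is.
print("streaming kernel: %.2f FLOPs per byte" % (2.0 / 12.0))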

CMT was a "moar cores" gamble that went terrible wrong. I doubt it had anything to do fusion.
More cores is the way servers are going, so it wasn't a bad idea per se, it was a poor implementation with insufficient ST performance. The 32nm design was rushed and the 32nm GF process was immature. Weak + Weak == Lame.

Even Puma isn't listed as HSA. So HSA already looks dead before it ever even took off.

It doesn't seem like HSA is anything developers are very interested in spending time on, from what I've read. And developers are everything when it comes to introducing new CPU features. You either win them over, or you don't.

Also in "Fusion" terms, they got beaten timewise there too. The entire ATI buy added yet another failure to its long list. And the lost fabs is another one. Something that ultimately sealed AMDs fate as a VIA contender. And to make things worse, having to depend on GloFo in both processnode and volume.

Yes, AMD's past problems will haunt them for a decade. Becoming an ODM seems to be the worst possible mistake a "higher performance" CPU designer can make. Not having control over the process node (i.e. no SHP for 28nm) can break even the best designs - and right now AMD doesn't have the best, making their need for a better process node even more critical.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Yes, AMD's past problems will haunt them for a decade. Becoming an ODM seems to be the worst possible mistake a "higher performance" CPU designer can make. Not having control over the process node (i.e. no SHP for 28nm) can break even the best designs - and right now AMD doesn't have the best, making their need for a better process node even more critical.

They actually sit in a lose/lose situation. While being fabless, they can't even pick the best fabs due to the WSA. And the foundry partner in the WSA is the worst foundry - over 2 years behind TSMC on 28nm alone.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
Neither decision was optimal, but scrapping Bulldozer would have allowed a new chip based on Husky by 2013, or a restructuring toward other markets on much better terms than they have now. OTOH, going full steam ahead with Bulldozer crashed them in every single market where they fielded the thing.

I think that after cancelling 45nm BD, AMD was in panic mode, and neither the management nor the BOD was capable of making a level-headed choice at the time. At least this appears to be the case outwardly. I suppose one could argue that a 32nm implementation of Husky w/L3$ may have been just as good as BD, but AMD still would have been in need of a new uArch. If they had restarted in 2009, it might have been due sometime next year. In 2009, very few were expecting the PC market to crash - not even Intel. Intel just has far superior financials and a much deeper engineering bench to call upon (and their own fabs!).
 

SiliconWars

Platinum Member
Dec 29, 2012
2,346
0
0
They actually sit in a lose/lose situation. While being fabless, they can't even pick the best fabs due to the WSA. And the foundry partner in the WSA is the worst foundry - over 2 years behind TSMC on 28nm alone.

GF is not "over 2 years behind TSMC" on 28nm, it's less than 1 year. They were also first to HKMG and 32nm well before TSMC was on 28nm.

AMD can also fab chips at any fab they choose, so um yeah.
 

SiliconWars

Platinum Member
Dec 29, 2012
2,346
0
0
Yes, AMD's past problems will haunt them for a decade. Becoming an ODM seems to be the worst possible mistake a "higher performance" CPU designer can make. Not having control over the process node (i.e. no SHP for 28nm) can break even the best designs - and right now AMD doesn't have the best, making their need for a better process node even more critical.

It was AMD's decision to drop SOI, not GF's.

While it's in vogue to slaughter AMD and GF over the WSA, please remember that Intel is sitting with fabs more than half empty and taking penalties on that. If AMD still had the Dresden fabs, they'd be doing that as well.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
GF is not "over 2 years behind TSMC" on 28nm, it's less than 1 year. They were also first to HKMG and 32nm well before TSMC was on 28nm.

Non-risk production started at TSMC (GFX ASICs) around 1Q12. Non-risk production at GF (Kaveri) started in 4Q13 - so a bit less than 2 years behind. Rockchip launched on risk production, TTBOMK.

It was AMD's decision to drop SOI, not GF's.

While it's in vogue to slaughter AMD and GF over the WSA, please remember that Intel is sitting with fabs more than half empty and taking penalties on that. If AMD still had the Dresden fabs, they'd be doing that as well.

Yes, AMD didn't want to fund R&D for a high-performance node; they did this to conserve funds. So I think it was a somewhat forced decision not to fund GF's R&D - not a completely free choice.

I don't find it fashionable to smash AMD because of the WSA. It's a done deal, so it's just a fact that AMD has to deal with. You make a good point that AMD would be dealing with low production at its fabs (with high fixed costs). I just believe it's also likely that AMD would be producing a node more suitable for a high-performance CPU than GF's 28nm LP.

I like AMD, I don't want them to fail. But I don't like seeing baby Zebras getting snatched by Lions either. It's just that I have no control over either situation. It is what it is.

I guess we've beaten this topic to death again :\
 

SiliconWars

Platinum Member
Dec 29, 2012
2,346
0
0
Non-risk production started at TSMC (GFX ASICs) around 1Q12. Non-risk production at GF (Kaveri) started in 4Q13 - so a bit less than 2 years behind. Rockchip launched on risk production, TTBOMK.

I don't think so; it's a different 28nm process from what AMD uses, but Rockchip had the RK3188 on show at CES in January this year.

At least 2 different products were available in March - http://www.youtube.com/watch?v=EoIcrcK3esk and http://www.youtube.com/watch?v=aMccDZG4p_8

It should also be remembered that GF targeted 32nm at the same time TSMC targeted 28nm, because AMD was the only customer at that time.

Yes, AMD didn't want to fund R&D for a high-performance node; they did this to conserve funds. So I think it was a somewhat forced decision not to fund GF's R&D - not a completely free choice.

I don't find it fashionable to smash AMD because of the WSA. It's a done deal, so it's just a fact that AMD has to deal with. You make a good point that AMD would be dealing with low production at its fabs (with high fixed costs). I just believe it's also likely that AMD would be producing a node more suitable for a high-performance CPU than GF's 28nm LP.
Kaveri is on 28nm HPP - http://www.globalfoundries.com/technology/28nm.aspx

Better link here - http://www.globalfoundries.com/technology/28HPP.aspx

The only real difference is ditching SOI and going bulk at 28nm, which AMD did to save on R&D (they were paying for SOI R&D at GF) and probably because it makes the most sense going forward.

I'm just pointing this out because so many people believe GF is at fault here, but all of these decisions were made by AMD. Jaguar could probably have been made on that 28LP process (the one Rockchip is using) instead of at TSMC, and that might have saved them the huge take-or-pay penalty they paid last year.
 
Last edited:

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
More cores is the way servers are going, so it wasn't a bad idea per se; it was a poor implementation with insufficient ST performance.

It's not about moar cores only, but about moar cores with excellent scalability and extremely efficient power management. AMD lacks in all of these disciplines. AMD chips cannot scale as well as Intel's Core designs, even at the same node; they lack single-threaded performance, and they have atrocious power consumption and power management.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
It's not about moar cores only, but about moar cores with excellent scalability and extremely efficient power management. AMD lacks in all of these disciplines. AMD chips cannot scale as well as Intel's Core designs, even at the same node; they lack single-threaded performance, and they have atrocious power consumption and power management.

There was no scalability problem with BD Opterons, and the power problem wasn't as severe at server CPU speeds as it was for high-clocked consumer CPUs (though it was still an issue). If the ST performance had been there to match, it wouldn't have been an issue; sadly for AMD, as I already said, ST sucked.

I don't understand why your logic is so brittle in regard to AMD. They are just another CPU company.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
I don't think so; it's a different 28nm process from what AMD uses, but Rockchip had the RK3188 on show at CES in January this year.

At least 2 different products were available in March - http://www.youtube.com/watch?v=EoIcrcK3esk and http://www.youtube.com/watch?v=aMccDZG4p_8

It should also be remembered that GF targeted 32nm at the same time TSMC targeted 28nm, because AMD was the only customer at that time.

Kaveri is on 28nm HPP - http://www.globalfoundries.com/technology/28nm.aspx

Better link here - http://www.globalfoundries.com/technology/28HPP.aspx

The only real difference is ditching SOI and going bulk at 28nm, which AMD did to save on R&D (they were paying for SOI R&D at GF) and probably because it makes the most sense going forward.

I'm just pointing this out because so many people believe GF is at fault here, but all of these decisions were made by AMD. Jaguar could probably have been made on that 28LP process (the one Rockchip is using) instead of at TSMC, and that might have saved them the huge take-or-pay penalty they paid last year.

Thanks for the links :)
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
It's not about moar cores only, but about moar cores with excellent scalability and extremely efficient power management. AMD lacks in all of these disciplines. AMD chips cannot scale as well as Intel's Core designs, even at the same node; they lack single-threaded performance, and they have atrocious power consumption and power management.

Every time you open your mouth and say the word "AMD" you FUD/mislead beyond belief.

Dual Intel Xeon E5-2660: 8 cores / 16 threads, 32nm, 95W TDP
Dual Opteron 6380: 8 modules / 16 cores, 32nm, 115W TDP

This is total server power consumption under an LS-DYNA workload.
PS: this is not a throughput workload but single-application response time.

http://www.anandtech.com/show/6508/the-new-opteron-6300-finally-tested/11
[Chart: LS-DYNA total server power consumption, from the linked review]
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
You forgot to list the performance associated with that power consumption, didn't you?

mrmt is right.

Why do you think AMD is essentially non-existent in the server space today?
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
From

to a 32W difference, it's FUD and misleading no matter how you look at it. But he continues his anti-AMD crusade in every thread, in every sentence he makes.

The performance would have to be identical, wouldn't it? Otherwise we could just as well use 32nm Atoms for the comparison.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
The performance would have to be identical, wouldn't it? Otherwise we could just as well use 32nm Atoms for the comparison.

Now we care about performance??? I didn't see you talk about the performance of Vishera when you were talking about power usage. But you were happy posting just the power consumption graphs :rolleyes:

Even so, the performance difference in that specific application was close to 8%. I don't see the atrocious power consumption that mrmt was talking about.
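For what it's worth, here's the perf-per-watt arithmetic that argument turns on. Only the roughly 32W delta and the ~8% performance gap come from this thread; the absolute system power below is a made-up placeholder, not a number from the review.

Code:
# Hypothetical perf/W comparison; only the 32W delta and ~8% gap are from
# the thread, the 350W baseline is a placeholder, not AnandTech data.
xeon_power_w    = 350.0                # assumed dual-Xeon system power
opteron_power_w = xeon_power_w + 32.0  # ~32W higher, per the chart above
xeon_perf       = 1.00                 # normalised LS-DYNA result
opteron_perf    = 1.00 / 1.08          # ~8% slower, per the post above

gap = (xeon_perf / xeon_power_w) / (opteron_perf / opteron_power_w) - 1
print("Xeon perf/W advantage in this scenario: ~%.0f%%" % (gap * 100))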
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Now we care about performance??? I didn't see you talk about the performance of Vishera when you were talking about power usage. But you were happy posting just the power consumption graphs :rolleyes:

Even so, the performance difference in that specific application was close to 8%. I don't see the atrocious power consumption that mrmt was talking about.

It's you moving the goalposts to make your statement true. If you want to play that game, we could just as well compare Atoms to Opterons in system power consumption.

AMD's server segment market share speaks for itself. Opterons are terrible.