Why isn't AMD on TSMC?

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
Mod note do not delete:

This was forked off a discussion in the AMD subforum.

AT Moderator ElFenix



Not meaning to derail the thread here, but why couldn't AMD get hold of TSMC for this FinFET generation? Nearly all of NVIDIA's performance gains are due to improvements in silicon, not architecture; Pascal is basically the same as Maxwell.
 
Last edited by a moderator:

sirmo

Golden Member
Oct 10, 2011
1,014
391
136
Not meaning to derail the thread here, but why couldn't AMD get hold of TSMC for this FinFET generation? Nearly all of NVIDIA's performance gains are due to improvements in silicon, not architecture; Pascal is basically the same as Maxwell.
My guess is IP reuse. They want to use the same graphics IP in APUs and SoCs on 14nm, because that's what Zen will be on. And standardizing on 14nm makes sense given the GloFo WSA. The fact that they amended the agreement to allow dual sourcing from both GloFo and Samsung further points to that strategy.

Tapeouts aren't cheap on FinFET, and porting designs to a different process requires man-hours as well.

Also, I'm not so sure higher clocks aren't achievable on 14nm. If we look at Ryzen, it's having no issues with clocks so far. I think it's all in the architectural design and the use of high-density libraries versus aiming for high fmax. Polaris also gained about 30% fmax compared to Tonga, which is not too dissimilar to the gains Nvidia got with Pascal by switching to FinFET.
 
Last edited:

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
Then it seems in retrospect that Polaris was a stop-gap to test the waters at 14nm. Though I've heard Lisa Su claim that the Polaris-to-Vega cadence would be much shorter than anything AMD has attempted before, it makes much more sense to get a thorough feel for the node before dropping the big architectural changes.

The problem with this is that it builds unnecessary hype, which has already started with the 4K Doom benchmarks. Polaris delivered exactly what AMD had planned, but the decreased traction it got after launch, as evidenced by RX 480/470 sales, is due more to the hype surrounding it than to any actual shortcomings in performance.
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
Not quite a stop-gap. Mostly driven by the console refresh contract(s) - you have to bear in mind that AMD have been quite constrained for a while now in which desktop SKUs they do versus what would be optimal. In retrospect it probably dates back to the profound weirdness we saw with Tonga.

As for what they planned, goodness knows, and it's silly to speculate. Getting the mid-term console refresh contracts probably counts as a pass mark, though. For more than that they'd have needed either a chunk more absolute performance or some more perf/watt (for notebooks).
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Then it seems in retrospect that Polaris was a stop-gap to test the waters at 14nm. Though I've heard Lisa Su claim that the Polaris-to-Vega cadence would be much shorter than anything AMD has attempted before, it makes much more sense to get a thorough feel for the node before dropping the big architectural changes.

The problem with this is that it builds unnecessary hype, which has already started with the 4K Doom benchmarks. Polaris delivered exactly what AMD had planned, but the decreased traction it got after launch, as evidenced by RX 480/470 sales, is due more to the hype surrounding it than to any actual shortcomings in performance.
They didn't meet their 2.8x efficiency claims, so they didn't deliver EXACTLY what they promised. If they had, the RX 480 would be drawing around 100W.
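Quick back-of-envelope on that 100W figure (the baseline numbers here are my assumptions for illustration, not anything AMD published alongside the claim):

```python
# Rough perf/W sanity check. Assumed figures:
# - RX 480 roughly matches an R9 390 in performance
# - R9 390 typical board power: ~275 W
# - AMD's claimed generational perf/W gain: 2.8x
r9_390_power_w = 275.0
claimed_perf_per_watt_gain = 2.8

# Same performance at 2.8x the perf/W means 1/2.8 of the power.
implied_rx480_power_w = r9_390_power_w / claimed_perf_per_watt_gain
print(round(implied_rx480_power_w))  # ~98 W
```

Against the ~165W the reference RX 480 actually draws, that's where the "should be around 100W" point comes from.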
 

NTMBK

Lifer
Nov 14, 2011
10,462
5,847
136
Also, I'm not so sure higher clocks aren't achievable on 14nm. If we look at Ryzen, it's having no issues with clocks so far. I think it's all in the architectural design and the use of high-density libraries versus aiming for high fmax. Polaris also gained about 30% fmax compared to Tonga, which is not too dissimilar to the gains Nvidia got with Pascal by switching to FinFET.

We do have NVidia GPUs as a data point. The 1050/1050 Ti are manufactured on Samsung's 14nm process and do not clock as high as the rest of the line, which is manufactured at TSMC.
 

ariknowsbest

Junior Member
Jun 20, 2016
12
0
6
Wasn't TSMC 16nm used for the Xbox One S and PS4 Slim/Pro as well? Overall TSMC's process is likely better because of their experience, but with limited resources it's better to avoid WSA payments.
 

sirmo

Golden Member
Oct 10, 2011
1,014
391
136
We do have NVidia GPUs as a data point. The 1050/1050 Ti are manufactured on Samsung's 14nm process and do not clock as high as the rest of the line, which is manufactured at TSMC.
They still clock quite high, though:

https://i.imgur.com/I9Lu9tP.png

https://www.computerbase.de/2016-10/geforce-gtx-1050-ti-test/3/

Those are stock clocks for the respective cards. We're talking only about a 100 MHz difference (less than 10% compared to the 16nm 1060). Nvidia may also have opted for high-density libraries on this process since these are budget cards, yet Pascal is still achieving much higher clocks overall, which is mainly down to architecture, not the 14nm-vs-16nm process difference.

The process itself might play a part, but a very minor one, imo.

This 1050 Ti overclocks to 1900 MHz:



So even if Nvidia had used similar high-fmax libraries for their 14nm GPUs, the difference would still be only about 10-15% between TSMC and Samsung/GloFo. Most of the clock difference comes down to the actual architectural differences.
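For what it's worth, the gap in the stock clocks above works out even smaller than that 10-15% ceiling. The figures below are illustrative placeholders in the ballpark of the boost clocks being discussed, not measured values:

```python
# Illustrative only: placeholder boost clocks, roughly 100 MHz apart
# as in the ComputerBase comparison discussed above.
clock_14nm_mhz = 1750.0  # e.g. a 14nm 1050 Ti
clock_16nm_mhz = 1850.0  # e.g. a 16nm 1060

# Relative clock deficit of the 14nm part vs the 16nm part.
gap_pct = (clock_16nm_mhz - clock_14nm_mhz) / clock_14nm_mhz * 100
print(f"{gap_pct:.1f}% clock deficit on 14nm")  # ~5.7% for a 100 MHz gap
```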
 
Last edited:

lopri

Elite Member
Jul 27, 2002
13,314
690
126
Can you imagine how many times in the past ATI and NVIDIA took credit for "innovations" that in reality were hard work done by TSMC?

P.S. To the OP's query: Maybe Samsung is cheaper than TSMC?
 

NTMBK

Lifer
Nov 14, 2011
10,462
5,847
136
So even if Nvidia had used similar high-fmax libraries for their 14nm GPUs, the difference would still be only about 10-15% between TSMC and Samsung/GloFo. Most of the clock difference comes down to the actual architectural differences.

Yeah, 10-15% sounds about right. Enough to make a noticeable difference, but not enough to put GloFo/Samsung out of the running.
 

iBoMbY

Member
Nov 23, 2016
175
103
86
Despite beliefs to the contrary, AMD is still bound by the WSA; the 14LPP process is actually pretty good, and GloFo and AMD still have a close relationship (not only dictated by ownership). If NVidia produced their Pascal chips on 14LPP, they might actually clock higher than on TSMC's 16nm. If AMD offloads anything, the majority will go to Samsung. Some console APUs are produced at TSMC, but that is probably only because the customers wanted it. And the costs of porting a design from 14nm to 16nm have already been mentioned.

Also, Navi and Zen+ are going to use 7nm FinFET from GloFo/IBM, which will diverge even further from TSMC's processes than 14nm does from 16nm, and the majority of AMD products will use it. That doesn't mean there won't be the odd 10nm/7nm product made at TSMC or Samsung, but it's getting less likely going forward.
 

gorobei

Diamond Member
Jan 7, 2007
4,030
1,528
136
Despite beliefs to the contrary, AMD is still bound by the WSA, ......

AMD is absolutely bound by the WSA.
http://www.anandtech.com/show/10631/amd-amends-globalfoundries-wafer-supply-agreement-through-2020
They have more flexibility to use Samsung, but it will still cost them. It's better to fab on GF's 14nm borrowed from Samsung than to pay twice by going to TSMC. They still gain experience and know-how on TSMC's process through the console SoCs, so they can be ready when the WSA expires and they aren't saddled with GF.

How many times do you think Samsung will share a process generation with GF? Once Samsung has enough capacity to meet an Apple order on their own, there is no need to share with GF. GF is unlikely to get any better at not being way behind on the next gen at 7-10nm. At best they can hope that Samsung buys them out.
 

SpaceBeer

Senior member
Apr 2, 2016
307
100
116
As far as I know, until the latest amendment of the WSA, AMD had to use GF for its CPUs but didn't have to use GF's fabs for GPUs. Now they have agreed to buy a certain number of wafers per year, regardless of product type (CPU/APU or GPU). So if someone (TSMC, Samsung) has a better process, AMD can simply put all the "less important" products (low-end, mid-range) at GF and the more important (high-end) ones at other fabs.

But of course it's not just about raw performance (power, clocks); price is also important, especially in the consumer segment.
 

iBoMbY

Member
Nov 23, 2016
175
103
86
AMD is still required to buy all its CPUs and APUs, and certain amounts of its GPUs, from GloFo. All exceptions have to be requested by AMD and agreed to by GloFo. As long as GloFo can deliver the needed wafers, there is no reason for AMD to go elsewhere. 14LPP is good. And as I said, it will stay this way:

“Leading-edge technologies like GLOBALFOUNDRIES 7nm FinFET are an important part of how we deliver our long-term roadmap of computing and graphics products that are capable of powering the next generation of computing experiences,” said Dr. Lisa Su, president and CEO, AMD. “We look forward to continuing our close collaboration with GLOBALFOUNDRIES as they extend the solid execution and technology foundation they are building at 14nm to deploy high-performance, low-power 7nm technology in the coming years.”
 

tential

Diamond Member
May 13, 2008
7,348
642
121
How do we know Nvidia got all of their performance gains from process improvements by TSMC?
I'm seriously confused by this part of the premise of the thread.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
They didn't meet their 2.8x efficiency claims, so they didn't deliver EXACTLY what they promised. If they had, the RX 480 would be drawing around 100W.
Well, if you're allowed some technical leeway, then embedded Polaris does deliver a 95W TDP. If your RX 480 is a good sample, you can probably undervolt it and maintain a 100-110W power draw.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
How do we know Nvidia got all of their performance gains from process improvements by TSMC?
I'm seriously confused by this part of the premise of the thread.
Because a heavily overclocked 980 Ti can get to 90% of a 1080 FE at stock clocks? Because tests like these
https://www.youtube.com/watch?v=nDaekpMBYUA
show that there is strong reason to believe that process improvements are largely responsible for the improvements in performance? Because AnandTech's own deep dive into Pascal said, I quote:

"In a by-the-numbers comparison then, Pascal does not bring any notable changes in throughput relative to Maxwell. CUDA cores, texture units, PolyMorph Engines, Raster Engines, and ROPs all have identical theoretical throughput-per-clock as compared to Maxwell. So on a clock-for-clock, unit-for-unit basis, Pascal is not any faster on paper. And while NVIDIA does not disclose the size/speed of most of their internal datapaths, so far I haven’t seen anything to suggest that these have radically changed. This continuity means that outside of its new features, GP104 behaves a lot like GM204."

http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/4
 

IllogicalGlory

Senior member
Mar 8, 2013
934
346
136
TSMC's 16FF+ (FinFET Plus) technology can provide above 65 percent higher speed, around 2 times the density, or 70 percent less power than its 28HPM technology. Comparing with 20SoC technology, 16FF+ provides extra 40% higher speed and 60% power saving. By leveraging the experience of 20SoC technology, TSMC 16FF+ shares the same metal backend process in order to quickly improve yield and demonstrate process maturity for time-to-market value.
I think Pascal uses basic 16FF, so it's only around 50-60% faster at the same power rather than above 65%, but the numbers fall right in line, really.

That being said, most of the benefits on AMD's side seem to be process-based as well. They did add some arch features like the primitive discard accelerator, clock gating, and color compression (which Pascal also has), but they didn't seem to close the gap relative to Maxwell in shader efficiency or power efficiency; they seem (based on TPU data) to still be about as far behind Pascal in that regard as the GCN 1.1/1.2 products were behind Maxwell. Perhaps if they were on the same process it would look more favorable and the arch features would be gaining ground for them.

I think AMD could have taken a similar route with clock frequencies as Pascal did, with a 40% boost in clock speed but only about a 20% increase in ALUs/mm^2; instead they went all-in on more CUs. Vega seems much more similar to Pascal in that respect, with as many shaders as the 610mm^2 Fiji at around 470-480mm^2, but with the 40-50% clock speed increase that we see with Pascal (relative to Fiji/Hawaii, etc.).
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Well, if you're allowed some technical leeway, then embedded Polaris does deliver a 95W TDP. If your RX 480 is a good sample, you can probably undervolt it and maintain a 100-110W power draw.
Right, but obviously when such claims are made, you expect them to apply to the card they're advertising.
Polaris' very high variability in power draw from sample to sample is quite odd; I've never seen a GPU behave like that. It's not uncommon for fmax to vary widely, but power draw?
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
Right, but obviously when such claims are made, you expect them to apply to the card they're advertising.
Polaris' very high variability in power draw from sample to sample is quite odd; I've never seen a GPU behave like that. It's not uncommon for fmax to vary widely, but power draw?
When GloFo moved to 32nm, there were reports that the FX-8150, the first-gen Bulldozer, had trouble maintaining its advertised clock frequency on some motherboards. So it might be something peculiar to GloFo when it transitions to a new node.