Discussion: Leading Edge Foundry Node advances (TSMC, Samsung Foundry, Intel)


DisEnchantment (Golden Member, joined Mar 3, 2017):
TSMC's N7 EUV is now in its second year of production and N5 is contributing to revenue for TSMC this quarter. N3 is scheduled for 2022 and I believe they have a good chance to reach that target.

N7 performance is more or less understood.

This year and next year TSMC is mainly increasing capacity to meet demands.

For Samsung, the nodes from 7LPP through 4LPE are basically the same: they just add incremental scaling boosters while the bulk of the tech stays the same.

Samsung is already shipping 7LPP and will ship 6LPP in H2. Hopefully they fix any remaining issues along the way.
They have two more intermediate nodes before going to 3GAE: 5LPE will most likely ship next year, but 4LPE will probably arrive back to back with 3GAA, since 3GAA is being developed in parallel with the 7LPP-derived enhancements.



Samsung's 3GAA will most likely go to HVM in 2022, a similar timeframe to TSMC's N3.
The transistors will be fabricated very differently because of GAA, but on density Samsung will for sure be behind N3.
There might be advantages for Samsung with regard to power and performance, though, so it may be better suited for some applications.
For now we don't know how much of this is true, and we can only rely on the marketing material.

This year there should be a lot more wafers available due to weak demand from smartphone vendors and increased capacity at TSMC and Samsung.
Lots of SoCs that don't need to be top end will be fabbed on N7 or 7LPP/6LPP instead of N5, so there will be plenty of wafers around.

Most current 7nm designs are far from the advertised densities of TSMC and Samsung, so there is still potential for density increases compared to currently shipping products.
N5 is going to be the leading foundry node for the next couple of years.
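To put a rough number on that density gap, here is a back-of-the-envelope sketch in Python; the Navi 10 figures are approximate public numbers, and the ~91 MTr/mm² peak is the commonly quoted high-density value for N7:

```python
# How far a shipping N7 design sits from the advertised peak density.
# Numbers are approximate public figures, not exact measurements.
transistors = 10.3e9        # Navi 10 (RX 5700 XT) transistor count
die_area_mm2 = 251          # Navi 10 die size
advertised_mtr_mm2 = 91.2   # commonly quoted N7 peak (high-density cells)

achieved = transistors / die_area_mm2 / 1e6   # in MTr/mm^2
print(f"Achieved: {achieved:.1f} MTr/mm^2")                       # ~41
print(f"Of advertised peak: {achieved / advertised_mtr_mm2:.0%}")  # ~45%
```

The shortfall is expected: real chips mix high-performance cells, SRAM, analog and I/O that are all far less dense than the marketing peak, which is exactly where the remaining headroom lies.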

For a lot of fabless companies out there, the processes and capacity available are quite good.
 

moinmoin (Diamond Member, joined Jun 1, 2017):
My take is that when it comes to trying new things, TSMC is more conservative and Samsung more aggressive.
To customers it probably looks more like a choice between getting work done within a guaranteed time and what often seems like a pure gamble. A lot of that is likely down to how the two inform the public about future nodes: TSMC is conservative there and usually keeps what (seemingly little) it promises, whereas Samsung likes to point out how much it pushes the envelope all the time (not dissimilar to Intel, actually), while failing to mention that, with everything still in development, it is all in a state of flux.
 

moinmoin (Diamond Member, joined Jun 1, 2017):
Maybe it's just a worthless rumor, but this doesn't instill confidence in Samsung's node progress.
 

DisEnchantment (Golden Member, joined Mar 3, 2017):
I am pretty sure it is plain rubbish.
Rumors about the new Exynos are all over the place. On Reddit I see the same characters plugging their own website with daily rubbish about the Exynos chip, and r/amd seems to lap it up.

Samsung's 4LPE will be used for the new Snapdragon, so the news is suspect to begin with.
 

DrMrLordX (Lifer, joined Apr 27, 2000):
What node does the Exynos 2200 use? 4LPE? I remember the Exynos 2100 and Snapdragon 888 being fairly close to one another in performance (they use the same basic core layout and node), with some minor variations based on cooling. Overall the 2100 ran hotter and throttled more under "normal" circumstances.

It certainly looked like both the 2100 and 888 were "do not buy"s thanks to the limitations of Samsung's 5LPE and heat issues intrinsic to both SoCs. Hopefully the 2200 and 898 will fare better!
 

DisEnchantment (Golden Member, joined Mar 3, 2017):

Machine translation:
Samsung Electronics announced that it will mass-produce the 3nm GAA process in the first half of 2022. Which company will become the first customer has recently become the focus of the South Korean industry. According to industry stakeholders, due to dissatisfaction with TSMC's Apple-first policy, AMD or Qualcomm is likely to become Samsung's first 3nm customer.
Compared with the alliance between TSMC and Apple, Samsung may also establish new alliances with AMD (rendered "Supermicro" by the machine translation) and Qualcomm, one of which may therefore become the first customer of Samsung's 3nm process. As for such industry rumors, Samsung said it could not disclose any news about its 3nm customers.

AMD or Qualcomm rumored to be first to Samsung's 3GAE.
Qualcomm is all but guaranteed to use 3GAE at some point anyway, or whatever is latest from Samsung.
Plus I bet there is discontent at having to wait for the scraps left over after Apple takes its share.
Others are forced to design with tiny dies, a trailing node, worse PPA....
Qualcomm will probably use Intel at some point too, certainly for defence/comms SoCs.

To continue its growth rate, which this year would be an astounding 65%, AMD would need tons of wafers, far more than TSMC could ever provide.
Even if they manage only half that growth next year, they will start making big dents in Intel's revenue.
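As a rough sanity check on those numbers (the FY2020 revenue bases are public figures; the 2021/2022 trajectory is just the scenario sketched above):

```python
# AMD revenue under the growth rates discussed above, vs Intel's base.
# FY2020 figures are public; the 2021/2022 scenario is hypothetical.
amd_fy2020 = 9.76     # $B
intel_fy2020 = 77.9   # $B

amd_2021 = amd_fy2020 * 1.65     # 65% growth -> ~16.1 $B
amd_2022 = amd_2021 * 1.325      # "half that growth" -> ~21.3 $B

print(f"2021: ~${amd_2021:.1f}B, 2022: ~${amd_2022:.1f}B")
print(f"2022 AMD vs Intel FY2020: {amd_2022 / intel_fy2020:.0%}")  # ~27%
```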

While still a remote possibility, AMD seems well positioned, with its chiplet tech, to use different nodes if yields are garbage for big dies.
What could be challenging is the availability of SoIC equivalents at Samsung, since the rest of the interconnect tech seems to be built by AMD within its own supply chain.

That said, TSMC is adding a new N5 fab in Arizona for an additional 20K wpm, but only in 2024.
And they are investing 30B USD next year and in each of the two years after, so capacity should definitely get better.
It would be a loss for TSMC to lose business from AMD given the high-profile wins they have had.
 

Ajay (Lifer, joined Jan 8, 2001):
DisEnchantment said: "While still a remote possibility, AMD seems well positioned, with its chiplet tech, to use different nodes if yields are garbage for big dies. […]"
Monolithic dice (APUs)? Or even RDNA 'chiplets'? Seems too soon for either, though.

DisEnchantment said: "That said, TSMC is adding a new N5 fab in Arizona for an additional 20K wpm, but only in 2024. […]"
I read somewhere that an expansion of an additional 20K wpm was already penciled in for the AZ plant (Phoenix?). Also, N5 in 2024 seems like a poor fit for AMD (except for future IODs).
 

DisEnchantment (Golden Member, joined Mar 3, 2017):
Ajay said: "Monolithic dice (APUs)? Or even RDNA 'chiplets'? Seems too soon for either, though."
Too early to say anything. If TSMC's expansion plans are big enough, N5 and N3 will have good capacity; otherwise Samsung will pick up many customers.
Qualcomm will use Samsung anyway, at least for part of their portfolio, as they have been doing for a while now.

Ajay said: "I read somewhere that an expansion of an additional 20K wpm was already penciled in for the AZ plant (Phoenix?). […]"
Nothing official yet, but it is expected; the N5 EUV machines can be retooled for N3.
 

moinmoin (Diamond Member, joined Jun 1, 2017):

DisEnchantment said: "AMD or Qualcomm rumored to be first to Samsung's 3GAE. […]"
Qualcomm seems open to cooperating with and using every leading-edge foundry, so going for 3GAE is a natural step (if not a given, considering the high likelihood those chips would then also appear in some Samsung Galaxy products).

As for AMD, I honestly doubt AMD's own products are going to come from Samsung's foundry. Where it gets fuzzy is licensed and semi-custom solutions, where I can totally see more RDNA 2 and other AMD IP appearing in chips made at Samsung's foundry.

Edit: Where it gets even fuzzier are the existing collaborations Samsung has with Xilinx (itself pretty close to TSMC as well), like the SmartSSD CSD.
 

DrMrLordX (Lifer, joined Apr 27, 2000):
AMD is getting a custom variant of N5 (which Apple now covets, allegedly). I think they're being "taken care of" better than some of TSMC's other customers - such as Qualcomm. I don't see them jumping to Samsung.
 

Doug S (Platinum Member, joined Feb 8, 2020):
Quoting an earlier post: "Not sure ARM designs can benefit from such frequency-optimized device characteristics with higher leakage (e.g. N5 HPC), it's not like they can clock that high."

They could, but they'd need to be designed for it. Clock rates are limited by architectural decisions like the size/latency of caches, not by whether a design is ARM or x86.
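A first-order illustration of that point (the latencies and cycle budgets below are hypothetical, and real L1s are pipelined, so this only sketches the direction of the trade-off):

```python
# To first order, the clock can only be as fast as the cycle budget
# allotted to the slowest pipeline stage allows, e.g. the L1 access.
# Access times below are hypothetical illustration values.
def max_clock_ghz(l1_access_ns: float, cycles_allotted: int) -> float:
    """Fastest clock at which the L1 still fits its cycle budget."""
    return cycles_allotted / l1_access_ns

# Small, fast L1 vs a much larger, slower one:
print(max_clock_ghz(l1_access_ns=0.8, cycles_allotted=4))  # 5.0 GHz
print(max_clock_ghz(l1_access_ns=1.3, cycles_allotted=4))  # ~3.1 GHz
# The large-L1 design only claws clock back by spending more cycles
# (i.e. a longer load-to-use latency):
print(max_clock_ghz(l1_access_ns=1.3, cycles_allotted=5))  # ~3.8 GHz
```

This is roughly the trade between the small 4-to-5-cycle L1s in high-clocking x86 cores and the much larger, lower-clocked L1s in Apple's designs.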
 

LightningZ71 (Golden Member, joined Mar 10, 2017):
It appears to me that x86 and ARM are going to converge from both ends of the spectrum. x86 seems to be gunning for as much performance as it can wring out of a high-frequency design, and ARM seems to be gunning for maximum performance from a wide design. High frequencies are going to get tougher as density increases, because it becomes more and more difficult to get heat out of the confines of a very tiny core. Gaining IPC by going wider and deeper also has limits: you eventually get to the point where you're guessing right 99% of the time, or speculating right with enough pathways 99% of the time, and can extract only so much parallelism from the code. Eventually, you have to make fast designs wider, and wide designs faster.

Don't they eventually meet in the middle? Doesn't the middle have a wall? Cache cells are failing to scale down well. It takes more and more cache to keep these designs fed enough to meet their theoretical throughput limits. Eventually you start to run out of space to put cache, or it gets so large that it takes significant latency hits, or maintaining the width needed to keep prefetching from the higher levels happy becomes impossible.

It just looks like the wall is suddenly very close.
 

moinmoin (Diamond Member, joined Jun 1, 2017):
Quoting an earlier post: "They could always move die shrinks of the console chips to Samsung. Frees up capacity at TSMC for other things."
Who is "they"? Sony and Microsoft need to be interested in such a move for it to happen. And as with all (actually) distinct nodes, you first need to port the design, create new masks, and do the whole validation dance. That's very costly. I expect Sony and Microsoft want to stay on N7 for the long haul (the plan likely was to stay put while everybody else moves on), with maybe a possible move to the partly compatible N6 down the road.

LightningZ71 said: "It appears to me that x86 and ARM are going to converge from both ends of the spectrum. […]"
I personally feel they are already closer than many people imagine. The big difference is ARM's focus on efficiency versus x86's focus on high frequency.

The latter as a solitary focus now really only happens in Intel chips, with frequencies and wattage pushed even further beyond the sane efficiency inflection point than ever before. With Zen, on the other hand, AMD already mixes both approaches: it starts the design from an efficiency point using density-focused libraries, but still (literally) leaves enough room in the layout to allow for high frequencies. (To be frank, AMD surprised me with the frequencies they have seemingly effortlessly reached on TSMC nodes so far. I expected them to have more trouble and to need more iterations to achieve what they did.)

But both companies are still a long way from reaching the efficiency of ARM designs. AMD appears to be slowly getting there with its APU iterations, with the IODs likely to catch up, and Zen 4c likely to speed things up some more. Intel, on the other hand, feels all over the place at the moment. Its uncore had long been the benchmark within x86 for idle power consumption, but progress in that area has stalled for an absurdly long time now and is repeatedly undone by the push for high frequencies.

I think the real wall there is the kind of mass market that smartphones offer to manufacturers of ARM SoCs: in that market, efficiency is a fundamental necessity nobody can do without. For x86, which has no access to that huge market, efficiency is relegated to more of a nice-to-have feature that can easily be skipped when push comes to shove (Intel being an excellent showcase of that over the last half decade).
 

NTMBK (Lifer, joined Nov 14, 2011):
Who is "they"? Sony and Microsoft need to be interested in such a move for it to happen. And like with all (actually) distinct nodes you first need to port the design, create new masks and do all the validation dance. That's very costly. I expect Sony and Microsoft wanting to stay on N7 over the long haul (the plan likely was to stay while everybody else moves on), with maybe a possible move to the partly compatible N6 down the road.

The PS5 and XSX have some serious cooling systems. The PS5 in particular is an absolute behemoth. A die shrink would let them reduce costs on cooling, power supply, case, packaging and shipping.
 

moinmoin (Diamond Member, joined Jun 1, 2017):
NTMBK said: "The PS5 and XSX have some serious cooling systems. […] A die shrink would let them reduce costs on cooling, power supply, case, packaging and shipping."
PS5 and XSX are commodity products with thin margins (if any), sold with the goal of selling software afterward. N7 is already an astonishingly current node for such products. The previous gen that did get node shrinks stopped at 16FF, and only for the upgraded versions. I highly doubt Sony and Microsoft are interested in paying the high upfront cost (which only increases with each newer node) just to move current products off a current node, for which the return on investment likely doesn't look good yet. As with the past gen, there is a chance both will make an upgraded version that uses a newer node from the start. But that's still some time off.
 

NTMBK (Lifer, joined Nov 14, 2011):
moinmoin said: "PS5 and XSX are commodity products with thin margins (if any), sold with the goal of selling software afterward. […]"

It wasn't just for upgraded consoles: the PS4 "slim" redesign has a 16nm chip, as does the Xbox One S. And the cost of porting the IP blocks to a smaller process can be shared by reusing those blocks for an upgraded PS5 Pro, the same way they did with the PS4.

And don't forget about cloud streaming. Microsoft is now running its console chips in its cloud servers, where power consumption and cooling will be the main running cost. A more power-efficient die shrink would massively improve the profit margin on streaming.
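Rough math on that running-cost point (every figure below is a hypothetical placeholder, not Azure data):

```python
# Annual electricity saving per cloud console blade after a die shrink.
# Wattage delta, electricity price and PUE are hypothetical placeholders.
watts_saved = 60            # e.g. board power dropping ~200 W -> ~140 W
hours_per_year = 24 * 365
price_per_kwh = 0.08        # $ per kWh, a typical bulk datacenter rate
pue = 1.3                   # datacenter overhead (cooling etc.)

saving = watts_saved / 1000 * hours_per_year * price_per_kwh * pue
print(f"~${saving:.0f} per blade per year")  # ~$55
```

Modest per blade, but multiplied across a fleet and a multi-year deployment it adds up, on top of the rack-density win from lower heat output.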
 

jpiniero (Lifer, joined Oct 1, 2010):
NTMBK said: "It wasn't just for upgraded consoles: the PS4 "slim" redesign has a 16nm chip, as does the Xbox One S. […]"

The design and wafer costs are getting to the point where it's not going to be worth it to do a straight shrink.
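A sketch of that break-even arithmetic (the NRE, wafer prices and shrink factor are all hypothetical placeholders; real figures are closely guarded):

```python
# First-order economics of a straight die shrink.
# All dollar figures and the 0.6x shrink factor are hypothetical.
import math

def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300) -> float:
    """Classic approximation accounting for edge loss."""
    r = wafer_diameter_mm / 2
    return (math.pi * r**2 / die_area_mm2
            - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

nre = 100e6                                   # port + masks + validation (hypothetical)
cost_old = 9000 / dies_per_wafer(360)         # console-class die, old node
cost_new = 16000 / dies_per_wafer(360 * 0.6)  # pricier wafers, smaller die

saving = cost_old - cost_new
if saving > 0:
    print(f"Break-even after {nre / saving / 1e6:.0f}M units")
else:
    print("Per-die cost went up: the shrink never pays back its NRE")
```

With these placeholder prices the shrunk die actually comes out slightly more expensive per unit, so the NRE never pays back; the math only flips if the newer node's wafer price falls or the shrink buys something else (power, clocks, rack density) worth paying for.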
 

Cardyak (Member, joined Sep 12, 2018):
LightningZ71 said: "It appears to me that x86 and ARM are going to converge from both ends of the spectrum. […]"

You're correct with regard to clock speeds and thermal density: we've been nudging up against a soft cap around 4-5 GHz for a while now, and in the short term this is not going to be resolved.

However, that is exactly why designers now need to go all-in on IPC. Apple has proven that it's better to have a smart design with lower clocks than the other way around. Clock speeds are stagnant for the foreseeable future; there is only one way to go.

As for going wider and deeper to extract more IPC having its limits..... I disagree. Yes, it will of course have limitations, but current designs are a long way from that being an issue.

A few points to note:

You stated "...as you eventually get to the point where you're guessing right 99% of the time, or speculating right with enough pathways 99% of the time". The classic mistake here is that, because 99% accuracy in branch predictors or cache hit rates looks perfect, it creates the illusion that there is little improvement left to be gained. That is incorrect: instead of looking at the success rate, you should look at the failure rate.

For example, take Branch Predictor A and say it has a prediction accuracy of 99%, so a mispredict rate of ~1%; now compare it to Branch Predictor B, which has a prediction accuracy of 99.8%, so a mispredict rate of ~0.2%. The accuracy has only increased from 99% to 99.8%, but the mispredict rate has fallen from 1% to 0.2%: a five-fold reduction in mispredictions! Just because we are reaching very high numbers for prediction accuracy and cache hit rates does not mean we are nearing the end of this as a lever of IPC growth; there is still a ton of improvement to be had. See this paper by Intel for more details: https://arxiv.org/pdf/1906.08170.pdf
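The same failure-rate framing in code (branch mix and flush penalty are typical textbook values, not measurements of any real core):

```python
# Why 99% -> 99.8% accuracy matters: count flush cycles, not hit rate.
# Branch density and flush penalty are typical textbook values.
branches_per_ki = 200   # ~20% of instructions are branches
flush_penalty = 15      # cycles lost per mispredict

def flush_cycles_per_ki(accuracy: float) -> float:
    return branches_per_ki * (1 - accuracy) * flush_penalty

a = flush_cycles_per_ki(0.99)    # Predictor A: 2.0 mispredicts/KI
b = flush_cycles_per_ki(0.998)   # Predictor B: 0.4 mispredicts/KI
print(f"A: {a:.0f} flush cycles/KI, B: {b:.0f} -> {a / b:.0f}x fewer")
```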

Your other point, "...and can extract only so much parallelism from the code", is also pessimistic. Most current high-performance designs execute between 4 and 5 instructions per clock cycle, and limit studies have repeatedly shown that a large enough OoO window, coupled with an accurate enough branch predictor and large caches, can push IPC into the mid-to-high 20s. And that is while still being limited by true data dependencies. It is possible to break those data dependencies to expose even more parallelism and push IPC higher still. Two such ideas that have been explored in the literature for a long time are Value Prediction and Selective Replay.
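A toy version of such a limit study (deliberately idealized: perfect branch prediction, an unbounded window, unit latencies, and only true data dependencies constraining issue; the trace itself is made up):

```python
# Dataflow limit study on a toy trace: with a perfect front end and an
# unbounded OoO window, IPC is bounded only by the dependence chains.
# Each entry lists the indices of the instructions it truly depends on.
trace = [
    [],      # 0: load A
    [],      # 1: load B
    [0, 1],  # 2: add  A + B
    [],      # 3: load C
    [2, 3],  # 4: mul
    [],      # 5: independent load
    [5],     # 6: dependent add
    [4, 6],  # 7: final combine
]

issue_cycle = []
for deps in trace:
    # Each instruction issues one cycle after its last producer.
    issue_cycle.append(max((issue_cycle[d] for d in deps), default=-1) + 1)

cycles = max(issue_cycle) + 1
print(f"IPC limit: {len(trace) / cycles:.2f}")  # 8 instructions / 4 cycles = 2.00
```

Value prediction attacks the `max()` terms directly: if a consumer can correctly guess its producer's result, that dependence edge disappears and the critical path shortens.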

So, in conclusion: clock speeds are cut off as an avenue of performance improvement for the near to medium term, but IPC has the potential to keep increasing for a long time.
 

DisEnchantment (Golden Member, joined Mar 3, 2017):
Quoting an earlier post: "That's assuming AMD is chasing high leakage. Which isn't necessarily the case."
That is assuming all device characteristics come for free. Which isn't necessarily the case.
You get all the efficiency and density and performance and superb standby power at once, you're saying?
TSMC says leakage is higher for N5 HPC.
 

DrMrLordX (Lifer, joined Apr 27, 2000):
DisEnchantment said: "That is assuming all device characteristics come for free. Which isn't necessarily the case. […] TSMC says leakage is higher for N5 HPC."

AMD isn't using the HPC libraries on N7. What makes you think they'll switch to them for N5?