Question What does TSMC physically produce for AMD? Do they send them wafers or...?

Hulk

Diamond Member
Oct 9, 1999
4,191
1,975
136
Or is it a product closer in production to a functioning CPU?
Who does validation testing? Both, TSMC for defects and AMD for binning?
I've been curious about this. I assume a lot of it isn't public knowledge.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
So far as I understand it:

AMD provides the masks to TSMC to etch into wafers, according to TSMC's design rules. TSMC etches the wafers and cuts them into dice, and then supplies the dice to AMD at their packaging and testing facilities. AMD tests the dice and packages them according to their binning guidelines.

Not sure if TSMC does any defect testing, but if I had to guess, I'd say "no".
 

Hulk

Diamond Member
Oct 9, 1999
4,191
1,975
136
Interesting. I'm wondering how the deal between TSMC and AMD is structured. TSMC must do some testing of its product to know what it is delivering, or to refine the process? Otherwise TSMC could just deliver a lot of product with a high defect rate, that is, if the deal is structured to simply deliver wafers or dice.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
Hulk said:
Interesting. I'm wondering how the deal between TSMC and AMD is structured. TSMC must do some testing of its product to know what it is delivering, or to refine the process? Otherwise TSMC could just deliver a lot of product with a high defect rate, that is, if the deal is structured to simply deliver wafers or dice.

Customers would probably do some test runs before moving into engineering samples and then qualification samples. Only after that would they do commercial production runs. It also depends on the process.

Also, I'd like to amend what I said previously. Read some of this:


Not sure how that pertains to multi-customer foundries, since AMD may not want to do all their die testing in a TSMC fab (and TSMC may not want them to do all their testing/binning there either).
 

Saylick

Diamond Member
Sep 10, 2012
3,084
6,184
136
Ehh, might as well throw my arm-chair semiconductor engineer hat into the ring as well.

As far as I understand it, TSMC (or any pure-play foundry, for that matter) gives prospective clients its process design kit, or PDK, which is essentially the set of ground rules one must follow when designing a chip on a TSMC process. The PDK also contains parameters that can be used to simulate and model how a potential product might perform on that process. These parameters are the actual nitty-gritty that AMD or Nvidia plug into their internal performance models to emulate a given chip. For example, Nvidia owns a few Cadence hardware emulators that can be programmed to behave exactly like a theoretical prototype chip. The transistor switching rate, voltage-frequency curves, etc. are determined by what TSMC gives them.

If the emulation results look favorable and you feel confident that you're pretty much good to go, you can move on to the next step: tape-out, which is when you start making masks and then prototype chips. This is basically when you commission TSMC to build you a few chips just to see if your chip design actually works and, more importantly, if your simulations were accurate. You will work with TSMC to identify where issues are popping up that are killing the functionality of the chip, where to improve yields, and where the chip can be made better. Again using Nvidia as an example, they have in-house labs with electron microscopes where chip analyses can be conducted to determine where prototype chips are having issues.

As more errors are found and fixed, each major version of the die layout is introduced as a "stepping" or "spin", starting from a prototype or initial stepping (e.g. "A0 stepping"); when you have a revision that's ready to be mass produced, you have a "final stepping" (e.g. "B1 stepping"). Since you are in direct contact with TSMC throughout the process and you have actual prototype dies, you will generally have a good idea of where the yield is for your product.

So, if the chip design looks good and the yield looks good, you pretty much pull the trigger and request the number of wafers you'd like to buy from TSMC, and TSMC will give you a timeline for how fast they can make your run of chips. Orders are typically on the order of tens of thousands of wafers per month for CPUs and GPUs, and for mobile phone chips they can exceed a hundred thousand wafers per month. TSMC pretty much charges you a price per wafer, with a bigger discount if you place a larger order.

As for validation and binning, I would imagine that's NOT TSMC's responsibility. I imagine they'll help you sort out yield issues, but they won't go through each and every chip for you to determine which ones are good and which ones are bad. TSMC simply makes them, and you get what you get.

EDIT: Fixed some minor typos.
 

moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
Regarding testing for defects: Zen chips have an extensive POST (power-on self-test) that is very likely used to weed out dies that are not salvageable in any way at the earliest stage. For the chiplets, though, the required controller is on the I/O die, which makes this approach more complicated. So I guess the defect testing is done by AMD as well.
 

Hitman928

Diamond Member
Apr 15, 2012
5,182
7,632
136
DrMrLordX said:
So far as I understand it:

AMD provides the masks to TSMC to etch into wafers, according to TSMC's design rules. TSMC etches the wafers and cuts them into dice, and then supplies the dice to AMD at their packaging and testing facilities.

TSMC will produce the mask set, but AMD will have ownership of the masks. Basically, the cost structure is that the first wafer you order costs the most because it includes the mask set price, and every order after that drops to just the per-wafer cost.
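To put the cost structure above in concrete terms, here is a minimal sketch of how a one-time mask set charge gets spread over an order. The function name and the dollar figures are made-up placeholders for illustration, not real TSMC pricing, and the model ignores volume discounts entirely.

```python
def average_cost_per_wafer(mask_set_cost: float, wafer_cost: float, wafers: int) -> float:
    """Average cost per wafer when a one-time mask set cost is spread over `wafers`."""
    if wafers < 1:
        raise ValueError("wafers must be at least 1")
    # Every wafer pays the flat per-wafer price; the mask set is paid once.
    return wafer_cost + mask_set_cost / wafers


# Hypothetical placeholder figures, chosen only to show the trend.
MASK_SET_COST = 1_000_000.0
PER_WAFER_COST = 10_000.0

for count in (1, 100, 10_000):
    print(count, average_cost_per_wafer(MASK_SET_COST, PER_WAFER_COST, count))
```

With these placeholder inputs, the first wafer carries the whole mask set, while at high volume the average approaches the bare per-wafer price.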

Hulk said:
Interesting. I'm wondering how the deal between TSMC and AMD is structured. TSMC must do some testing of its product to know what it is delivering, or to refine the process? Otherwise TSMC could just deliver a lot of product with a high defect rate, that is, if the deal is structured to simply deliver wafers or dice.

A foundry will have done their own wafer runs (over and over again) and characterized both the wafer gradient as well as the defect density before ever opening up the process to customers. That information will also be made available to customers in one form or another. Additionally, foundries will pull wafers being produced (from time to time) to do quality checks to make sure things are running as expected.
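As a rough illustration of why a characterized defect density is useful to a customer, the sketch below uses the textbook Poisson yield model, Y = exp(-D * A), together with a common dies-per-wafer approximation. This is an assumption for illustration only: nothing in this thread says which yield model TSMC or AMD actually uses, and the wafer size, die area, and defect density are placeholder inputs.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Common approximation for whole dies on a round wafer (ignores scribe lines)."""
    d = wafer_diameter_mm
    area_term = math.pi * (d / 2) ** 2 / die_area_mm2
    edge_loss = math.pi * d / math.sqrt(2 * die_area_mm2)
    return int(area_term - edge_loss)


def poisson_yield(defect_density_per_cm2: float, die_area_mm2: float) -> float:
    """Fraction of dies expected to have zero defects under a Poisson model."""
    die_area_cm2 = die_area_mm2 / 100.0
    return math.exp(-defect_density_per_cm2 * die_area_cm2)


# Placeholder inputs: 300 mm wafer, 80 mm^2 die, 0.1 defects per cm^2.
dies = gross_dies_per_wafer(300.0, 80.0)
good_fraction = poisson_yield(0.1, 80.0)
print(dies, round(dies * good_fraction))
```

The point of the model is the trend rather than the exact numbers: for a fixed defect density, larger dies yield worse.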

Saylick said:
Ehh, might as well throw my arm-chair semiconductor engineer hat into the ring as well. [...]

This is pretty good, just a couple of notes. A PDK can contain many things, but there will always be models for simulation as well as documentation for model limits, layout guidelines and rules, EM and thermal info, etc. Which types of models are included depends on the process and the tool set being used. A digital standard cell library is common, though GF requires you to work with a third party for this. The standard cell libraries can also come in multiple flavors. Digital design is done largely at the HDL (hardware description language) level, but in the end you need to actually lay out your design, so you'll have gate-level models for standard cells and FET-level models for developing custom cells. Then you'll have options for P&R (place and route) or custom routing, depending on the balance of speed, density, and man-hours you are trying to achieve.

When you run simulations, the variability in wafer quality is taken into account, and the designers run 'corner' simulations to check the design against possible performance variability across the wafer for both PMOS and NMOS quality. That's actually the easier part; the harder part is layout-dependent effects. Depending on how the circuit is 'drawn' or laid out, it can change how fast the circuit is able to run or how much current it needs, and bad layout can even cause critical errors like latch-up conditions. Anyway, not trying to get too far into the weeds with this, but yes, first-pass success with giant processors like this doesn't exist anymore, and it usually takes a few iterations of small batches before everything is worked out and a full production run is ordered.

As a note, there is no rule or consensus for how to count a stepping; every company has its own way of doing this. I think the most common generic formula is that a change requiring FEOL work (the FETs themselves, up to maybe the first interconnect layer) advances the leading letter (A, B, C, etc.), while a change in only the BEOL (the upper interconnect metal layers) advances the trailing number (A1, A2, A3, etc.). Again, not a rule, just a general idea that people seem to like. A BEOL change is preferable, as it is less complicated and cheaper to fix.
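The generic stepping formula described above can be sketched in a few lines. This is only an illustration of that informal convention, not any company's actual scheme. The function name is hypothetical, and resetting the number to 0 after a FEOL change is an assumption (borrowed from the "A0" initial stepping mentioned earlier in the thread).

```python
def next_stepping(current: str, feol_changed: bool) -> str:
    """Return the next stepping name under the informal letter/number convention."""
    letter, number = current[0], current[1:]
    if not ("A" <= letter <= "Z") or not number.isdigit():
        raise ValueError(f"expected a name like 'A0', got {current!r}")
    if feol_changed:
        if letter == "Z":
            raise ValueError("no letter after 'Z'")
        # FEOL change: advance the letter; restarting at 0 is an assumption.
        return f"{chr(ord(letter) + 1)}0"
    # BEOL-only change: keep the letter and advance the number.
    return f"{letter}{int(number) + 1}"


stepping = "A0"
for feol_changed in (False, True, False):
    stepping = next_stepping(stepping, feol_changed)
    print(stepping)
```

Starting from A0, a metal-only fix, then a base-layer respin, then another metal-only fix walks through A1, B0, and B1 under these assumptions.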

TSMC will deliver the bare die to the customer, who is then responsible for assembling, packaging, and testing them (or hiring someone else to do it). A certain number of dies will have defects or fall below usable performance. Simulations will have been run based on the TSMC models and documentation for yield, so if something is seriously off in your yield rate, the customer will work with TSMC to determine where the fault lies, and refunds or free wafer runs will probably be given to the customer if it was a foundry fault. This is pretty rare, but it does happen.

As a side note, TSMC does actually have an assembly and packaging service, but I'm pretty sure it's only for small orders; they don't have the resources, or probably the desire, to take on large packaging orders for the big players.
 

joesiv

Member
Mar 21, 2019
75
24
41
This is an interesting topic. So I guess AMD will get wafers pre-cut from TSMC, packaged in some way so that AMD can process, test, and validate them and then place them on the CPU package along with the requisite chiplets. I'd love to see how they package the raw dies for transport to AMD; they'd be small and very fragile, I would assume...
 

Hitman928

Diamond Member
Apr 15, 2012
5,182
7,632
136
joesiv said:
This is an interesting topic. So I guess AMD will get wafers pre-cut from TSMC, packaged in some way so that AMD can process, test, and validate them and then place them on the CPU package along with the requisite chiplets. I'd love to see how they package the raw dies for transport to AMD; they'd be small and very fragile, I would assume...

I don't know about the super high volume products but you can go to https://www.gelpak.com/ to see a common solution for transporting bare die, and yes they are very fragile.
 

Arkaign

Lifer
Oct 27, 2006
20,736
1,377
126
Production of these modern ICs is so incredibly sensitive that vibration from the earth and ambient radiation are issues that have to be accounted for. Go much smaller and you might even be dealing with quantum oddities on a wider scale, potential glitches due to neutrinos, whatever.
 