Foundries and multiple die manufacturing

Status
Not open for further replies.

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Considering the vast amount of complexity required to manufacture a semiconductor circuit, how do foundries like TSMC deal with different orders from companies and vastly different die sizes? Do they have an "assembly line" for each company, or even each die, or is it relatively easy to transition from one to another?

How would "fusing" circuits off work? Or, a better question would be, if there are multiple variants of a common circuit, how are those managed? Say a GPU with 200SPs vs a GPU with 400SPs. Rather than disabling 200SPs, there are parts that are separate dies. Wouldn't it be much simpler to make a single die and disable them instead?

I guess the benefits can be tremendous if they can manage having multiple dies rather than fusing them off.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Considering the vast amount of complexity required to manufacture a semiconductor circuit, how do foundries like TSMC deal with different orders from companies and vastly different die sizes? Do they have an "assembly line" for each company, or even each die, or is it relatively easy to transition from one to another?

Each customer has their own design. A design is like the blueprints to a house. The blueprints dictate the layout of the rooms, how many floors, whether or not there will be a basement, a back porch, where the garage is located, etc.

What the house is made of - be it bricks, wood, stucco, straw and mud - is what the foundry determines when they create their building codes.

Now obviously the building design must be married to the building construction methods...a straw and mud house is not likely to be capable of supporting a basement and multiple floors. Likewise if all the home designer wants is a single-storied no garage home with as little expense as possible then they aren't going to want it built with granite and marble with greek columns and a parthenon in the backyard.

Now back to semiconductors...each chip's design is represented by a set of masks..each mask is used by the litho tool to print into photoresist the specific circuit layouts needed to eventually form the entire chip.

Except in special circumstances (not production situations), an entire wafer is printed with the same mask set at every level. The mask itself is not full-wafer in size, but it can contain multiple die if the die are small.

TSMC builds everyone's wafers on the same assembly line. Where the wafers get special treatment is in the litho machines: what the wafer is supposed to be (a Fermi, a Cypress, etc) determines which mask (they are called reticles at this point in the process, on the floor in the fab) is pulled and used for printing.

There are sub-nodes, your intentionally cold transistors versus your intentionally hot transistors, and these differences are handled at implant plus a few other places (gate ox nitridation, etc). Things get complicated in an N! way when you get into the nitty gritty details.

But essentially the fab couldn't care less what you printed onto the wafers. The customer determines what the layout is on the reticles and how things are handled at test, etc, but production-wise a wafer is a wafer is a wafer.

How would "fusing" circuits off work? Or, a better question would be, if there are multiple variants of a common circuit, how are those managed? Say a GPU with 200SPs vs a GPU with 400SPs. Rather than disabling 200SPs, there are parts that are separate dies. Wouldn't it be much simpler to make a single die and disable them instead?

I guess the benefits can be tremendous if they can manage having multiple dies rather than fusing them off.

Understand that each unique die is surrounded by what is called a die-scribe on the wafer. You can't cut a die apart into smaller functioning die without cutting through the scribe-seal...the scribe-seal is what keeps the environment out of the chip (and some of the unfriendly stuff inside the chip) so that the chip lives a nice long lifespan. Break that scribe-seal and your chip might operate a month or two before dying from moisture, mobile ions, etc, that have diffused into the chip from the environment.

Fusing works very much as the name implies...the current that would ordinarily head into a particular circuit is shunted across a fuse. The fuse is an intentionally conditioned region of polysilicon which is susceptible (in a known and controllable manner) to overheating and literally vaporizing, resulting in an open circuit, just as a fuse works on the macro-level. Fuses are intelligently included in the layout such that circuits can be fused off if they are redundant (extra cache lines) or if they are not needed for the SKU (GTX480, 470, etc).

Not all fuses are designed to "blow" in the same manner. Some are simply designed to result in a resistance shift such that, when incorporated in a pull-down or pull-up circuit, the resultant resistance shift is enough for the circuit to be toggled on or off. This is how things like core unlocking become possible. If AMD really wanted to prevent core-unlocking they would implement fuses that truly blow open in an unrecoverable way. Not that this could be done overnight...developing a fuse design for a given technology node (and they are node dependent) is just as much an R&D endeavor as developing any other IC component for the node (such as a capacitor or inductor, etc), complete with all the reliability impact studies, yadda yadda.
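The resistance-shift style of fuse described above can be pictured as a simple voltage divider. The following is only a rough sketch: the circuit topology (pull-up resistor on top, fuse as the pull-down leg), the supply voltage, the resistor values and the 50% sense threshold are all made-up assumptions for illustration, not figures from any real process or product.

```python
def sense_node_voltage(vdd, r_pullup_ohm, r_fuse_ohm):
    """Voltage divider: pull-up resistor on top, fuse as the pull-down leg."""
    return vdd * r_fuse_ohm / (r_pullup_ohm + r_fuse_ohm)


def reads_as_programmed(vdd, r_pullup_ohm, r_fuse_ohm, threshold_fraction=0.5):
    """True if the sense node sits above the (assumed) logic threshold."""
    return sense_node_voltage(vdd, r_pullup_ohm, r_fuse_ohm) > threshold_fraction * vdd


# All values below are hypothetical, chosen only to show the effect.
VDD = 1.0                     # volts
R_PULLUP = 5_000.0            # ohms
R_FUSE_INTACT = 100.0         # ohms, low-resistance fuse before programming
R_FUSE_PROGRAMMED = 50_000.0  # ohms, after the resistance shift

intact = reads_as_programmed(VDD, R_PULLUP, R_FUSE_INTACT)
programmed = reads_as_programmed(VDD, R_PULLUP, R_FUSE_PROGRAMMED)
print(intact, programmed)  # False True
```

With these numbers the fuse never has to become a true open circuit. It only has to shift far enough to move the sense node across the threshold, which is also why such a state can in principle be overridden elsewhere in the logic.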

The downside to fusing is that fuses themselves take up die space, and they are not 100% reliable (nothing ever is, nor can it be), so you can actually lose yield solely due to a bad fuse in the die (they don't usually fuse off the fuses :p). The stuff you are fusing off takes up die space too, so hopefully you really needed to get rid of it, because it cost you a bit of money to create those circuits in the first place.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
and the stuff you are fusing off takes up die-space too so hopefully you really needed to get rid of it because it cost you a bit of money to create those circuits in the first place.

Sorry, I think I made a mistake in my post. "Fusing", where they disable parts of the circuitry that are faulty, is commonly used.

But now, the new thing is creating multiple dies. For Radeon HD5xxx, they have the "Cypress" die, "Juniper" and "Redwood". They are all different dies, and the smaller ones aren't mere fused off circuitry.

Can they really be that flexible? Changing masks in order to support multiple dies must not be as complex as it sounds?
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
But now, the new thing is creating multiple dies. For Radeon HD5xxx, they have the "Cypress" die, "Juniper" and "Redwood". They are all different dies, and the smaller ones aren't mere fused off circuitry.

Can they really be that flexible? Changing masks in order to support multiple dies must not be as complex as it sounds?

Yes, this is what people mean when they speak of a design being modular.

Each chip has its own set of masks. They aren't interchangeable, and each mask set incurs its own fixed cost structure (both the production of the masks and the design of the layout itself).

This is not done at the wafer-level...what I mean is that a wafer is not partially one chip type and then later converted to another. Rather what you have going on is inventory management. There are entire jobs where the people are solely tasked with the responsibility of determining how many wafer starts of any given device are to be committed each day. Load balancing is done in the fab, inventory out the end of the fab is handled by lot prioritization.

So someone in AMD might decide they want AMD to have 3x more Redwood chips than Cypress chips 60-80 days from now (the typical fab cycle time for a wafer at this node level).

They compute the expected die per wafer based on yield projections by device type, and back out of that the total number of wafers they need to start today which will be imprinted with the Cypress mask set, and likewise the number of wafers that are going to get the Redwood mask set. That wafer ratio won't be 3:1, as the wafer ratio reflects the expected realities of sellable die per wafer.
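To make the wafer-start arithmetic above concrete, here is a minimal sketch. It assumes a common gross-die-per-wafer approximation and a simple Poisson yield model; the die areas, the defect density and the unit targets are rough illustrative numbers, not figures from AMD or TSMC, and real planners use much more detailed yield projections.

```python
import math

WAFER_DIAMETER_MM = 300.0


def gross_die_per_wafer(die_area_mm2, wafer_diameter_mm=WAFER_DIAMETER_MM):
    """Common approximation: wafer area / die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2.0
    whole = math.pi * radius * radius / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(whole - edge_loss)


def poisson_yield(die_area_mm2, d0_per_cm2):
    """Poisson yield model: Y = exp(-A * D0), with A converted to cm^2."""
    return math.exp(-(die_area_mm2 / 100.0) * d0_per_cm2)


def wafer_starts(target_good_die, die_area_mm2, d0_per_cm2):
    """Wafers to start today to get the target number of sellable die."""
    good_per_wafer = gross_die_per_wafer(die_area_mm2) * poisson_yield(die_area_mm2, d0_per_cm2)
    return math.ceil(target_good_die / good_per_wafer)


# Hypothetical inputs: a big die and a small die, wanted in a 1:3 unit ratio.
D0 = 0.4                 # defects per cm^2 (assumed)
BIG_DIE_MM2 = 334.0      # roughly Cypress-sized (assumed)
SMALL_DIE_MM2 = 104.0    # roughly Redwood-sized (assumed)

big_wafers = wafer_starts(10_000, BIG_DIE_MM2, D0)
small_wafers = wafer_starts(30_000, SMALL_DIE_MM2, D0)
print(big_wafers, small_wafers)
```

Under these assumptions the small die needs far fewer wafers than the big one despite the 3x higher unit target, which is the sense in which the wafer ratio is not the same as the chip ratio.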

The question of flexibility is one of opportunity cost. The budget within which the flexibility is enabled is the cost delta between using fewer designs and fusing off dead areas (but having fewer chips per wafer, so higher costs per chip) versus the costs of designing unique chips with their own masks but getting the opportunity to yield more sellable chips with the reduced functionality per wafer.
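The trade-off above can be sketched as a break-even calculation. This is a simplified model under stated assumptions: the fixed cost of the extra design and mask set, the wafer cost and the good-die-per-wafer counts are all hypothetical, and every fused-down part is treated as consuming a fully good big die (in practice some of them are salvaged defective die, which shifts the answer).

```python
import math


def cost_per_good_die(wafer_cost, good_die_per_wafer):
    """Silicon cost attributed to each sellable die."""
    return wafer_cost / good_die_per_wafer


def break_even_volume(fixed_cost_new_design, wafer_cost,
                      good_big_die_per_wafer, good_small_die_per_wafer):
    """Units of the cut-down SKU above which a dedicated small die is cheaper."""
    saving_per_unit = (cost_per_good_die(wafer_cost, good_big_die_per_wafer)
                       - cost_per_good_die(wafer_cost, good_small_die_per_wafer))
    if saving_per_unit <= 0:
        return math.inf  # the dedicated die never pays for itself
    return math.ceil(fixed_cost_new_design / saving_per_unit)


# Hypothetical numbers for illustration only.
volume = break_even_volume(
    fixed_cost_new_design=5_000_000.0,  # design effort plus mask set (assumed)
    wafer_cost=5_000.0,                 # per processed wafer (assumed)
    good_big_die_per_wafer=46,          # assumed
    good_small_die_per_wafer=405,       # assumed
)
print(volume)
```

Below the break-even volume, fusing down the big die is the cheaper route; above it, the extra mask set pays for itself through more sellable chips per wafer.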

Naturally this dynamic evolves over time, and as a node matures and the D0 (defect density) goes down, the opportunity cost for using fused devices versus multiple mask sets goes up. You see/saw AMD making this switch-over in their Regor/Callisto/Rana/etc 45nm Phenom II/Athlon II chips as their 45nm process matured.
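One way to read the D0 point above: as defect density falls, fewer big die come out of the fab with a defect that fusing could work around, so more of the fused-down parts have to be cut from perfectly good die. The sketch below uses a simple Poisson model; the die area and the D0 values are assumptions for illustration, and the result is only an upper bound on salvageable die, since not every defect lands somewhere that can be fused off.

```python
import math


def fraction_with_defects(die_area_mm2, d0_per_cm2):
    """Poisson model: share of die with at least one defect, 1 - exp(-A * D0)."""
    return 1.0 - math.exp(-(die_area_mm2 / 100.0) * d0_per_cm2)


DIE_AREA_MM2 = 334.0  # assumed big-die area

# Hypothetical D0 values (defects per cm^2) as the node matures.
for d0 in (0.8, 0.4, 0.2, 0.1):
    share = fraction_with_defects(DIE_AREA_MM2, d0)
    print(f"D0={d0:.1f}/cm^2 -> {share:.0%} of die have at least one defect")
```

The share of defective die shrinks steadily as D0 drops, which is consistent with fused SKUs becoming a less attractive source of low-end parts on a mature process.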
 