Why GT300 isn't going to make 2009


magreen

Golden Member
Dec 27, 2006
1,309
1
81
Wow, that's quite a bit of information there. :laugh: are we going to dissect this latest metaphor like the 2x4? I hope keys doesn't say he's going to beat you with one like he said before! :p
 

james1701

Golden Member
Sep 14, 2007
1,791
34
91
IDC, this is one of the most informative threads I have read in a while. I would never have thought that when ATI and Nvidia (and others as well) talk about a node shrink, or whether they are going to skip the next size, it doesn't mean a diddly thing to try and compare the two different companies based on node size. Thanks for all the info.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Originally posted by: Keysplayr
And yet, what you end up with is a 1 5/8 x 3 1/2 piece of lumber called a 2x4.
What you mentioned could be an exception to the norm. I've never heard of anyone caring about specifically having raw cut lumber anyway. They buy studs, they're called 2x4s, but aren't.
S4S lumber, if that's the "proper" name for it, is the norm nationwide. You go into Home Depot, that's what you get. A local family lumber yard, that's what you get.
So, the standard "S4S" lumber is what IDC was referring to.

And IDC, I'm going to beat you if you use another lumber analogy!!! :D

Actually Keys, I love true 2x4 oak rough cuts. But today they're crazy expensive.

 

dadach

Senior member
Nov 27, 2005
204
0
76
Originally posted by: OCguy
Originally posted by: MarcVenice
Originally posted by: OCguy
If GT300 is released in 2009, I am starting a motion to block links to that website.

Claiming NV was going to have partners hide the 55nm 216s should have been the last straw.

Did you know this is actually 'somewhat' happening? In Holland it's pretty hard to determine whether a GTX260 is an old 65nm or a 55nm one. You'd have to go compare it with the AIB's website, if you get that far. On newegg.com, you can see whether it's a 192 or 216-core version right away.


Sorry, his article didn't say "In Holland it may be somewhat hard to tell for the average consumer."

It implied there would be a system in place to make sure they could clear the 65nm stock. I think EVGA put 55nm on the freaking box.

So? Did every single company do that? If they didn't, then it's clear that NV tried to clear stocks through that... Yesterday I bought a Mushkin GTX260 UltimateX... how do I know if it was 55nm or 65nm?
 

yacoub

Golden Member
May 24, 2005
1,991
14
81
Originally posted by: videogames101
Originally posted by: yacoub
Originally posted by: Denithor
Originally posted by: yacoub
It's the same cycle every month: "wait for next month's drivers". No thanks, I'd rather not have the added stress and frustration. I just want my GPU to play the games I bought it to be able to play.

Easy solution: don't worry so much about performance/$ and just buy more card than you really need for a given game/resolution. Even with bad drivers a 4870X2 or GTX295 will cruise through practically anything you can throw at it.

No, I mean play them without crashing. Flaky ATi drivers always caused instability in games.

Personally, I have never had a problem with either ATI or Nvidia drivers, and I have a feeling most other people haven't either. Unless you're actually going to cite some study about ATI vs. Nvidia average game stability, don't think your personal experience with a few ATI cards is enough to make a blanket claim like that. I hope you're not that ignorant.

On topic though, I have a question. If all these node sizes are relative, do companies just keep bullshitting us simply to keep up with competitors? Are AMD and Intel just claiming to be hitting 32nm for the marketing? I'd really like to know.

What? How is making a statement based on my experience anything other than appropriate for me to do? And how is my statement any more or less valuable than your entire statement that begins with "personally"? It's not. One can only make a statement based on their experience, and mine was consistent: ATi's drivers were never completely stable for me across several systems and several generations. That's just my observation and statement. My statement is not ignorant, it's intelligent. It's drawing a logical conclusion from first-hand experience. Ignorance would be ignoring that data when buying my next card, especially when others' reports show that ATi's drivers still aren't stable in many games. It's not my fault NVidia's drivers have never given me a problem and have thus encouraged me to continue purchasing their cards (until such time as I run into stability problems with them, which, if that happens, would put ATi and NVidia back on equal footing for me when looking at a future purchase).
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: james1701
IDC, this is one of the most informative threads I have read in a while. I would never have thought that when ATI and Nvidia (and others as well) talk about a node shrink, or whether they are going to skip the next size, it doesn't mean a diddly thing to try and compare the two different companies based on node size. Thanks for all the info.

There is a way to properly compare node technologies across companies, and within a company across their product lineup, but it really requires some rudimentary solid-state device physics understanding so you know the significance of what you are reading/looking at.

It's not impossible to do, but you've got to have some training before you can expect yourself to be able to do it.

There are a handful of annual conferences where the engineers of the companies come together and present information on the relevant and comparable aspects of their process technology: things like Idsat, Ioff, metal pitch, and SRAM cell size. One of those conferences is IEDM; another is IITC.

Here's a somewhat dated article on IEDM from RWT; it provides a nice overview and discussion of the metrics that are used to compare technology nodes both within a company as well as across companies.

Specifically, this table contains real information that was relevant to analyzing the intrinsic capabilities of process technology nodes from a couple of years ago, just to give you an idea.

And yes, it is a pretty fruitless endeavor to try and compare two chips (two different ISAs, two different architectures) like RV740 and GT300 being produced on the same node, because we have absolutely zero insight into how the chip designers used the process technology.

Transistors aren't one size; there are two dimensions to them (length and width), and a given technology node will have hard limits on the minimum size for the xtor (hence SRAM density is a useful metric to gauge this limit).

Metal pitches also aren't just one size; the designer has a nearly infinite selection of metal pitches at their disposal when implementing their designs. A process technology will have hard limits on the minimum pitch allowed for any given metal level, and a maximum allowed total number of metal levels (in the neighborhood of 10).

So knowing that AMD packed 800+ million xtors into a 137mm^2 die for RV740 really doesn't mean jack about whether NV's engineers need that much space to pack 800+ million xtors for the GT300 architecture and ISA. They may need bigger transistors for clockspeed or leakage reasons and end up packing just 600M xtors into 137mm^2, or maybe their architecture and ISA lend themselves nicely to even smaller transistors (but still within the limits of the process tech's intrinsic capability) and they pack 1B xtors into 137mm^2.
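To put rough numbers on that spread, here's a toy Python sketch; the 600M and 1B transistor counts are the hypothetical cases from the paragraph above, not real chips:

```python
# Back-of-the-envelope transistor density for the scenarios above.
# Same die area, very different transistor counts depending on how
# big the designers make their transistors.

def density_per_mm2(transistors: float, die_area_mm2: float) -> float:
    """Average transistor density in transistors per mm^2."""
    return transistors / die_area_mm2

rv740       = density_per_mm2(800e6, 137)  # ~5.8M xtors/mm^2 (from the post)
big_xtors   = density_per_mm2(600e6, 137)  # hypothetical: larger xtors for speed/leakage
small_xtors = density_per_mm2(1e9, 137)    # hypothetical: denser-packing architecture

print(f"RV740: {rv740/1e6:.1f}M/mm^2, "
      f"low: {big_xtors/1e6:.1f}M/mm^2, "
      f"high: {small_xtors/1e6:.1f}M/mm^2")
```

Same node, same die size, and the plausible density range still spans nearly 2x.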

The possibilities are endless, but bounded. Once you know some of the limits of the underlying process tech (SRAM cell size is a measure of those limits), you can start to rationalize the boundaries on the xtor density for any company's designs on that process tech, etc.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: videogames101
On topic though, I have a question. If all these node sizes are relative, do companies just keep bullshitting us simply to keep up with competitors? Are AMD and Intel just claiming to be hitting 32nm for the marketing? I'd really like to know.

Some do, yes. TSMC's relabeling their 45nm node as the 40nm node is an example of BS'ing the investor world for purely marketing reasons. Thilan has a thread in the video forum on this so I won't rehash it here.

The reason no one cares whether a company is BS'ing about their latest and greatest process tech being "32nm node" or "w00t! we just skipped 45nm yall and went straight to 40nm!" is because the node label has zero relevance to the capability.

AMD could call their upcoming 32nm products 16nm products for all they wanted, it won't change the operating speed or cache density or the cost to manufacture them.

So they aren't about to be stupid and bring on the scrutiny they would get if they decided to skip the 32nm and 22nm node labels and jump from 45nm to 16nm in their labeling scheme. It would be marketing suicide because Intel would see to it that the world knows what AMD marketing did. Likewise for Intel, they aren't about to give AMD's marketing team the ammo they need to make some really funny commercials.

TSMC did it because no one really cares at this time. No other foundry has 45nm or 40nm qualified, so if TSMC wants to claim they hit 40nm first no one really cares. They just want the chips to start coming out of the fab.
 

magreen

Golden Member
Dec 27, 2006
1,309
1
81
Originally posted by: Idontcare
Originally posted by: videogames101
On topic though, I have a question. If all these node sizes are relative, do companies just keep bullshitting us simply to keep up with competitors? Are AMD and Intel just claiming to be hitting 32nm for the marketing? I'd really like to know.

Some do, yes. TSMC's relabeling their 45nm node as the 40nm node is an example of BS'ing the investor world for purely marketing reasons. Thilan has a thread in the video forum on this so I won't rehash it here.

The reason no one cares whether a company is BS'ing about their latest and greatest process tech being "32nm node" or "w00t! we just skipped 45nm yall and went straight to 40nm!" is because the node label has zero relevance to the capability.

AMD could call their upcoming 32nm products 16nm products for all they wanted, it won't change the operating speed or cache density or the cost to manufacture them.

So they aren't about to be stupid and bring on the scrutiny they would get if they decided to skip the 32nm and 22nm node labels and jump from 45nm to 16nm in their labeling scheme. It would be marketing suicide because Intel would see to it that the world knows what AMD marketing did. Likewise for Intel, they aren't about to give AMD's marketing team the ammo they need to make some really funny commercials.

TSMC did it because no one really cares at this time. No other foundry has 45nm or 40nm qualified, so if TSMC wants to claim they hit 40nm first no one really cares. They just want the chips to start coming out of the fab.

idc, can you give us a rough estimate of the ranges of size of a typical transistor in the 45nm process for intel, amd, and tsmc respectively? I mean, is it anywhere near 45nm x 45nm?

thanks!
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: magreen
idc, can you give us a rough estimate of the ranges of size of a typical transistor in the 45nm process for intel, amd, and tsmc respectively? I mean, is it anywhere near 45nm x 45nm?

thanks!

Basically they focus on the channel length, or Lg as it's called in the biz.

This graph from wiki shows some of the correlation between node and minimum xtor length.

The thing is, simply knowing the minimum Lg for a company at any given node doesn't really help you (or anyone for that matter) do much with the info.

You might be tempted to assume that knowing the minimum Lg would allow you to make comparisons based on macro-concepts like "how many xtors can I pack into 100mm^2 for Intel vs. AMD" etc. But the work the transistors can get done (switching speed), the leakage they invoke (Iddq, static, dynamic, etc.), and the voltage they require all compound and convolute the effort.

Things like Ion and Ioff improvements node-to-node are useful to know as they can directly relate to potential performance gains and/or power-consumption reductions.

The closest thing to answering your question would be to discuss sram cell size, which is why everyone likes to trot out their sram test vehicles. Take a look at the data here in this wiki on 45nm Technology demos.

The relative size differences between the various players' SRAM cell sizes give you an idea of just how densely (a measure of tiny) they are able to pack all the necessary elements for a functional circuit like SRAM.

The utility of this knowledge though is extremely limited because they don't tell you any electrical capabilities of the circuit (sram in this case)...no clockspeeds, voltage, leakage, etc. So TSMC reporting a 0.296um^2 sram cell may seem pretty darn impressive compared to Intel's larger 0.346um^2 sram but a relevant difference is that Intel's sram will clock up to 4GHz whereas you'll be hard pressed to get TSMC's sram over 2GHz.

And none of it speaks to cost... i.e. how manufacturable is the technology? Having 0.296um^2 SRAM is worthless if it pushes the edge of process control so hard that the parametric yields end up at 5% while the parametric yields for a 0.346um^2 cell get above 90% (I'm making up these specific yield numbers just to provide an example).
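To make the cost point concrete, here's a toy Python comparison; the two cell sizes are the real figures quoted above, but the wafer cost, gross die count, and yields are invented purely for illustration:

```python
# Toy cost comparison in the spirit of the made-up yields above.
# Cell sizes are the real figures quoted; everything else is invented.

def cells_per_mm2(cell_um2: float) -> float:
    """SRAM bit cells that fit in 1 mm^2 (1 mm^2 = 1e6 um^2)."""
    return 1e6 / cell_um2

def cost_per_good_die(wafer_cost: float, gross_dies: int, yield_frac: float) -> float:
    """Effective cost of each die that actually works."""
    return wafer_cost / (gross_dies * yield_frac)

dense  = cells_per_mm2(0.296)   # ~3.38M cells/mm^2
looser = cells_per_mm2(0.346)   # ~2.89M cells/mm^2

# Same hypothetical $5000 wafer and 400 gross dies, different yields:
aggressive = cost_per_good_die(5000, 400, 0.05)  # $250 per good die
safe       = cost_per_good_die(5000, 400, 0.90)  # ~$13.9 per good die
```

A ~17% density win buys you nothing if it multiplies the cost of a working die by ~18x.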

(By the way, there was an excellent thread here a couple of years ago by CTho9305 on the reasons why you can't just take a chip design from one company's tech node and produce chips at another company on their tech node even though it is labeled the same. I've long since lost the link to it, but if you PM him I'm sure he can point you to it.)

Am I answering questions faster than I am creating new ones? I hope this isn't increasing the turbidity of the water.
 

OCGuy

Lifer
Jul 12, 2000
27,224
37
91
I'm glad to see the knowledge-filled CPU posts of IDC crossing over to Video :thumbsup:
 

magreen

Golden Member
Dec 27, 2006
1,309
1
81
Thanks! That was great. So it seems like in an order-of-magnitude type of way, they are in the ballpark of 50nm.
 

MarcVenice

Moderator Emeritus
Apr 2, 2007
5,664
0
0
Ok, IDC. Great posts, but I'm not leaving you alone just yet.

Can you explain what the development process actually looks like? Nvidia/ATI have their chips manufactured at TSMC, for example. They know it's going to be on the 40/45nm node. What does a company need to know from TSMC to develop a chip?

You said you can't take a design from one company's 45nm node to another company's 45nm node. So how exactly does a certain node influence the design process? And although one company's 45nm node can't replace another company's 45nm node, how come Nvidia can do die shrinks from 65nm to 55nm? Wouldn't the same be possible from one company's node to another company's node? Also, this makes switching from TSMC to GlobalFoundries rather unlikely, does it not?
 

CTho9305

Elite Member
Jul 26, 2000
9,214
1
81
Originally posted by: Idontcare
Originally posted by: videogames101
On topic though, I have a question. If all these node sizes are relative, do companies just keep bullshitting us simply to keep up with competitors? Are AMD and Intel just claiming to be hitting 32nm for the marketing? I'd really like to know.

Some do, yes. TSMC's relabeling their 45nm node as the 40nm node is an example of BS'ing the investor world for purely marketing reasons. Thilan has a thread in the video forum on this so I won't rehash it here.

The reason no one cares whether a company is BS'ing about their latest and greatest process tech being "32nm node" or "w00t! we just skipped 45nm yall and went straight to 40nm!" is because the node label has zero relevance to the capability.

AMD could call their upcoming 32nm products 16nm products for all they wanted, it won't change the operating speed or cache density or the cost to manufacture them.

So they aren't about to be stupid and bring on the scrutiny they would get if they decided to skip the 32nm and 22nm node labels and jump from 45nm to 16nm in their labeling scheme. It would be marketing suicide because Intel would see to it that the world knows what AMD marketing did. Likewise for Intel, they aren't about to give AMD's marketing team the ammo they need to make some really funny commercials.

TSMC did it because no one really cares at this time. No other foundry has 45nm or 40nm qualified, so if TSMC wants to claim they hit 40nm first no one really cares. They just want the chips to start coming out of the fab.

I put more stock in the node name than you do. I think the drawn transistor length usually matches the node name, and for any particular foundry there's a pretty strong correlation between node name and SRAM poly half-pitch. There are many reasons why two processes that can draw the same line pitch would end up with different SRAM cell sizes, and I can go into them if you're interested (a lot of it is explained below).

Originally posted by: Idontcare
Originally posted by: magreen
idc, can you give us a rough estimate of the ranges of size of a typical transistor in the 45nm process for intel, amd, and tsmc respectively? I mean, is it anywhere near 45nm x 45nm?

thanks!

The closest thing to answering your question would be to discuss sram cell size, which is why everyone likes to trot out their sram test vehicles. Take a look at the data here in this wiki on 45nm Technology demos.

The relative size differences between the varying players for their sram cellsize gives you an idea of just how dense (a measure of tiny) they are able to pack all the necessary elements for a functional circuit like sram.

The utility of this knowledge though is extremely limited because they don't tell you any electrical capabilities of the circuit (sram in this case)...no clockspeeds, voltage, leakage, etc. So TSMC reporting a 0.296um^2 sram cell may seem pretty darn impressive compared to Intel's larger 0.346um^2 sram but a relevant difference is that Intel's sram will clock up to 4GHz whereas you'll be hard pressed to get TSMC's sram over 2GHz.

SRAM cell sizes are also poorly correlated to the sizes of logic devices.

No transistor is anywhere near 45nm x 45nm. Most transistors on a "45nm" chip will have a drawn length of about 45nm, but the width will range from 100nm to possibly a few microns (thousands of nm). The amount of current you can get through a transistor is roughly proportional to the width divided by the length, and the speed of circuits is strongly affected by how much current flows. This means that you use wider transistors to get higher speed. I think most transistors will be in the range of 4-20 times wider than they are long. At the narrow end of this spectrum the transistors are pretty wimpy, so a 45nm-wide transistor would really suck. There are also reasons related to process variation why you wouldn't use narrow transistors (modern process nodes usually control one dimension less accurately than the other, so you make the length very accurate and get some variation in the width... if you have +/-5nm accuracy on a 45nm-wide transistor that's much harder to design with than +/-5nm accuracy on a 450nm-wide transistor).

I should also point out that a transistor has 3 terminals (that get connected to wires), and the "45nm length" refers to the structure attached to one of them (the "gate"). A transistor also has a "source" and a "drain" that need to be as wide as the gate and long enough to fit contacts (probably another >=45nm for each of them). Based on this, the smallest transistor you'd actually use for logic on a 45nm process might be 150nm long and 250nm wide. Note that I haven't checked my arithmetic and these are all very rough numbers.
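A rough Python sketch of that arithmetic (the W/L drive-strength rule and the variation argument); all dimensions are the approximate figures from the two paragraphs above, not real design rules:

```python
# Rough model of the figures above: drive strength scales with W/L,
# and a minimum usable footprint is gate length plus source/drain
# regions long enough for contacts.

def relative_drive(width_nm: float, length_nm: float) -> float:
    """Drive current is roughly proportional to width / length."""
    return width_nm / length_nm

def min_footprint_nm(gate_len: float = 45, contact_len: float = 45,
                     width: float = 250) -> tuple:
    """(length, width): gate plus a source and a drain for contacts."""
    return (gate_len + 2 * contact_len, width)

wimpy  = relative_drive(45, 45)    # a square 45x45nm device: W/L = 1
usable = relative_drive(250, 45)   # ~5.6x the drive of the square device

# +/-5nm width variation hurts a narrow device ~10x more, relatively:
narrow_spread = (relative_drive(50, 45) - relative_drive(40, 45)) / relative_drive(45, 45)
wide_spread   = (relative_drive(455, 45) - relative_drive(445, 45)) / relative_drive(450, 45)
```

The last two lines are the +/-5nm example in numbers: the same absolute variation is ~22% of a 45nm-wide device's drive but only ~2% of a 450nm-wide one's.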

Originally posted by: MarcVenice
Ok, IDC. Great posts, but I'm not leaving you alone just yet.

Can you explain what the development process actually looks like? Nvidia/ATI have their chips manufactured at TSMC, for example. They know it's going to be on the 40/45nm node. What does a company need to know from TSMC to develop a chip?

You said you can't take a design from one company's 45nm node to another company's 45nm node. So how exactly does a certain node influence the design process? And although one company's 45nm node can't replace another company's 45nm node, how come Nvidia can do die shrinks from 65nm to 55nm? Wouldn't the same be possible from one company's node to another company's node? Also, this makes switching from TSMC to GlobalFoundries rather unlikely, does it not?

I put a copy of the post IDC mentioned on my Wikipedia user page here. I wrote it a couple years ago, and I've learned a lot about CPUs and GPUs since then, so there are pieces I should probably update. Anyway, I've pasted it here for convenience:
Every so often, the question comes up, "If $company1 is having trouble with their ##nm process, why can't they quickly switch to $company2's manufacturing process?" or, "Why does it take so long to move from ##nm to (##*0.7)nm?". Hopefully this writeup will provide some insight into why it's harder than it sounds. I'm going to make generic statements that apply to 90/65/45nm, but not all statements will be true for all 3 of the technology nodes.

*Differences in the actual transistors*

Each manufacturing process' transistors are different. The design space is enormous - not only are there many variables that can be tuned, there are also many different steps that can be used or skipped. I'm only going to write about a few conceptually simple differences and not even try to cover all of the possibilities (as it happens, these simple differences do cover the effects of many more complex differences, so this should still be informative).

Leakage vs. Performance

One of the most straightforward differences is the "threshold voltage" ("Vt" from here on). In theory, the Vt is the voltage you have to put on the gate of a transistor before it turns on (in practice, it's much harder to define, but we'll ignore that here). Transistors aren't really ideal on-off switches though, and if you increase the voltage past the Vt, you'll get more current (i.e. it can drive a bigger load, or switch the same load faster). It's nice to use transistors with low Vts, because they make circuits really fast. However, there's a catch: transistors with lower threshold voltages don't turn off as well (they leak more), and the difference is significant. A gate made from low-Vt ("LVT") transistors might be 30-50% faster than one made from high-Vt ("HVT") transistors, but leak 10 to 100 times as much. Since leakage power is a significant portion of total power nowadays, you can't just use LVT transistors everywhere.

Modern processes generally offer multiple Vts*, so circuits that take longer to evaluate can be built with the fast transistors, but the rest of the design can use slower transistors to save power. However, there isn't a fixed set of threshold voltages across the industry--each process might use a different Vt for the LVT transistors. When porting a design across manufacturing processes, you need to revisit the Vt used in each gate: when moving to a slower process, more gates may need to move to lower threshold voltages; when moving to a faster process, more gates should be switched to higher threshold voltages (to avoid wasting power).

*Or multiple gate lengths, which can be used in a similar way to trade off leakage and performance
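As a toy model of the Vt trade-off above (the delay and leakage ratios are illustrative values picked from the ranges quoted, not any real process):

```python
# Toy multi-Vt model: LVT gates here are 30% faster but leak 30x more
# than HVT gates. Ratios are illustrative only.

GATES = {
    "HVT": {"delay": 1.00, "leakage": 1.0},
    "LVT": {"delay": 0.70, "leakage": 30.0},
}

def path_delay(cells):
    """Total delay of a chain of gates (arbitrary units)."""
    return sum(GATES[v]["delay"] for v in cells)

def total_leakage(cells):
    """Total leakage of the listed gates (arbitrary units)."""
    return sum(GATES[v]["leakage"] for v in cells)

# Ten-gate path: make only the three timing-critical gates LVT.
all_hvt = ["HVT"] * 10
all_lvt = ["LVT"] * 10
mixed   = ["LVT"] * 3 + ["HVT"] * 7

# If only those three gates limit the clock, "mixed" gets their full
# speedup while leaking 97 units instead of all-LVT's 300.
```

This is the whole point of offering multiple Vts: spend leakage only where the timing actually needs it.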

Beta ratio

Another difference is in the relative strength of nmos and pmos transistors (nfets and pfets). The ratio of the strengths is referred to as the "beta ratio" (it's lazy terminology if you look at where ß really shows up in the equations...). Silicon inherently lets electrons move more easily than "holes", which makes nfets roughly 2x faster than pfets. I'm not going to go into the physics, mostly because I don't have a good enough grasp to explain it simply :).

Different manufacturers have different tricks to speed up their transistors (stress/strain, crystal orientation, etc) and some of these tricks work better for one type of transistor than the other type. As a result, while one manufacturer may have nfets that are 2x as strong as the pfets, another may have a ratio of 1.5:1 or 2.5:1. This is important, since it affects the sizes of the transistors for optimal delays. I can't remember the math off the top of my head, but for a process with a 2:1 ratio, for optimal delay, pfets should be 1.4x the size of nfets*. When porting from a process with one ratio to a process with a different ratio, you really want to change all the transistor sizes to get maximum benefit.

*If you used pfets 2x the size of nfets, you would get equal rise and fall times. However, that isn't actually the fastest option. It turns out that if you shrink the pfet a bit, even though the rising delay gets worse, each gate sees less load capacitance, so the net result is higher performance. I can elaborate further, but that'll require pictures. Let me know if you want the details.
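A small Python sketch of the sizing rule in that footnote; the sqrt(beta) result is the standard textbook approximation for minimum average delay, not something specific to any one process:

```python
import math

# With an nfet:pfet strength ratio ("beta") of 2:1, matching rise and
# fall times needs a 2x-wide pfet, but minimum average delay comes at
# about sqrt(beta) ~ 1.4x, matching the "1.4x" figure in the post.

def pfet_width_equal_edges(nfet_width: float, beta: float) -> float:
    """Pfet width for equal rise and fall times."""
    return nfet_width * beta

def pfet_width_min_delay(nfet_width: float, beta: float) -> float:
    """Pfet width for roughly minimum average gate delay."""
    return nfet_width * math.sqrt(beta)

matched = pfet_width_equal_edges(1.0, 2.0)  # 2.0x relative width
fastest = pfet_width_min_delay(1.0, 2.0)    # ~1.41x relative width
```

So a process with beta 1.5:1 wants pfets ~1.22x, and one with 2.5:1 wants ~1.58x: different optimal cell layouts for each foundry, which is exactly why a port means resizing.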

Other differences

If the diffusion capacitance is higher or lower relative to the gate capacitances, different circuit structures may become optimal. For example, on SOI processes, which offer extremely low diffusion cap, pass-gate logic becomes much more appealing.

*DFM (Design for Manufacturability)*

Different manufacturers have different "design rules". Design rules specify what shapes can / can't be reliably manufactured. Notice how the shapes in this photograph are blobs rather than nice rectangles. It's difficult to produce patterns accurately at small scales, so manufacturers have to put limits on things like how closely two shapes can be drawn. Each manufacturer will have different limits.

There are other issues: it's difficult to manufacture very tall and narrow structures - they're being etched as trenches from the top down, but the plasma ions don't go perfectly straight down and you tend to get trapezoidal shapes rather than rectangles (picture). If one manufacturer can construct taller/narrower shapes than another, that affects design decisions.

Real design rule manuals are hundreds of pages, and they're different for each process.

Layout Density

The net result of the design rule differences is that a layout that's legal on one process is likely to be illegal on another process. You can do layout very conservatively so that it works across multiple processes, but that results in bloated cells and wastes a lot of die area (money). It's better to go for denser layout to produce a smaller, cheaper (and possibly faster--due to shorter wire routes) design. Even when staying with the same manufacturer and going to a new process node (e.g. 90nm -> 65nm) the rules change, so different pieces shrink by different amounts, and you have to redo a lot of work to make the new chip closer to optimal.

*Metal stack*

The actual transistors aren't the only things that change across processes: the metal wires change too. Differences in the manufacture of the wires will result in different attributes--different minimum wire widths, different rules regarding current capacity to avoid electromigration, and different resistances and capacitances.

Minimum wire width

If one manufacturer allows narrower wires than another, this actually has a significant effect. While it used to be that transistors were big and most of the time the area of a block was determined by transistor area, nowadays wires can limit the area of a block (if you need to send 100 signals from A to B, you need room for the 100 wires). If you're switching to a process that requires wider wires (relative to the sizes of the transistors), you'll have areas that don't have enough wire tracks available to route all of their signals, and you'll have to bloat the area of that part of the design.

Current capacity rules

The current-carrying capacity of wires is significant because it affects the power supply grid used on the chip. Power grids have to supply some maximum amount of current to the transistors without suffering electromigration, which puts a lower limit on how narrow the wires can be / how many wires can be used (if you have to supply a given amount of current, you can do it with a few really wide wires or a lot of narrow wires; using many narrow wires does have some benefits). There are trade-offs involved, because the power grid steals metal tracks that could have been used to route signals. If you switch to a process that can't carry as much current with the same wire sizes, you'll have to increase the amount of metal dedicated to supplying power--and sacrifice signal routing resources (which might end up costing area).
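A back-of-the-envelope Python version of that trade-off; every number here is invented for illustration:

```python
import math

# Electromigration trade-off: given a maximum current per um of wire
# width, how many minimum-width power wires does a block need?

def power_wires_needed(block_current_mA: float,
                       max_mA_per_um: float,
                       wire_width_um: float) -> int:
    """Wires needed so no single wire exceeds its current limit."""
    per_wire_mA = max_mA_per_um * wire_width_um
    return math.ceil(block_current_mA / per_wire_mA)

# Hypothetical block drawing 500 mA with 0.5 um power wires:
strong = power_wires_needed(500, 1.0, 0.5)  # 1000 wires
weak   = power_wires_needed(500, 0.8, 0.5)  # 1250 wires: 250 extra
# metal tracks stolen from signal routing on the weaker process.
```

Those 250 extra power wires are tracks the signals can no longer use, which is how a wire-level process difference turns into a block-area difference.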

Resistance and capacitance

Different materials have different resistances and capacitances. Low-k materials reduce capacitance, which lets wires carry signals faster. Each manufacturer is going to have different low-k options (even a single manufacturer may have different choices - for example, some options cost more than others but may offer better performance).

Narrow wires may be strongly affected by "edge effects" (that's what they're called, right?) that increase resistance; they depend on the materials at the interface between the aluminum/copper that makes up the wire and the material that surrounds it.

Differences in wire resistance/capacitance are important because they affect how bad long wires are. If long wires aren't too slow, signals that go long distances can make the trip with only a few "repeaters" along the way (repeaters just buffer a signal along). If wires are slower, signals need to be buffered more frequently, and can't go as far in a cycle. Things like this can actually affect the high-level architecture of a chip (e.g. the number of pipeline stages, if your chip is like the Pentium 4 and has to dedicate whole pipeline stages to just routing signals around). Relative to gates, wires are generally slowing down (or not speeding up as much) with each new process generation, so even simple process shrinks require rework as a result.
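A rough Python sketch of the repeater math; the square-root expression is the standard first-order repeater-insertion approximation, and all the electrical values below are invented for illustration:

```python
import math

# Unbuffered wire delay grows with length squared (distributed RC),
# so long routes are split into repeated segments. The segment length
# that roughly minimizes delay scales as sqrt(Rd*Cg / (r*c)).

def optimal_segment_um(r_drv_ohm: float, c_gate_fF: float,
                       r_wire_ohm_per_um: float,
                       c_wire_fF_per_um: float) -> float:
    """Segment length that roughly minimizes total buffered delay."""
    return math.sqrt((r_drv_ohm * c_gate_fF) /
                     (r_wire_ohm_per_um * c_wire_fF_per_um))

def repeaters_for_route(route_um: float, segment_um: float) -> int:
    """Repeaters needed to break a route into optimal segments."""
    return max(0, math.ceil(route_um / segment_um) - 1)

fast = optimal_segment_um(1000, 2.0, 0.2, 0.2)  # ~224 um segments
slow = optimal_segment_um(1000, 2.0, 0.8, 0.2)  # ~112 um segments: 4x
# wire resistance halves the segment length, doubling repeaters on a
# 1000 um route (4 vs 8).
```

Move the same design to a process with noticeably worse wires and every long route needs its buffering redone, which is one concrete reason a "simple" port isn't simple.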

Wire resistance also affects the power grid--as current flows from the pins through the power grid to the transistors, the resistance results in a voltage drop. The end result is that the transistors see a lower voltage than what's applied at the power pins, so it's important to make sure the power grid's resistance isn't too high. If the new process has higher-resistance wires, you'll need to use up more metal resources for the power grid (otherwise the chip will be slower than a more optimal design would be).

Clock distribution networks tend to be very sensitive to the properties of the metal stack, and to ensure both high-speed and low-power operation the design would have to be reworked when moving to a new process.

*Dealing with the differences*

Most transistors on a chip* are parts of "standard cells" (think of them like rubber-stamp patterns that you can plop down wherever you want, and they can be easily swapped for each other...), so if you just fix the transistors in the standard cells (e.g. adjust the beta ratios), you'll take care of a lot of the transistors. However, you'll run into a problem: some of the new transistor sizes won't fit well into the space allocated for the cells on the old manufacturing process. You can bloat all of the standard cells so their relative sizes stay the same and you don't have to redo all of your gate placements, but that wastes a huge amount of space. In the real world, since cost is so important, and cost is strongly affected by die area, you'll want to recoup that wasted space, so you'll have to redo your placement (if some gates shrink or grow relative to other gates, your placement won't be optimal any more). When some signals have to travel longer or shorter distances than they used to, you really want to go back and make sure the gates that were used are still optimal.

*Actually, most are probably in the cache nowadays, but most of those transistors are still part of a repeated cell, the "bit cell", so they effectively need to be fixed once.

There are a lot of fancy circuits in parts of the CPU (caches, register files, extremely-performance-critical logic) that use techniques that are very dependent on specific ratios in the manufacturing process (e.g. transistor leakage vs. current when the transistors are on, and the beta ratio). These circuits have to be redone when moving to a new process (sometimes for functionality if the process is different enough; sometimes just for performance reasons).

Other fancy circuits (PLLs, I/O pads, etc) are also highly sensitive, but they're pretty much voodoo to me so I can't say much about them.

*Other notes*

* It takes a lot longer to manufacture a design (to get first silicon after the design is complete) than many people realize. It takes something like 1-2 months.
* It takes a lot longer to qualify a design than many people realize. It takes a significant chunk of a year. CPUs tend to take longer than ASICs (GPUs, network chips, etc). There's a lot of analysis that has to be done that takes time (for example, verifying reliability of the design--making sure the chips will still work in 5, 10, 20 years--can't be done overnight), and a lot of tests have to be run to catch bugs (with a wide variety of hardware and software configurations). Even "small" changes can introduce subtle bugs, and hardware manufacturers are extremely conservative because you can't patch hardware as easily as software (even though there are often hooks that can be used to disable features if they end up being buggy, they may come with significant performance penalties).

Even if porting the design took 1 day, it would still take most of a year before the result was ready to sell; does it still sound like it's worth switching processes rather than fixing your own?

*Why GPUs can port faster than CPUs*

It all comes down to how aggressive the design is. CPUs eke out every last bit of performance, and are highly tuned for a specific manufacturing process. Nearly all of the design is done by hand, which takes a long time but produces slightly better results. The gates in GPUs are almost entirely synthesized automatically by software; this allows for much faster spins of a design, but results in a less-optimal design. They're less aggressive, but as a result they're easier to port. This is probably both a cause and a result of shorter design cycles (adding new features quickly helps with marketing - games taking advantage of new DirectX / OpenGL features ship pretty quickly, whereas the x86 instruction set is relatively unchanging in comparison; the design methods used just happen to be conducive to shorter product cycles anyway).

Note that since I wrote that post, I've learned that Intel actually uses large amounts of automated synthesis in CPUs, and nVidia does a lot of hand design in GPUs. This would make Intel CPUs a little easier to port, and nVidia GPUs a little harder to port.

Let me know if you have any other questions or don't understand parts. If you want to go into much more detail, it might make sense to start a new thread (and pm me a link to it).

I'm not speaking for any companies.
 

magreen

Golden Member
Dec 27, 2006
1,309
1
81
Awesome, ctho. You really answered my questions. I'll take some time to mull this over and see if I have any more. Thanks!

The video forum never had it so good.