Global Foundries 32nm Process Status


JFAMD

Senior member
May 16, 2009
565
0
0
If AMD goes with the SMT you are proposing it sounds like their customers would get a licensing discount when using VMware (although I have no idea how significant this is).

http://www.anandtech.com/show/3827/virtualization-ask-the-experts-1

I did the math on some HP servers: a DL380 with two 6-core top bins and a DL385 with two 12-core top bins. There is a pretty significant price delta in AMD's favor. Then, when you layer in the VMware license costs, you see that Intel does have an advantage of ~$150 on a $15-20K configuration.

BUT, the AMD gets the Enterprise Plus software where the Intel has only the Enterprise software. So, for $150 you get lots of extra VMware capabilities.

They did a pretty one-sided comparison by saying that there was a big benefit, but really neglected to look at it from a full system level. When you look at the full price for both, the difference is minor at best.
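The "full system level" point is easy to sanity-check with arithmetic. A minimal sketch, using entirely hypothetical dollar figures (the real HP and VMware prices are not given in this thread) chosen only to reproduce the ~$150 delta described above:

```python
# Back-of-envelope sketch of the full-system comparison described above.
# All dollar figures are hypothetical placeholders, not real HP/VMware pricing.

def total_cost(server_price, vmware_license):
    """Total acquisition cost = hardware + virtualization license."""
    return server_price + vmware_license

# Hypothetical: the AMD box is cheaper on hardware, but its per-core
# license edition costs more, narrowing the gap at the system level.
intel_total = total_cost(server_price=15_000, vmware_license=2_000)
amd_total   = total_cost(server_price=14_500, vmware_license=2_650)

delta = amd_total - intel_total  # ~$150 in Intel's favor on this toy config
print(delta)
```

The point being made: judged on license cost alone the gap looks big, but folded into a $15-20K system price it is roughly 1% of the total.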
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
It has been said many times, many ways. No SMT. Each thread has its own core so it gets full access to its own pipelines.

As a consumer I like how this suggests increased single threaded performance. (crosses fingers)
 

extra

Golden Member
Dec 18, 1999
1,947
7
81
As a consumer I like how this suggests increased single threaded performance. (crosses fingers)

God, let's hope so, eh? I believe it has been said flat out (didn't you write this, JF? Maybe my memory is failing me) that we shouldn't worry: Bulldozer will have better single-thread performance than any current AMD part. Does anyone have a link to that? I think it might have been on SemiAccurate. It didn't give any performance numbers, but it did say it will be better than what is out now (AMD-wise, anyway). Now, "better" doesn't really tell us much, but at least it means don't panic or something...

Phenom is getting a little long in the tooth, IMHO. It is quite interesting that (on Dresdenboy's blog) Ontario, at least in those BOINC numbers, isn't really much slower than Llano, if at all, depending on clock speeds, which we don't know.
 

DrMrLordX

Lifer
Apr 27, 2000
22,932
13,015
136
It has been said many times, many ways. No SMT. Each thread has its own core so it gets full access to its own pipelines.

Okay, so . . . within reason, can you give us some kind of official confirmation/denial of the issues the OP raised? Honestly I don't even think it will have an impact on Bulldozer/Zambezi's launch window (just my guesstimate); I'm merely curious as to whether or not the packaging issues are real.

Or, to put it another way, if that's something that you can't comment on, you can simply give us the "no comment", and we'll just have to live with it. Silence, however, is no fun.

Also, IF desktop Llano (32nm from GloFo) is delayed, is there any possibility that 40nm bulk Si chips from TSMC will be used as a stopgap until the 32nm chips are ready? And yes, I realize that the 40nm TSMC chips are targeted at the mobile market, and may not be socket-compatible with FM1.
 

JFAMD

Senior member
May 16, 2009
565
0
0
As a consumer I like how this suggests increased single threaded performance. (crosses fingers)

I can't speak to consumer apps, but on the server side we expect that. 33% more cores, 50% more performance, clearly something beyond adding cores has to be happening.
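The implied per-core gain can be back-computed from those two ratios. A minimal sketch, assuming the 12-core to 16-core transition this thread is discussing:

```python
# Quick arithmetic behind "33% more cores, 50% more performance":
# if total throughput scales 1.5x with only ~1.33x the cores,
# per-core throughput must rise by roughly 1.5 / 1.33, i.e. ~12-13%.

cores_ratio = 16 / 12   # assumed 12-core -> 16-core top-bin transition
perf_ratio = 1.5        # claimed total performance uplift
per_core_gain = perf_ratio / cores_ratio - 1

print(round(per_core_gain * 100, 1))  # implied per-core improvement, percent
```

So the claim, taken at face value, implies each core does about 12.5% more work than before, on top of the core-count increase.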
 

JFAMD

Senior member
May 16, 2009
565
0
0
Okay, so . . . within reason, can you give us some kind of official confirmation/denial of the issues the OP raised? Honestly I don't even think it will have an impact on Bulldozer/Zambezi's launch window (just my guesstimate); I'm merely curious as to whether or not the packaging issues are real.

Or, to put it another way, if that's something that you can't comment on, you can simply give us the "no comment", and we'll just have to live with it. Silence, however, is no fun.

Also, IF desktop Llano (32nm from GloFo) is delayed, is there any possibility that 40nm bulk Si chips from TSMC will be used as a stopgap until the 32nm chips are ready? And yes, I realize that the 40nm TSMC chips are targeted at the mobile market, and may not be socket-compatible with FM1.

The problem is that I really can't say anything for 2 reasons: 1.) I don't work at GF so it is a little out of place for me to be speaking about their processes and 2.) this isn't the kind of thing that we would comment on publicly.

I did state earlier that this did not sound like anything I had heard before. I am really skeptical when people have a lot of detail. In my opinion there are a lot of rumors running around and a lot of speculation. I guess that is just the way the internet works.

40nm bulk and 32nm SOI are really different processes. You can't just move a design. It has nothing to do with the socket (that is just packaging, provided that the physical size is the same and the bumps are all in the right places to correspond to the right pinouts).

Plus, moving a design from one process to another would probably be a year+ exercise (I am guessing here, and probably lowballing it.)
 

aphorism

Member
Jun 26, 2010
41
0
0
He doesn't have many details. All he really said was that 32nm is using the 45nm metal stack for all but one metal layer. I am assuming they are sacrificing performance to stay on schedule by reusing 45nm for half of the process.
 

busydude

Diamond Member
Feb 5, 2010
8,793
5
76
When you talk about performance issues, are you referring to IPC performance or the overclockability of the chip, or both?
 

Ben90

Platinum Member
Jun 14, 2009
2,866
3
0
Actually, in this thread we are kinda discussing both: any post involving Bulldozer (BD) is discussing IPC, and any post involving Global Foundries (GF) is talking about the possible clockspeed/power problems.

BD and GF go hand in hand with each other as it would help BD immensely to have as much of a foundry advantage as possible.
 

DrMrLordX

Lifer
Apr 27, 2000
22,932
13,015
136
The problem is that I really can't say anything for 2 reasons: 1.) I don't work at GF so it is a little out of place for me to be speaking about their processes and 2.) this isn't the kind of thing that we would comment on publicly.

Makes sense. Thanks for addressing the matter.

40nm bulk and 32nm SOI are really different processes. You can't just move a design. It has nothing to do with the socket (that is just packaging, provided that the physical size is the same and the bumps are all in the right places to correspond to the right pinouts).

I do not mean to suggest that Global Foundries switch to 40nm bulk Si; rather, I would think it potentially prudent to request additional production runs of 40nm mobile Llano chips from TSMC for use in desktop systems until 32nm Llano from Global Foundries could be produced at expected volumes. I would assume making mobile Llano work on the FM1 platform would simply be a matter of packaging, though there would also be the issue of performance, since mobile Llano is not meant to reach the same clockspeeds. Some binning might correct that problem, though, or at least alleviate it.

This is, of course, assuming that mobile Llano will be available before H2 2011.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
They did a pretty one-sided comparison by saying that there was a big benefit, but really neglected to look at it from a full system level. When you look at the full price for both, the difference is minor at best.

Makes sense to me, though.
AMD and Intel can control the price of their CPUs, chipsets, motherboards, etc., but they have no control over software licenses.
AMD is at a distinct disadvantage here.
Intel can either do what they do now: make a lot more profit on the hardware, because the software license savings compensate in the total cost.
Or they could lower their prices to AMD's level and offer a cheaper total alternative because of the license advantage, without having to cut into their own profit.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
I did the math on some HP servers: a DL380 with two 6-core top bins and a DL385 with two 12-core top bins. There is a pretty significant price delta in AMD's favor. Then, when you layer in the VMware license costs, you see that Intel does have an advantage of ~$150 on a $15-20K configuration.

BUT, the AMD gets the Enterprise Plus software where the Intel has only the Enterprise software. So, for $150 you get lots of extra VMware capabilities.

They did a pretty one-sided comparison by saying that there was a big benefit, but really neglected to look at it from a full system level. When you look at the full price for both, the difference is minor at best.

Thanks JFAMD,

Are there other software licenses (besides VMware) that base fees off number of sockets, cores, threads, etc?

HPC market?
 

JFAMD

Senior member
May 16, 2009
565
0
0
Server software is licensed in a few different ways:

By server
By named user
By location
By socket
By thread (or "logical processor" which would count HT)
By core

For 5 of those 6 methods, licensing cost is the same for both Intel and AMD. There are only 2 major examples of by-core licensing: Oracle (but only some apps, not all) and VMware (vSphere 4.0 only, not ESX 3.x, which is most of their shipments).

I have already shown you that with vSphere the net at the server level is essentially the same when you look at both. That leaves Oracle. But most Oracle customers are on either a site license or an enterprise license, so core count is not an issue.

The whole licensing issue is a red herring that the competition puts out there to make people think that there are issues with having more cores. The reality is that there are a handful of niches where this might be an issue, and that handful is getting smaller every day, because licensing compliance is a huge issue for enterprises and they are moving toward simplified models with less exposure.
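The distinction between the schemes in the list above only matters under per-core (or per-thread) pricing. A toy model with made-up prices, assuming hypothetical 2-socket boxes (6 cores with HT per socket vs. 12 cores without), illustrates why:

```python
# Toy model of the licensing schemes listed above. The per-socket and
# per-core prices are invented for illustration, not any vendor's rates.

def license_cost(scheme, sockets, cores_per_socket, threads_per_core,
                 per_socket=500, per_core=60):
    """Return a license fee under one of three common counting schemes."""
    if scheme == "socket":
        return sockets * per_socket
    if scheme == "core":
        return sockets * cores_per_socket * per_core
    if scheme == "thread":  # "logical processor", so HT threads count
        return sockets * cores_per_socket * threads_per_core * per_core
    raise ValueError(f"unknown scheme: {scheme}")

# Hypothetical 2-socket boxes: 6-core with HT vs. 12-core without HT.
intel = license_cost("core", sockets=2, cores_per_socket=6, threads_per_core=2)
amd   = license_cost("core", sockets=2, cores_per_socket=12, threads_per_core=1)
print(intel, amd)  # by-core fee doubles for the higher-core-count part
```

Note that under by-socket pricing both boxes pay the same, and under by-thread pricing 12 threads equals 12 threads, so the fees also match; only by-core counting penalizes the part with more physical cores.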
 

Martimus

Diamond Member
Apr 24, 2007
4,490
157
106
My understanding (always questionable :D) is the exact opposite: the end result is increased performance and higher density due to reduction in leakage.

How 'hybrid' layers impact the overall chip, I'm clueless. I do take a bit of solace however from the apparent gains at EO 45nm (without knowing how it may 'drop down' to 32nm and 'hybrid' with gate-first). As typical with AMD, it's baby-steps and refine along the way.
For a good overview of gate-first versus gate-last HKMG implementation, here is a good thread to read: http://forums.anandtech.com/showthread.php?t=2031334

Basically, gate-last integration enables you to customize your gate-stack materials specifically for the transistor type (NMOS or PMOS), which enables you to optimize leakage and switching speeds for each transistor independently of the other.

With gate-first integration you must use the exact same HK gate oxide as well as the same gate-stack for both types of transistors...which means you end up selecting gate materials that aren't ideal or optimal for either.

The gate-first implementation is both simpler and cheaper, but it will not perform as well as the gate-last approach.

What's been missed in the "Llano/1H11" discussion is the Zambezi chip (which I feel folks essentially 'booted' into Q4 '10 - I among them - LOL) and the 'bump' given to Ontario.

desktoproadmap.jpg

(from Anand - 11/11/2009)

As I understand it, Zambezi is BD without the GPU on-die. While folks have fixated on Llano (which is 'Stars' plus a GPU), Zambezi is the first spin, or 'proof of concept', which will lead to the actual 'Fusion' (with GPU on-die) in 2012 --- Llano is simply the mid-point, or mini-step, in the process (like Clarksdale, with maybe a little AVX thrown in for good measure).

Did I get this right (and did it make sense) ??
--

While I would prefer to stay on the topic of the actual GF process, I can say that your belief that Llano is an MCM GPU and CPU is not likely correct. It has been shown as an integrated on-die GPU combined with a modified 'Stars' CPU. It is not MCM like Clarksdale, although it likely is a "proof of concept" as you say. (Although with the GF 32nm problems, it will likely be the second 'Fusion' processor family released by AMD, the first actually being built on 40nm by TSMC using Bobcat cores.)

Right now, there is no Bulldozer model on the roadmap that will have a GPU on-die component. That doesn't mean that we won't see one circa 2012-2013, but right now it has not been announced.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
For a good overview of gate-first versus gate-last HKMG implementation, here is a good thread to read: http://forums.anandtech.com/showthread.php?t=2031334

Unfortunately a lot of the links in that thread have since gone dead as the publishers closed shop this spring.

If there were any questions left unanswered because of the dead links I would be happy to take a stab at answering them.

GloFo's 32nm HKMG on SOI implementation, the version tailored for AMD's products, does have two significant advantages over Intel's 32nm HKMG node...one obvious advantage is SOI. This "builds in" an intrinsic leakage reduction advantage which we all know and love since the day AMD started using it.

The second advantage, which doubles as a dubious honor, is that the engineering team will have had an extra year (plus a little more) to tweak and optimize the integration before it was locked down and sent to Dresden for production ramp and qual.

It is the same sort of benefit that made the difference in AMD being able to use immersion litho at 45nm, versus Intel's situation, where their production timeline was simply too aggressive to rely on immersion litho, so they had to opt for double patterning.

Right now, there is no Bulldozer model on the roadmap that will have a GPU on-die component. That doesn't mean that we won't see one circa 2012-2013, but right now it has not been announced.

This little factoid is rather interesting, isn't it? For all the "future is fusion" mantra AMD inundates their marketing slides with, all they talk about in enterprise/server (the big-margin product space) is Bulldozer and an utter void of APU/Fusion SKUs. Is Fusion the future, or is Bulldozer the future? And if Fusion is the future, then why bother buying today's products, or next year's Bulldozer products? It is an odd marketing strategy, IMO.

Imagine if the ATI division had a marketing mantra "the future is holodeck" but then went on to say "holodeck products will not be available for the next 5-10 years; in the meantime please purchase the already obsolete products we are selling - they are obsolete because they can't create a holodeck, and we've already told you the future is holodeck".

I just LOL every time I see that "future is fusion" moniker on an AMD marketing slide where they are trying to tout the benefits of buying a non-Fusion product. I may be the only one who sees the humor in this, though.
 

JFAMD

Senior member
May 16, 2009
565
0
0
The reason that we talk more about servers is that the enterprise is a long lifecycle sale that depends much more on technology decisions that set the pace for the next 3-4 years of purchases. The decisions you make in 2011 about the servers you deploy could impact hundreds or thousands of units through 2014 or so.

In the client world it is much more transactional, so there is not as much that is shared that far out.

It's just a different model.
 

GaiaHunter

Diamond Member
Jul 13, 2008
3,700
406
126
This little factoid is rather interesting, isn't it? For all the "future is fusion" mantra AMD inundates their marketing slides with, all they talk about in enterprise/server (the big-margin product space) is Bulldozer and an utter void of APU/Fusion SKUs. Is Fusion the future, or is Bulldozer the future? And if Fusion is the future, then why bother buying today's products, or next year's Bulldozer products? It is an odd marketing strategy, IMO.

Clearly the target markets are different - if I'm trying to sell vehicles to the military I might wish to invest in armor able to resist enemy fire, ability to negotiate different landscapes, durability, etc, while if I'm trying to sell a vehicle to a civilian I might wish to focus on other aspects, like fuel efficiency, design, comfort, etc.

Considering the different markets and their different needs, selling more capability than is needed (capability that goes unused) might end with your product being non-competitive.

As the market needs evolve and as the technical aspects of production become more efficient, maybe we will see more fusion products targeted at different markets.

For example, the reason why AMD can't put a Phenom II + 5870 on a single die and charge the same as a Phenom II is clearly both technical and financial. A product like this would kill a big chunk of Intel's and NVIDIA's markets.
 

DrMrLordX

Lifer
Apr 27, 2000
22,932
13,015
136
My understanding is that Fusion was never meant to be exclusively about integration of the GPU into the CPU die. HyperTransport (by way of the HTX slot) was supposed to offer a cheap and ubiquitous interconnect to allow the integration of GPUs into normally CPU-centric computations without the additional latency of an expansion bus (PCI-e, or what have you).

The other possibility was that AMD would offer GPUs with HT links built-in that were pin-compatible with sockets on multi-socket boards that would drop right into the board, taking the spot that an Opteron might normally take. I think that idea sort of bit the dust when platforms like Quad Father died on the vine.

HT 3.0 supports the HTX standard, and some server/workstation boards have HTX slots, but they are rather under-utilized at the moment.

We see little, if anything, involving HTX integration into mainstream or enthusiast platforms on their roadmaps, however . . . meaning that Fusion on Zambezi systems almost certainly involves Zambezi + AMD GPUs connected by way of PCI-e.
 

JFAMD

Senior member
May 16, 2009
565
0
0
Fusion is the merging of the compute model, bringing CPU and GPU together into a unified compute model that allows for both types of processors to divide out the work and deliver the best performance and efficiency.

Putting CPU and GPU into a single piece of silicon is called an APU, which is a tactical way to address fusion. But fusion is not about only a single die, it is about the larger compute model.

We have fusion today on servers based on our FireStream cards. We will continue to have fusion for servers moving forward.

Here are some thoughts on this:

http://blogs.amd.com/work/2010/06/10/fusion-for-servers/
http://blogs.amd.com/work/2010/07/01/fusion-for-servers-happening-today/
 

DrMrLordX

Lifer
Apr 27, 2000
22,932
13,015
136
Putting CPU and GPU into a single piece of silicon is called an APU, which is a tactical way to address fusion. But fusion is not about only a single die, it is about the larger compute model.

Yeah, exactly.

We have fusion today on servers based on our FireStream cards. We will continue to have fusion for servers moving forward.

What kind of plans has AMD made to move FireStream away from PCI-e and towards HTr/HTX?

Also, one thing regarding Fusion that I am curious about is the effect of lumping APU-based Fusion (a relatively low-latency connection between the stream processors and x86 cores, combined with high system-memory latency) into the same basket as GPU-based Fusion (a high-latency connection between the CPU and a PCI-e-based GPU, but low on-card memory latency).
 

JFAMD

Senior member
May 16, 2009
565
0
0
Can't speak to those issues, you are starting to get into future roadmap areas that we have not disclosed in public.
 

borisvodofsky

Diamond Member
Feb 12, 2010
3,606
0
0
All this says is Money out of Arab's pockets.. whoooo hoooo.,

All this says is that you have a terrible time keeping to technical matters on a technical forum. Leave your sociopolitical opinions at home.
-ViRGE
 
Last edited by a moderator:

heyheybooboo

Diamond Member
Jun 29, 2007
6,278
0
0
Thanks for those fancy schematics, IDC, in the previous gate thread. I tried to run down some of the 'dead' UMC links and failed for the most part.

The only thing I found was this news blurb from December 9th, 2009.

UMC, a leading global semiconductor foundry, presented today at the 2009 International Electron Device Meeting (IEDM) held here, a unique 'hybrid' high-k/metal-gate (HK/MG) technology approach for 28nm. The method combines the benefits of 'gate-first' process strength for nMOS with 'gate-last' features for pMOS to realize up to 30% enhanced transistor performance compared to a gate-first only process.

"The spirit of innovation is always a key factor when developing advanced technologies," said Mr. S.C. Chien, vice president of Advanced Technology Development at UMC. "This published work demonstrates UMC's ability to conceive and develop alternative solution paths, leveraging in-depth learning of existing HK/MG process options to respond to today's rising demand for cutting-edge products and applications."

There are two different HK/MG integration schemes co-existing in the industry, gate-first and gate-last. For gate-first, the HK/MG is inserted before the gate is patterned (formed). For 'gate-last' or 'replacement metal gate', MG is 'filled in' after a polysilicon dummy gate is formed and then removed. In addition to the newly proposed hybrid approach, UMC has been developing gate-last HK/MG technology since gate-last has been proven in high volume within the industry for CPU manufacturing. The company's advanced technology development efforts are taking place at UMC's 300mm Fab and R&D complex in Tainan, Taiwan.


AFAIK, the 'AMD-GloFo-UMC' rumors were squashed a few months ago, and the first Fusion APUs will be TSMC, but UMC 40nm sure does look like it fits with the SoC 'program'.


...

I can say that your belief that Llano is an MCM GPU and CPU is not likely correct. It has been shown as an integrated on-die GPU combined with a modified 'Stars' CPU. It is not MCM like Clarksdale, although it likely is a "proof of concept" as you say.

Right now, there is no Bulldozer model on the roadmap that will have a GPU on-die component. That doesn't mean that we won't see one circa 2012-2013, but right now it has not been announced.

Sorry for not being very clear in my ramblings. I only meant to suggest that 'Llano<-->Clarksdale' was an intermediate step for AMD and Intel, respectively, in the process leading to AVX (not necessarily the 'die' integration of the GPU). In other words, Clarksdale will beget SB, and Llano will beget the 'true' (can I use that word?) Fusion while possibly using limited AVX instructions.



