ATI Northern Islands (6xxx) also at 40nm?


SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
The 5890 will have to be out by the summer for it to matter at all, not a chance in hell anything less than 40nm will be ready by then.

I think the most likely situation is the 6000 series coming out at 40nm, with part or all of it shrinking to 32/28nm a bit later, and the 7000 being the first to leave 40nm behind. They 'might' delay the 6000 a tad and give us some 5790-like card at the new node first, but given how they seem to love talking about keeping to the schedule, that seems less likely.

I wouldn't be shocked if we see a 5890 (or different name) announced sometime around the end of March. I doubt it would be drastically different than the 5870. As we've seen from the 5870 already, with a bit of voltage 1GHz is very achievable. I'm guessing we'll see some tweaks and improvements on the board to support more power/clock speed.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
On the same process, and safely assuming 6x00 series outperforms the 5x00 series tit for tat (6850 > 5850) ATI is looking at a bigger chip. Unless the strategy is to just be smaller than the competition, won't this go against their small die strategy? And even in that case, the size increase from the 4870 to 5870 even with a die shrink was still significant.

I'll be curious to see how ATI tries to maintain their small die strategy, with a next gen architecture on the same process node as their current gen architecture.
 

Lonbjerg

Diamond Member
Dec 6, 2009
4,419
0
0
Well, here's an easy question to answer.
AMD has a ~2 billion transistor chip in the HD58xx cards.

NV are making a ~3billion transistor chip called Fermi on the same process as that ~2bn transistor chip. It uses more power.

That means AMD have a theoretical 50% increase in transistors to play with. However will they manage to improve performance?! They can also increase power consumption for a single chip.

If you totally disregard the fact that cache (a lot of it in Fermi) is made up of a lot of transistors packed in a tight space...compared to e.g. a MUL, which consists of a few transistors on a larger area...oh wait...

I guess reality didn't like your very broad (and flawed) generalization :hmm:
 

lifeblood

Senior member
Oct 17, 2001
999
88
91
Is ATI really going to bypass 32nm? I know TSMC had dropped 32nm, but has GF? Looking at some slides which talked about laptop GPUs, they stated 32nm would be used. Are there any whispers about it?

If we consider timeliness, then it would work. ATI could repeat its previous winning strategy: bring out the Evergreen refresh on 40nm, except for one part which would be on 32nm. Then Northern Islands would come out completely on 32nm.

This would probably be easier than the migration to 40nm was, as they already have products and engineers who have worked with 32nm on the CPU side. The only question is whether GF will have the capacity to produce both CPUs and GPUs.

The need to shrink the die is there. Evergreen is good but far from perfect. There are lots of things they could add that will increase the power requirements and die size, sideport memory for example, so moving to a smaller process is necessary. And even if Fermi is a bomb, ATI cannot rest on its laurels. AMD rested on its laurels with K8 and allowed Intel to come back and quickly regain lost ground.
 
Last edited:

Lonbjerg

Diamond Member
Dec 6, 2009
4,419
0
0
From what I gather there is a lot of PR about 28nm and 32nm processes (a steep climb after Intel started 32nm bulk production) so that they (the other fabs) don't appear tooooooo far behind.

But I want to see them hit 32nm and 28nm before I take their word for it; some fuzzy PR is not going to cut it.
Intel, on the other hand, I have no doubt will hit 22nm on schedule; their execution since tick-tock has been in a league of its own.
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Is ATI really going to bypass 32nm? I know TSMC had dropped 32nm, but has GF? Looking at some slides which talked about laptop GPUs, they stated 32nm would be used. Are there any whispers about it?

GF is working on both 32nm and 28nm. The slides you saw for a laptop GPU were probably about Llano, which is an APU (a CPU+GPU), and AMD will produce that on 32nm because their future "CPUs" will be produced on this process.

If we consider timeliness, then it would work. ATI could repeat its previous winning strategy: bring out the Evergreen refresh on 40nm, except for one part which would be on 32nm. Then Northern Islands would come out completely on 32nm.

32nm is a full node, and ATI has been using half-node processes for years now. For example, when they moved from 80nm they skipped 65nm (full node) and went to 55nm. Then they skipped 45nm, which is a full node, and went to 40nm. This is why I think their next process shrink will be to the 28nm half node and not 32nm. NVidia has been using half nodes for their GPUs for a while too.

The need to shrink the die is there. Evergreen is good but far from perfect. There are lots of things they could add that will increase the power requirements and die size, sideport memory for example, so moving to a smaller process is necessary.

Sideport memory is added to the motherboard, not the GPU itself, so no die size increase there. Only IGPs need sideport memory, because discrete cards have their own (faster) video memory.
 

yh125d

Diamond Member
Dec 23, 2006
6,886
0
76
GF is working on both 32nm and 28nm. The slides you saw for a laptop GPU were probably about Llano, which is an APU (a CPU+GPU), and AMD will produce that on 32nm because their future "CPUs" will be produced on this process.

32nm is a full node, and ATI has been using half-node processes for years now. For example, when they moved from 80nm they skipped 65nm (full node) and went to 55nm. Then they skipped 45nm, which is a full node, and went to 40nm. This is why I think their next process shrink will be to the 28nm half node and not 32nm. NVidia has been using half nodes for their GPUs for a while too.

Sideport memory is added to the motherboard, not the GPU itself, so no die size increase there. Only IGPs need sideport memory, because discrete cards have their own (faster) video memory.

Read the RV870 article, this is a different type of sideport
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Read the RV870 article, this is a different type of sideport

I thought he might have been talking about the sideport feature that was removed from the 4xxx series. But he said "sideport memory", that's where the confusion came from :)
 

Lonyo

Lifer
Aug 10, 2002
21,938
6
81
If you totally disregard the fact that cache (a lot of it in Fermi) is made up of a lot of transistors packed in a tight space...compared to e.g. a MUL, which consists of a few transistors on a larger area...oh wait...

I guess reality didn't like your very broad (and flawed) generalization :hmm:

Well, since Fermi is rumoured to be >500mm^2, and Cypress is 334mm^2 (according to AT), I make that around 50% more area for Fermi and 50% more transistors, despite them being so tightly packed, as you say.

If AT is wrong (some sites seem to say 384mm^2) then that's a 38% bigger die for Fermi (assuming the 530mm^2 that some sites rumour), and the actual difference in transistor counts is ~39% (since Cypress is over 2 billion).
Transistor density between the two seems fairly consistent, and AMD may even have them more tightly packed.

So they have 50% more transistors, or 50% more die space before they equal NV's design.
Or it might be 40/40.
So, what was your point?
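
A rough sketch of the arithmetic above, for anyone who wants to check it. The die areas are the rumoured figures quoted in this post. The transistor counts are assumptions: 2.15 billion stands in for "over 2 billion" on Cypress and 3.0 billion for "~3 billion" on Fermi. None of this is confirmed data.

```python
# Rough check of the die-area and transistor-count comparison.
# All inputs are rumoured or approximate figures, not confirmed specs.


def percent_more(a, b):
    """Return how much larger a is than b, in percent."""
    return (a / b - 1.0) * 100.0


def density_m_per_mm2(transistors, area_mm2):
    """Return transistor density in millions of transistors per mm^2."""
    return transistors / area_mm2 / 1e6


CYPRESS_TRANSISTORS = 2.15e9  # assumption for "over 2 billion"
FERMI_TRANSISTORS = 3.0e9     # assumption for "~3 billion"

# (Cypress area, Fermi area) in mm^2 for the two cases discussed.
scenarios = {
    "AT figure for Cypress, Fermi at 500": (334.0, 500.0),
    "other sites for Cypress, Fermi at 530": (384.0, 530.0),
}

transistor_gap = percent_more(FERMI_TRANSISTORS, CYPRESS_TRANSISTORS)
print(f"Fermi transistor count vs Cypress: +{transistor_gap:.1f}%")

for name, (cypress_area, fermi_area) in scenarios.items():
    area_gap = percent_more(fermi_area, cypress_area)
    cypress_density = density_m_per_mm2(CYPRESS_TRANSISTORS, cypress_area)
    fermi_density = density_m_per_mm2(FERMI_TRANSISTORS, fermi_area)
    print(f"{name}: Fermi area +{area_gap:.1f}%, "
          f"density {cypress_density:.2f} vs {fermi_density:.2f} M/mm^2")
```

Under these assumptions the two densities come out close in the second case, and Cypress comes out denser in the first, which is all the headroom argument needs.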

Also why are you even commenting? The point was that NV are making a larger die with more transistors than AMD on the same process, which means AMD have some headroom to work with.
This thread is about a future product being made on the same process as the current one.
My point was that AMD have headroom because we can already see a larger die being made on the same process. It doesn't matter the specifics of that die, for the most part, just that it's being made.

Basically, the fact that NV are already making a larger die means it should be possible for AMD to make a larger die too, thus giving them headroom to improve performance through features (rather than just clocks) despite still being on 40nm.
It has nothing to do with how NV have designed their chip or how much exact room they have, just that they could make a bigger die, quite substantially so in % terms, than they currently have, despite being on the same process.
 
Last edited:

StrangerGuy

Diamond Member
May 9, 2004
8,443
124
106
Who here thinks TSMC 28nm is just unrealistic? They had so many problems with 40nm, and consider that Intel hasn't even shifted most of its own supply to 32nm yet.
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
@Lonyo, you are totally right, but some people just don't get it. It is much easier to produce a larger die on a tested and mature process, so ATI could decide to do that for various reasons: faster time to market, a delayed or problematic 32nm/28nm process, etc.

Who here thinks TSMC 28nm is just unrealistic? They had so many problems with 40nm, and consider that Intel hasn't even shifted most of its own supply to 32nm yet.

TSMC 28nm for this year is extremely unrealistic. Maybe Q2 2011 at the earliest.
 
Last edited:

Lonyo

Lifer
Aug 10, 2002
21,938
6
81
Who here thinks TSMC 28nm is just unrealistic? They had so many problems with 40nm, and consider that Intel hasn't even shifted most of its own supply to 32nm yet.

http://www.xbitlabs.com/news/cpu/di...e_Nehalem_Micro_Architecture_by_Year_End.html
in Q4 2010 about 35% of desktop Intel CPUs will be manufactured using 32nm fabrication process, whereas 65% of desktop chips will be made at 45nm node.

Hasn't and still won't have by the end of the year.
 

Munky

Diamond Member
Feb 5, 2005
9,372
0
76
Maybe ATI wants better tessellation performance?

GPGPU?

Tessellation performance is a non-issue with current games, and I don't see it being a problem in the near future. I would also blame their poor GPGPU adoption on inferior development tools, not necessarily inferior hardware.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
For Northern Islands, ATI could insert whatever they had to cut back (cache?), tweak the architecture a bit and/or add a bit more SP's/TMU's (2200SP?/100TMU?). The end result would be a GPU that is ~40% faster than Cypress, the same performance difference between 4890--->5870. And the size of such a GPU @40nm could be around ~460mm^2.

When you say Cypress is only 40% faster than HD4890, you mean FPS, right? (i.e., not the actual speed of the video card alone)

How would adding another 600 stream processors and TMUs increase FPS by another 40%? Do you think TMUs are the bottleneck?
 

Lonbjerg

Diamond Member
Dec 6, 2009
4,419
0
0
Well, since Fermi is rumoured to be >500mm^2, and Cypress is 334mm^2 (according to AT), I make that around 50% more area for Fermi and 50% more transistors, despite them being so tightly packed, as you say.

If AT is wrong (some sites seem to say 384mm^2) then that's a 38% bigger die for Fermi (assuming the 530mm^2 that some sites rumour), and the actual difference in transistor counts is ~39% (since Cypress is over 2 billion).
Transistor density between the two seems fairly consistent, and AMD may even have them more tightly packed.

So they have 50% more transistors, or 50% more die space before they equal NV's design.
Or it might be 40/40.
So, what was your point?

That your oversimplified "explanation" has no real value...too many unknowns...but if it makes you happy...go ahead.
Just don't present it in any way as something to be taken as fact.


Also why are you even commenting? The point was that NV are making a larger die with more transistors than AMD on the same process, which means AMD have some headroom to work with.
This thread is about a future product being made on the same process as the current one.
My point was that AMD have headroom because we can already see a larger die being made on the same process. It doesn't matter the specifics of that die, for the most part, just that it's being made.

Again, transistor count is less important than the type of transistors and the architecture...shall I repeat myself...or does it stick this time?


Basically, the fact that NV are already making a larger die means it should be possible for AMD to make a larger die too, thus giving them headroom to improve performance through features (rather than just clocks) despite still being on 40nm.
It has nothing to do with how NV have designed their chip or how much exact room they have, just that they could make a bigger die, quite substantially so in % terms, than they currently have, despite being on the same process.

Keep oversimplifying...it's fun reading...but it doesn't hold any real value.

A real-world example would be the i7 vs. Phenom II...both 45nm CPUs
i7: 731 million transistors and 263 mm^2
Phenom II: 758 million transistors and 258 mm^2

In your "world" they should perform the same, hit the same GHz...and have the same TDP.

See it now?
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
When you say Cypress is only 40% faster than HD4890 do you mean FPS right? (ie, not the actual speed of the video card alone)

Yeah, Cypress performed about 40% faster on average compared to RV790 in games.

How would adding another 600 stream processors and TMUs increases FPS by another 40%? Do you think TMUs are the bottleneck?

I've made a mistake there, it should have been 400 more SP's. So you ask, how can a 25% increase in power, @ 2000SP/100TMU, give a 40% boost?

If for example, we take Juniper (5770), it has 800SP/40TMU, and compare it with a 4870, also 800SP/40TMU, and set the clock speeds and memory bandwidth to be the same, we'll notice that the older 4870 series card is actually 15-20% faster. ATI cut back certain things to get the die sizes of their 5xxx series GPU's down, but in the process lost about 15-20% performance there.

So if for Northern Islands ATI simply puts back what they cut out from the 5xxx cards, they would gain 15%, then add the 25% gain from 400 more SP's and 20 more TMU's, you get ~40% total increase. Of course I'm just giving an example here, but this would not be very hard for ATI to do. And such a GPU would be big (by ATI standards) but not be much larger than 450mm^2.
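
A quick sketch of how those two gains combine. The 15% and 25% figures are the estimates from this post, and the 1600SP baseline for Cypress is an assumption implied by "400 more SP's" reaching 2000. It also assumes performance scales linearly with unit count, which is optimistic.

```python
# Combine two independent performance gains.
# Inputs are this thread's estimates, not measurements.


def additive_gain(gains):
    """Sum the fractional gains (the back-of-envelope method)."""
    return sum(gains)


def compounded_gain(gains):
    """Multiply the gains together, since each applies on top of the last."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total - 1.0


BASE_SP = 1600    # assumed Cypress stream processor count
TARGET_SP = 2000  # hypothetical Northern Islands count from the post

restored_gain = 0.15                       # putting back what was cut
extra_unit_gain = TARGET_SP / BASE_SP - 1  # 25% more SPs, linear scaling assumed

gains = [restored_gain, extra_unit_gain]
print(f"additive:   +{additive_gain(gains) * 100:.1f}%")
print(f"compounded: +{compounded_gain(gains) * 100:.1f}%")
```

Added together the gains give the ~40% quoted; compounded they come out slightly higher, so ~40% is on the conservative side if both estimates hold.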
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
I've made a mistake there, it should have been 400 more SP's. So you ask, how can a 25% increase in power, @ 2000SP/100TMU, give a 40% boost?

If for example, we take Juniper (5770), it has 800SP/40TMU, and compare it with a 4870, also 800SP/40TMU, and set the clock speeds and memory bandwidth to be the same, we'll notice that the older 4870 series card is actually 15-20% faster. ATI cut back certain things to get the die sizes of their 5xxx series cards down, but in the process lost about 15-20% performance there.

The major difference I notice between HD4870 and HD5770 is memory bandwidth.

Why couldn't ATI just add 7 Gbps GDDR5 memory to the high-end HD5870s? That would increase bandwidth 40% and close the gap without having to increase die size (or redesign the Cypress core).
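
For reference, a small sketch of the bandwidth arithmetic. The bus widths and data rates below are assumptions taken from the commonly published card specs, not from this thread.

```python
# Peak memory bandwidth = (bus width in bytes) * (per-pin data rate).
# Card specs are assumed from published spec sheets, not from this thread.


def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Return peak memory bandwidth in GB/s."""
    return bus_width_bits / 8 * data_rate_gbps


cards = {
    "HD4870": (256, 3.6),  # assumed: 256-bit bus, 3.6 Gbps GDDR5
    "HD5770": (128, 4.8),  # assumed: 128-bit bus, 4.8 Gbps GDDR5
    "HD5870": (256, 4.8),  # assumed: 256-bit bus, 4.8 Gbps GDDR5
}

for name, (bus, rate) in cards.items():
    print(f"{name}: {bandwidth_gb_s(bus, rate):.1f} GB/s")

# Hypothetical HD5870 with 7 Gbps memory on the same 256-bit bus.
stock = bandwidth_gb_s(256, 4.8)
faster = bandwidth_gb_s(256, 7.0)
print(f"7 Gbps on HD5870: {faster:.1f} GB/s (+{(faster / stock - 1) * 100:.0f}%)")
```

With these assumed numbers the HD5770 has about two thirds of the HD4870's bandwidth. The uplift from 7 Gbps memory lands a little above 40% against a 4.8 Gbps baseline, and at exactly 40% against a 5 Gbps one.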
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
So if for Northern Islands ATI simply puts back what they cut out from the 5xxx cards, they would gain 15%, then add the 25% gain from 400 more SP's and 20 more TMU's, you get ~40% total increase. Of course I'm just giving an example here, but this would not be very hard for ATI to do. And such a GPU would be big (by ATI standards) but not be much larger than 450mm^2.

It's so easy, even a caveman can do it!
 

HurleyBird

Platinum Member
Apr 22, 2003
2,818
1,553
136
I don't think NI will be on 40nm, it just doesn't fit ATI's profile of aggressively going after the latest manufacturing processes. It also seems to me that Fudzilla has one of the worst track records for a rumor site. I can see 32nm SOI process being doable since they've done much of the leg work with the Llano APU. GF 28nm process might also be possible but less likely in my mind.
 

MrK6

Diamond Member
Aug 9, 2004
4,458
4
81
The major difference I notice between HD4870 and HD5770 is memory bandwidth.

Why couldn't ATI just add 7 Gbps GDDR5 memory to the high-end HD5870s? That would increase bandwidth 40% and close the gap without having to increase die size (or redesign the Cypress core).

The limit isn't the memory frequency, I can tell you that. Raising the frequency, especially in demanding applications, helps very little. As a rough estimate, I'd say there's approximately a 10-20% return on performance (i.e., 10% increase in frequency nets only 1-2% higher framerate).
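
To put that estimate in numbers, here is a minimal sketch. It assumes the 10-20% return stays constant for larger memory clock bumps, which is an assumption for illustration rather than something measured.

```python
# Expected framerate gain from a memory clock increase, given a "return" factor.
# The 10-20% return is the rough estimate from this post, assumed constant.


def expected_fps_gain(clock_gain, scaling_return):
    """Return the fractional FPS gain for a fractional memory clock gain."""
    return clock_gain * scaling_return


for clock_gain in (0.10, 0.40):
    low = expected_fps_gain(clock_gain, 0.10)
    high = expected_fps_gain(clock_gain, 0.20)
    print(f"+{clock_gain * 100:.0f}% memory clock -> "
          f"+{low * 100:.0f}% to +{high * 100:.0f}% FPS")
```

If the estimate holds, even the 40% bandwidth increase suggested in the quoted post would be worth only a single-digit percentage gain in framerate.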
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
I don't think NI will be on 40nm, it just doesn't fit ATI's profile of aggressively going after the latest manufacturing processes.

1. As far as I recall reading, HD6000 was developed concurrently with HD5000. Therefore, it may be incorrect to assume that HD6000 is just a continuation of the architecture applied in HD5000. If the architecture is improved, even at 40nm process, you can gain significant performance increases.

2. It is a lot less risky to manufacture HD6000 on 40nm process since 28/32nm process will not be available in large volumes in 2010. Therefore, it again may make sense to introduce a new architecture at 40nm for AMD (but of course they have been working on it for a long time now).

3. The other possibility is that HD6000 will not be released this year; and will only follow in Q1 2011 on the smaller node process.
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Keep oversimplifying...it's fun reading...but it doesn't hold any real value.

A real-world example would be the i7 vs. Phenom II...both 45nm CPUs
i7: 731 million transistors and 263 mm^2
Phenom II: 758 million transistors and 258 mm^2

In your "world" they should perform the same, hit the same GHz...and have the same TDP.

See it now?

Last I checked, Phenom II was manufactured at GlobalFoundries, not at Intel, using a 45nm SOI process. Intel's i7 chips are made using a 45nm HKMG process. Different manufacturers, with contrasting technologies.

Fermi and Cypress are both made at TSMC, sharing many characteristics of the process itself. So I'd say, the transistors in these GPUs are much more directly comparable than the ones in Intel and AMD CPUs. Even though Fermi and Cypress are completely different architectures.
 

Lonyo

Lifer
Aug 10, 2002
21,938
6
81
That your oversimplified "explanation" has no real value...too many unknowns...but if it makes you happy...go ahead.
Just don't present it in any way as something to be taken as fact.

Again, transistor count is less important than the type of transistors and the architecture...shall I repeat myself...or does it stick this time?

Keep oversimplifying...it's fun reading...but it doesn't hold any real value.

A real-world example would be the i7 vs. Phenom II...both 45nm CPUs
i7: 731 million transistors and 263 mm^2
Phenom II: 758 million transistors and 258 mm^2

In your "world" they should perform the same, hit the same GHz...and have the same TDP.

See it now?

Of course I am oversimplifying, because IT DOESN'T MATTER.
The process has the opportunity for more headroom in terms of die size and transistor count. That means AMD can make a bigger GPU on the same process.
End of.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
The limit isn't the memory frequency, I can tell you that. Raising the frequency, especially in demanding applications, helps very little. As a rough estimate, I'd say there's approximately a 10-20% return on performance (i.e., 10% increase in frequency nets only 1-2% higher framerate).

The major difference between HD5xxx and HD4xxx is that HD5xxx has some type of error-correcting memory.

Apparently instead of the memory clock failing it just keeps on retransmitting signals. This allows the user to increase the memory frequency even more, yet little performance is actually gained. I think the first Anandtech article on Cypress talks about this.

EDIT: Here is the article -->http://www.anandtech.com/video/showdoc.aspx?i=3643&p=12

Therefore using your example we cannot say memory speed isn't a limiting factor.
 
Last edited:

Daedalus685

Golden Member
Nov 12, 2009
1,386
1
0
The major difference between HD5xxx and HD4xxx is that HD5xxx has some type of error-correcting memory.

Apparently instead of the memory clock failing it just keeps on retransmitting signals. This allows the user to increase the memory frequency even more, yet little performance is actually gained. I think the first Anandtech article on Cypress talks about this.

EDIT: Here is the article -->http://www.anandtech.com/video/showdoc.aspx?i=3643&p=12

Therefore using your example we cannot say memory speed isn't a limiting factor.

We can't know for sure, but that should only matter when overclocking.

Anand showed with the 5450 review that, clock for clock, the new architecture is slower than the previous one. I would have a very hard time believing the 5450 is worse than the 4550 because of the memory error checking. That should only be noticed when the card is overheating and so on; if it makes a difference at stock, there are bigger problems with the chips.

I think the most likely situation is that the new shaders are just not as fast as the old ones. With all the investigation around, such as BFG's, memory throughput seems not to be the limiting factor (though it might be a factor in some cases).