ATI 5870 specs?

TC91

Golden Member
Jul 9, 2007
1,164
0
0
I really really think the RV870 is going to have 2000 shaders, since if there was only 1200, it probably would be R600 vs. G80 all over again.
 

thilanliyan

Lifer
Jun 21, 2005
11,871
2,076
126
Well they didn't just increase shaders...they increased ROPs (and TMUs?) as well I think. Twice as fast as a 4890 would be a good target but I don't know if they'll get that far.
 

dguy6789

Diamond Member
Dec 9, 2002
8,558
3
76
Originally posted by: TC91
I really really think the RV870 is going to have 2000 shaders, since if there was only 1200, it probably would be R600 vs. G80 all over again.

Cards haven't doubled their engine resources every generation. That's actually pretty rare these days. The GT200 series does not have twice the complexity of the G80. The GeForce 7800 and 7900 cards had 24 pixel pipelines, up from 16, and 8 vertex shaders, up from 6, coupled with an increase in clock speed to reach their performance goals. The memory bus was still 256 bit and ROPs were still 16, just like the 6800s.
 

Keysplayr

Elite Member
Jan 16, 2003
21,209
50
91
I can almost guarantee that these specs are intended to mislead. Remember when the R7xx specs leaked? 480 SPs? It turned up with 800.

R8xx will show up with no less than 1600sp's.

The R6xx to R7xx transition more than doubled the count, from 320 to 800. 1200 SPs doesn't make too much sense. And how saturated were the 16 ROPs of the R7xx series? Will doubling this to 32 double performance? I dunno. It could, I guess, but it can also be a case of diminishing returns.

Broken Record time: "We'll see soon enough". :::cringe::: :D
 

dguy6789

Diamond Member
Dec 9, 2002
8,558
3
76
ATI has been stuck on 16 raster operators for a long time now. Maybe it's due for an increase? Radeon 8500 had 4, 9700 Pro had 8, X800 had 16, X1900 had 16, HD2900 had 16, HD3870 had 16, and the 4870 has 16.
 

BFG10K

Lifer
Aug 14, 2000
22,709
2,971
126
Regarding shaders and ROPs, if they increase their work per clock then they don't necessarily need more of them.
 

bunnyfubbles

Lifer
Sep 3, 2001
12,248
3
0
Originally posted by: deimos3428
I'm putting my money on 2048 SPs. It's just a nice round number.

So is 2000. In fact, we went from 320 to 800, a factor of 2.5x, and 2.5 times 800 would be 2000...

Although, like BFG10K said, there might not be a need for so many shaders if other areas of the GPU are improved and made more efficient.
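
The extrapolation above is just the previous generation's growth factor applied once more. A minimal sketch of that arithmetic, using the 320 and 800 shader counts quoted in this thread (the function name is made up for illustration, and nothing here says the next chip will actually follow the same factor):

```python
def extrapolate_shaders(prev_gen: int, current_gen: int) -> float:
    """Apply the prev -> current growth factor once more to the current count."""
    growth = current_gen / prev_gen
    return current_gen * growth


# Shader counts as quoted in the thread: 320 for R6xx, 800 for R7xx.
print(800 / 320)                      # growth factor between the two generations
print(extrapolate_shaders(320, 800))  # same factor applied to 800
print(1200 / 800)                     # growth factor implied by the rumored 1200 SPs
```

By the same measure the rumored 1200 SPs would be only a 1.5x step, which is why the number looks low next to the last transition.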
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
My original prediction (when I learned the 4890 specs) was something like that:

32ROPs/64TUs/1280 SPs (or 640 SPs at 2X the main GPU clock)

So I was essentially predicting that the TU/ROP and SP/ROP ratios would be lower than RV770's, which is not exactly a smart prediction, but anyway.

So I was predicting that the ratio will be -20%.
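
As a rough check of that -20% figure, here is a small sketch comparing the predicted 32/64/1280 layout against RV770. The RV770 unit counts (16 ROPs, 40 TUs, 800 SPs) are the ones quoted elsewhere in this thread; the helper names are hypothetical.

```python
def ratios(rops: int, tus: int, sps: int) -> tuple[float, float]:
    """Return (TU/ROP, SP/ROP) for a given unit layout."""
    return tus / rops, sps / rops


def pct_change(new: float, old: float) -> float:
    """Percentage change from old to new."""
    return (new - old) * 100 / old


rv770 = ratios(rops=16, tus=40, sps=800)       # unit counts as quoted in the thread
predicted = ratios(rops=32, tus=64, sps=1280)  # the 32/64/1280 prediction

print("TU/ROP:", rv770[0], "->", predicted[0], pct_change(predicted[0], rv770[0]))
print("SP/ROP:", rv770[1], "->", predicted[1], pct_change(predicted[1], rv770[1]))
```

Both ratios drop by the same fraction because the prediction doubles the ROPs while scaling TUs and SPs by only 1.6x.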

Nvidia did the same when they released the GT200 architecture relative to the G92b architecture. (They lowered the ratio; the SPs & the TUs are not exactly the same but they are very close.)
G92b had a nearly +20% higher SP/ROP ratio and a way higher TU/ROP ratio than GT200. (Of course, for all the above examples we are talking per MHz.)


The reason I predicted that was because I think that for some of the games of 2010 the 16 ROPs will be a problem at 1920X1200 unless they hit 900MHz-1GHz on the standard model (which is not that likely in my opinion, since this will be a new design so it would be more difficult to achieve than with the 4890; also, the 1GHz 4890 is the highest overclocked option)

and I also thought that, probably, the sooner ATI gets to 32 ROPs the better, because in the future (2010-2011) the gain from the jump to 32 ROPs will not be as high as today, since games will be limited in other areas more than they are ROP limited, compared with games now! (The resolution situation in 2010-2011 will not be different than today, I think.)

So I thought the timing was right.
The jump from 55nm to 40nm is going to be a full cycle, so compared with a half cycle it leaves more room to achieve a jump to DX11 & go to 32 ROPs. (Although I think that it will be difficult to achieve, meaning that the die will be much bigger than RV790, which launched at 250$.)

Also the jump from GDDR3 to GDDR5 in Q2 2008 brought a sudden very high increase in memory bandwidth that we will not see again (I mean the sudden increase) until GDDR6 or whatever it will be called, so another good reason why a Q3-Q4 2009 32ROPs GPU will have a good timing!

I didn't predict something like a DX11 4890x2 single GPU (32/80/1600) because the transistor count didn't add up. (At 40nm & at 300$, I mean that the die would be too large for ATI to sell it at 300$.)



But seeing that "evergreen" picture I don't think that my prediction has any chance!

 

AzN

Banned
Nov 26, 2001
4,112
2
0
If the specs are true it's going to be a strong performer, considering the core is what makes the biggest difference, given the right combination of bandwidth, at least in games right now. Something along the lines of 2x the performance of a 4870. The bandwidth could be a bit more considering the ROP and TMU counts have gone up by 2x or more, but it's definitely enough for a card with those specs. It looks like ATI is finally learning that TMUs really make a dramatic difference, as they learned with their RV670 to RV770 transition.
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Sorry I just saw the link.


So they are talking about a:

32ROPs/96 TUs/1200 SPs.

Well, it's possible, but there are some peculiar data.

1. The specs suggest that ATI will double the ROPs while in the same time increasing the TU/ROP ratio.

Why does ATI even need to do that, since for 1920X1200 they can cover any 2010 game with 32ROPs/64TUs? (just my speculation)
For 2560X1600 they have the X2 model.

Also, increasing the TUs from 64 (my scenario) to 96 (at 900MHz) means they will need even more bandwidth, so with 140.8 GB/sec even my scenario (64 TUs) will already be a little bit bandwidth limited (in the 900MHz case), though to a lesser extent than the 96 TU specs.

Also with their specs the die will be very big, I am not sure that they can hit 900MHz in the standard configuration (they say 950MHz for X2 model)

Now about the SPs: if the architecture of the SPs is the same as RV790, they will need a multiple of 96*5=480, because the SPs are tied to the TUs, so they will need 960 or 1440. (I am not sure but I think that is the case.)

If the SPs are separated from the TUs in this architecture then the 1200 number may be correct, but why change the architecture if they don't increase the SP clock like NV did?
(the specs state 900Mhz shader clock so the same as GPU clock)

So the specs suggest a huge decrease between SP/TU ratio in relation with RV790.
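
The SP/TU drop and the multiple-of-480 argument can be checked with a few lines. This is only a sketch: it takes the unit counts from the thread (RV790 at 40 TUs and 800 SPs, the rumor at 96 TUs and 1200 SPs) and treats the "SP count must be a multiple of 96*5" rule as the poster's own unverified assumption.

```python
RV790 = {"tus": 40, "sps": 800}   # counts as quoted in the thread
RUMOR = {"tus": 96, "sps": 1200}  # rumored 5870 counts


def sps_per_tu(chip: dict) -> float:
    """Shader processors available per texture unit."""
    return chip["sps"] / chip["tus"]


print(sps_per_tu(RV790))  # 20.0
print(sps_per_tu(RUMOR))  # 12.5

# Assumption from the post above: with SPs tied to TUs, the SP count
# would have to be a multiple of 96 * 5 = 480.
STEP = RUMOR["tus"] * 5
for sps in (960, 1200, 1440):
    print(sps, "fits" if sps % STEP == 0 else "does not fit")
```

Under that assumption 960 and 1440 fit and 1200 does not, which is the oddity being pointed out.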




 

Keysplayr

Elite Member
Jan 16, 2003
21,209
50
91
NV increased the TU from G80 to G92 while still maintaining the same 128sp's. I don't see why ATI couldn't do the same. I'll be very surprised if we see R8xx debut with only 1200 shaders. But then again, doubling the ROP's and TU's and adding 50% more shaders will significantly increase transistor space/die size. 1200 may be as high as they dare to go if they wish to maintain their "small die" initiative.
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Originally posted by: Keysplayr
NV increased the TU from G80 to G92 while still maintaining the same 128sp's. I don't see why ATI couldn't do the same. I'll be very surprised if we see R8xx debut with only 1200 shaders. But then again, doubling the ROP's and TU's and adding 50% more shaders will significantly increase transistor space/die size. 1200 may be as high as they dare to go if they wish to maintain their "small die" initiative.

8800GTS (G92) launched after 1 year from the release of 8800GTX (G80).
They maintained the 128SP's but the price went from 600$ (700+ for ultra) to 300$ (8800 GTS 512MB) in a year.

G92 has 64 TU and G80 has 64 TU but:
If you are gaming with good quality features turned on,
then for example if you turn on anisotropic filtering(even at 2x) the G92 drops to 32 TU but for the G80 it makes no difference.

So essentially (since we are gamers right?) the G92 has half the TU in relation with G80 and if we are talking about TU/ROP ratio the G80 has +33% in relation with G92.

The thing is, since RV790's 16 ROPs & 40 TUs can do the trick for 2008-2009 games, why can't a future DX11 GPU with 32 ROPs & 64 TUs also do the trick for 2009-2010 games? Why will ATI need to increase the texture units from 40 all the way to 96? (Also, aren't they going to be bandwidth limited with 140GB/sec? Why waste resources?)

I may be wrong I'm just asking.


 

AzN

Banned
Nov 26, 2001
4,112
2
0
Originally posted by: MODEL3
Sorry I just saw the link.


So they are talking about a:

32ROPs/96 TUs/1200 SPs.

Well, it's possible, but there are some peculiar data.

1. The specs suggest that ATI will double the ROPs while in the same time increasing the TU/ROP ratio.

Why does ATI even need to do that, since for 1920X1200 they can cover any 2010 game with 32ROPs/64TUs? (just my speculation)
For 2560X1600 they have the X2 model.

Also, increasing the TUs from 64 (my scenario) to 96 (at 900MHz) means they will need even more bandwidth, so with 140.8 GB/sec even my scenario (64 TUs) will already be a little bit bandwidth limited (in the 900MHz case), though to a lesser extent than the 96 TU specs.

Also with their specs the die will be very big, I am not sure that they can hit 900MHz in the standard configuration (they say 950MHz for X2 model)

No the die will not be big. It's actually quite acceptable considering it will be on 40nm. SP takes up most space in a die and according to these specs it's only 1200SP from 800SP. However those extra TMU will take up some space. It might be a tad bigger than 4870 but not too much.


Now about the SPs: if the architecture of the SPs is the same as RV790, they will need a multiple of 96*5=480, because the SPs are tied to the TUs, so they will need 960 or 1440. (I am not sure but I think that is the case.)

If the SPs are separated from the TUs in this architecture then the 1200 number may be correct, but why change the architecture if they don't increase the SP clock like NV did?
(the specs state 900Mhz shader clock so the same as GPU clock)

So the specs suggest a huge decrease between SP/TU ratio in relation with RV790.

For RV790 it has 4TMU for every 80SP. While these specs have more than 6TMU for every 80SP. But one thing I found out trying to figure out the SP TMU ratio is that the math does not add up. Like you said TMU and SP might be separated but I highly doubt it when you consider the math and history of these RV790 chips.

It should be 90TMU not 96. These specs seem bogus.
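
A quick sketch of that math, assuming (as the post does) that SPs come in groups of 80 and every group carries the same whole number of TMUs. The function name and the error handling are illustrative only.

```python
SPS_PER_GROUP = 80  # grouping used in the post: N TMUs for every 80 SPs


def tmu_count(total_sps: int, tmus_per_group: int) -> int:
    """Total TMUs if every 80-SP group carries the same number of TMUs."""
    groups, remainder = divmod(total_sps, SPS_PER_GROUP)
    if remainder:
        raise ValueError("SP count is not a whole number of 80-SP groups")
    return groups * tmus_per_group


print(tmu_count(800, 4))    # RV790 layout: 4 TMUs per 80 SPs
print(tmu_count(1200, 4))   # 1200 SPs at the RV790 ratio
print(tmu_count(1200, 6))   # 1200 SPs at 6 TMUs per group
print(96 / (1200 / SPS_PER_GROUP))  # TMUs per group implied by the rumored 96
```

1200 SPs is 15 groups, so 96 TMUs would mean 6.4 per group, while 6 per group gives the 90 mentioned above.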
 

AzN

Banned
Nov 26, 2001
4,112
2
0
Originally posted by: Keysplayr
NV increased the TU from G80 to G92 while still maintaining the same 128sp's. I don't see why ATI couldn't do the same. I'll be very surprised if we see R8xx debut with only 1200 shaders. But then again, doubling the ROP's and TU's and adding 50% more shaders will significantly increase transistor space/die size. 1200 may be as high as they dare to go if they wish to maintain their "small die" initiative.

ATI has done it already. 4670 has more TMU per SP than RV770/790.
 

Keysplayr

Elite Member
Jan 16, 2003
21,209
50
91
Originally posted by: MODEL3
Originally posted by: Keysplayr
NV increased the TU from G80 to G92 while still maintaining the same 128sp's. I don't see why ATI couldn't do the same. I'll be very surprised if we see R8xx debut with only 1200 shaders. But then again, doubling the ROP's and TU's and adding 50% more shaders will significantly increase transistor space/die size. 1200 may be as high as they dare to go if they wish to maintain their "small die" initiative.

8800GTS (G92) launched after 1 year from the release of 8800GTX (G80).
They maintained the 128SP's but the price went from 600$ (700+ for ultra) to 300$ (8800 GTS 512MB) in a year.

What does this have to do with anything?

G92 has 64 TU and G80 has 64 TU but:
If you are gaming with good quality features turned on,
then for example if you turn on anisotropic filtering(even at 2x) the G92 drops to 32 TU but for the G80 it makes no difference.

G80 had 32 Texture Address units and 64 Texture Filtering units (32/64). G92 increased the Texture Address units to 64 for a 1:1 ratio with Texture Filtering units. (64/64).

So essentially (since we are gamers right?) the G92 has half the TU in relation with G80 and if we are talking about TU/ROP ratio the G80 has +33% in relation with G92.

You have this reversed. G80 has half the TU (Texture Address units) in relation to G92. Still, what does any of this have to do with the ROP's?

The thing is, since RV790's 16 ROPs & 40 TUs can do the trick for 2008-2009 games, why can't a future DX11 GPU with 32 ROPs & 64 TUs also do the trick for 2009-2010 games? Why will ATI need to increase the texture units from 40 all the way to 96? (Also, aren't they going to be bandwidth limited with 140GB/sec? Why waste resources?)

Why not? What is the big deal even if they went to 128 TUs ? Whether they are used or not could very well depend on the application. Explain how this would become a bandwidth limitation?

I may be wrong I'm just asking.

I understand you're just asking questions. But you're all over the map. FOCUS!!! :D

 

Janooo

Golden Member
Aug 22, 2005
1,067
13
81
Some rumoured specs from China: http://itbbs.pconline.com.cn/diy/10432745.html

----- Some gossip on the actual state of AMD RV8XX
The full Evergreen range is to be formally announced on September 17 in San Francisco.
AMD's DX11 graphics chips are split into high-end, performance, mainstream and entry-level parts. The high-end part is code-named Cypress, the performance part is code-named Redwood, the two mainstream parts are code-named Juniper and Cedar, and the entry-level part is code-named Hemlock.
Cypress uses an MCM design made up of two Redwood dies. The Redwood die size is 300 mm2, the Juniper die size is 181 mm2 and the Cedar die size is 120 mm2.
Currently AMD's DX11 performance and mainstream products are proceeding smoothly, but the flagship Cypress has run into a small problem with chip packaging, so Cypress may be delayed until after the DX11 performance and mainstream parts reach the market.
 

OCGuy

Lifer
Jul 12, 2000
27,227
36
91
In the CPU forum, Aigomorla gets CPUs well before the NDA is up. We need someone like that here for the GPUs. :p
 

thilanliyan

Lifer
Jun 21, 2005
11,871
2,076
126
Originally posted by: OCguy
In the CPU forum, Aigomorla gets CPUs well before the NDA is up. We need someone like that here for the GPUs. :p

But he doesn't say anything until NDA is up anyway... :)
 

OCGuy

Lifer
Jul 12, 2000
27,227
36
91
Originally posted by: thilan29
Originally posted by: OCguy
In the CPU forum, Aigomorla gets CPUs well before the NDA is up. We need someone like that here for the GPUs. :p

But he doesn't say anything until NDA is up anyway... :)

I cede your point.
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Originally posted by: Azn
No the die will not be big. It's actually quite acceptable considering it will be on 40nm. SP takes up most space in a die and according to these specs it's only 1200SP from 800SP. However those extra TMU will take up some space. It might be a tad bigger than 4870 but not too much.

I didn't say that the die will be too big and stop there.

DX10 4770 (16/32/640) with 128bit memory bus is 826 million transistors & 137mm2 at 40nm.
My estimation for a DX11 4770 is something like 180mm2 at 40nm. (long talk)

And since what I proposed was essentially something like a DX11 4770X2 (32/64/1280)single GPU, my estimation for my scenario was around 300-315mm2 at 40nm, which is big enough! (5/3 - 7/4 , long talk again)
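
The 300-315mm2 range follows from scaling the 180mm2 estimate by the 5/3 and 7/4 factors given above. A minimal sketch of that arithmetic, using only the figures from this post (the 180mm2 starting point and both scaling factors are the poster's own estimates, not measured values):

```python
# Figures quoted in the post for the DX10 4770 at 40nm.
HD4770_TRANSISTORS_M = 826  # millions of transistors
HD4770_AREA_MM2 = 137

# Poster's estimate for a hypothetical DX11 version of the same chip.
DX11_4770_AREA_MM2 = 180


def scaled_area(base_mm2: int, numerator: int, denominator: int) -> float:
    """Scale a die-area estimate by the fraction numerator/denominator."""
    return base_mm2 * numerator / denominator


low = scaled_area(DX11_4770_AREA_MM2, 5, 3)
high = scaled_area(DX11_4770_AREA_MM2, 7, 4)
print(low, high)  # range for the 32/64/1280 scenario

# Rough transistor density of the existing 4770, in millions per mm2.
print(round(HD4770_TRANSISTORS_M / HD4770_AREA_MM2, 2))
```

The factors are below 2x because a doubled 4770 would not need to duplicate every block on the die.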

In the scenario this site suggested the SPs were 1200, so only -80 SPs (around -40 million transistors) compared with my scenario.
But this is not the whole story, because if they separate the SPs from the TUs they will probably need additional transistors for the intercommunication.

Even if they don't need extra transistors, the extra 32 TUs that they have compared with my scenario cost more than 40 million transistors. (The absolute minimum value I could figure was 64 million transistors for the 32 TUs, and I think the real figure is much higher than that.)
So their scenario will have a larger die area than mine.

I said that the die will be too big to hit 900MHz on the standard model (and 950MHz on the X2), not too big in a general sense.

I also said that about my scenario, in which the die is a little bit smaller.

This is what I said for their scenario:

Originally posted by: MODEL3
Also with their specs the die will be very big, I am not sure that they can hit 900MHz in the standard configuration (they say 950MHz for X2 model)

And this is what i said for my scenario:

Originally posted by: MODEL3
The reason I predicted that was because I think that for some of the games of 2010 the 16 ROPs will be a problem at 1920X1200 unless they hit 900MHz-1GHz on the standard model (which is not that likely in my opinion, since this will be a new design so it would be more difficult to achieve than with the 4890; also, the 1GHz 4890 is the highest overclocked option)

I agree with your point that:

It's actually quite acceptable considering it will be on 40nm.



In no way i'm implying that i am right and you are wrong, i just wanted to show you the way of my thinking!




 

MODEL3

Senior member
Jul 22, 2009
528
0
0
First of all, I would like to repeat what I said in the end of my previous reply, because you seemed a little bit offensive, i don't know why:

Originally posted by: MODEL3
I may be wrong I'm just asking.

Also your comment in bold text:

Originally posted by: Keysplayr
I understand you're just asking questions. But you're all over the map. FOCUS!!! :D

was not exactly a nice thing to say, since i was only asking!
You could just ask me to explain what I meant!

Originally posted by: Keysplayr
What does this have to do with anything?

Well I didn't mean that you were wrong in anything, i just meant that since:
we were talking about 300$ to 300$ price range ATI transitions,

in your example (600$ to 300$ price range Nvidia transition) the task to conclude some valuable things regarding ATI DX11 transition will be a more difficult thing and maybe it was better to use another example (in order to make your point more clear)

Originally posted by: Keysplayr
G80 had 32 Texture Address units and 64 Texture Filtering units (32/64). G92 increased the Texture Address units to 64 for a 1:1 ratio with Texture Filtering units. (64/64).
You have this reversed. G80 has half the TU (Texture Address units) in relation to G92. Still, what does any of this have to do with the ROP's?

Yes, you are correct, I had forgotten this. Like I said I was just asking, because although the above information is all over the net, I am too lazy to check so I write from memory.

I didn't say only ROP's, I said TU/ROP ratio!

I will explain what I meant:

In the example you gave the G80-G92 transition increased the TU/ROP ratio and also the SP/ROP ratio and this was for a 2006 to 2007 transition,

Why do you think that this time, for a 2008 to 2009 transition (a 2 year gap compared with the G80-G92 transition), the DX11 games that are going to be launched will need:

1. an increase in the TU/ROP ratio (since the TU/ROP ratio of RV790 is, for example, already way, way higher than the GT200 TU/ROP ratio)

2. but no increase at all in the SP/ROP ratio? (In fact, in the site's scenario the SP/ROP ratio is decreased considerably compared with the RV790 SP/ROP ratio.)

Don't you think that for the games that are going to be released in this upcoming period the need will actually be to increase the SP/ROP ratio rather than to decrease it?

Also, if this is your opinion, don't you think that this scenario (where we have a doubling of the ROPs and a 2.4X increase in TUs compared with RV790) will be bandwidth limited?

(Only a +12% increase in bandwidth compared with the 4890.) Why would ATI waste resources if the design will be bandwidth limited?

4890 850MHz GPU core 16ROPs/40TUs 125GB/sec

5870 900MHz GPU core 32ROPs/96TUs 140GB/sec
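
To put numbers on the bandwidth concern, here is a rough sketch that divides memory bandwidth by peak texel rate for the two configurations above. It assumes one texel per TU per clock and ignores texture caches and compression, so it only shows the direction of the change, not real-world behavior. The 140.8 GB/sec figure is the one quoted earlier in the thread, and the helper names are made up.

```python
def texel_rate_gtexels(core_mhz: float, tus: int) -> float:
    """Peak texel fillrate in GTexels/s, assuming one texel per TU per clock."""
    return core_mhz * tus / 1000.0


def bytes_per_texel(bandwidth_gbs: float, core_mhz: float, tus: int) -> float:
    """Memory bandwidth available per peak texel (theoretical, no caches)."""
    return bandwidth_gbs / texel_rate_gtexels(core_mhz, tus)


# (core MHz, texture units, bandwidth in GB/sec) as quoted in the thread.
CARDS = {
    "4890": (850, 40, 125.0),
    "5870 (rumored)": (900, 96, 140.8),
}

for name, (mhz, tus, bw) in CARDS.items():
    rate = texel_rate_gtexels(mhz, tus)
    per_texel = bytes_per_texel(bw, mhz, tus)
    print(f"{name}: {rate:.1f} GTexels/s, {per_texel:.2f} bytes per texel")
```

With these figures the peak texel rate grows by roughly 2.5x while bandwidth grows by about 12-13%, so the bandwidth per texel falls to less than half.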

Originally posted by: Keysplayr
Why not? What is the big deal even if they went to 128 TUs ? Whether they are used or not could very well depend on the application. Explain how this would become a bandwidth limitation?

I thought that if you increase the texture units (texel fillrate, essentially) you are going to need more bandwidth in order for those texture units to work optimally.
Why increase the TUs so much if the texture units are not going to deliver their full potential?

About the 128 TUs, are you implying that if the TU/ROP ratio is 4:1 (128:32) the design will not need additional bandwidth compared with my scenario (2:1, 64:32), for example?
We are talking about optimal design: why would ATI waste millions of transistors on a feature that is not going to deliver its full potential, when they can devote those transistors to things such as SPs?



Again I'm not sure, just asking, I am not trying in any way to say that you are wrong and i am right, i just want to understand the way you think, in order to evaluate better the site's proposed specs!