Originally posted by: Janooo
4670 has some other limitations and it's not a high-end part.
I read somewhere that AMD would like to keep a 4:1 ratio.
Originally posted by: jaredpace
I'm expecting 1.2 times faster performance out of the 5870 compared to the 4870, unless they move to a 512bit mem bus.
I bet the shaders move from 800 to 960 per core, and the GDDR5 is a little faster.
Originally posted by: Azn
What are you trying to say? You have changed the subject at hand. First you were talking about how you can change ROP size with the GDDR3 and GDDR5 memory interfaces, and now you are speculating that the 4870 is tuned for GDDR3 when it has a GDDR5 memory interface?
If it's RV770 at 40nm, the chip with a perfect shrink will go from ~260mm2 -> ~140mm2. Unless packaging improves dramatically that is too small for a 256bit interface, let alone a 384bit one. Maybe if you put 2 together somehow on one die you could get it big enough for a 384bit bus... 2 billion transistors for sure. Bonus smoke alarm and fire extinguisher with every sale.
ATI will compensate again by raising texture count, SP, even ROP. Kind of like what they did with RV770. I don't see what the problem is.
It's a $2000 card with 1.5GB of memory, why would you disable part of the memory interface to slow it down?
Here: Nvidia Quadro CX. According to specs it's 192 shaders and a 384bit bus... not the same proportions as the GTX260 (192 shaders and 448bit bus). Memory interface is reduced for some reason.
Considering memory controller isn't tied down to SIMD core I don't see what the problem is.
What I am trying to say is that the memory interface is the defining characteristic, the ROP count is just a secondary thing. Years of 16 ROPs/256bit is an artefact of the GDDR3 interface and the speed it runs at. The RV770 memory controller added compatibility for GDDR5 but does not appear tuned for it, as most of the RV770s were still expected to be used with GDDR3. If they gamble and say the RV870 will largely be used with GDDR5 then the ROPs are a candidate to be redesigned to handle double the throughput (i.e. 2x ROPs or ROPs that are twice as strong). It depends, I guess, on when the RV870 is expected to debut: if it's soon then GDDR3 will still likely be around, if it's late then they have the time to redo the ROPs. Sorry for not explaining this well.
If it's RV770 at 40nm, the chip with a perfect shrink will go from ~260mm2 -> ~140mm2. Unless packaging improves dramatically that is too small for a 256bit interface, let alone a 384bit one. Maybe if you put 2 together somehow on one die you could get it big enough for a 384bit bus... 2 billion transistors for sure. Bonus smoke alarm and fire extinguisher with every sale.
It's a $2000 card with 1.5GB of memory, why would you disable part of the memory interface to slow it down? The leaked photos of the 55nm GT200 show a chip almost the same size as the G80, which also had a 384bit bus. Isn't it logical that pad limiting would cause this? Although looking here, the R600 was smaller and still managed to fit a 512bit interface somehow.
I've shown pictures on both Nvidia and ATI cards. This has been continued from past to present. Nothing has changed.
260mm2 chip becomes 140mm2 because it uses 30% less process?
ATI is going to release the exact same card without more SP texture units to the chip? Why?
Memory bandwidth has very little to do with crunching numbers. It's the SP that is doing all the work with Cuda.
I think you've got this pad limiting thing way out of proportion honestly
Right, so you can't solely rely on the FLOPs metric to make predictions about the prevalence of shaders in programs. That's the point.
Originally posted by: Azn
It's all theoretical even with nvidia's hardware but then when we put it to use with a game it sure doesn't act that way.
Actually the bulk of the heavy lifting is done by the driver compiler, not the game.
Unless all the shader games were specifically optimized for ATI SP. We can see many instances where shader heavy games favor ATI R7XX cards.
Such an argument is wrong. Just because it always has been like that doesn't mean it has to be like that. It's likely, but not necessary. Rjc's argument is that 16 ROPs for 256 Bit was just the best balance between performance and cost, considering the data rate of the memory and the chip clock speed. Now that the data rate has almost doubled, but the chip clock stays about the same, they may want to change this balance (or make the ROPs more powerful). I don't know if this is possible though.
40 nm / 55 nm ≈ 0.727. Scaling happens in both lateral directions and since we're interested in the area we have to square this value, giving ≈ 0.529. 260 mm^2 * 0.529 ≈ 137.5 mm^2. Clear enough? (note that this assumes perfect scaling, reality will be a bit worse than that)
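The same arithmetic as a minimal sketch, assuming an ideal shrink where both lateral dimensions scale with the process node (the function name `shrunk_area` is only illustrative, and real shrinks come out somewhat larger than this):

```python
def shrunk_area(area_mm2: float, old_nm: float, new_nm: float) -> float:
    """Die area after an ideal shrink: each lateral dimension scales by new/old."""
    linear = new_nm / old_nm        # 40 / 55 ~ 0.727
    return area_mm2 * linear ** 2   # area scales with the square of that factor


# RV770 at ~260 mm^2 on 55 nm, shrunk to 40 nm (figures from the thread)
rv770_40nm = shrunk_area(260.0, 55.0, 40.0)
print(f"{rv770_40nm:.1f} mm^2")  # prints "137.5 mm^2"
```

Plugging in the rumored 205 mm^2 die instead shows how much area beyond a pure shrink would be left for added units.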
He doesn't say that. It's just an example. Anyway, the German article says die size will be 205 mm^2, so they do add features. Which should already have been obvious from the claim of DX 11 and >1000 SP
I'm sure he's aware of that. His point is that if people are paying $2000 for such a card, why shouldn't they get the fully fledged 240 SP / 512 Bit mem configuration? (Maybe that's available for $3000...)
Applying it to GT200 with its huge die is certainly out of proportion. But if you wanted to feed a 140 mm^2 GPU with a 384 Bit mem bus you'd get into serious trouble with the pads. Those 100 - 150W also have to be delivered in some way.
Originally posted by: BFG10K
Actually the bulk of the heavy lifting is done by the driver compiler, not the game.
Unless all the shader games were specifically optimized for ATI SP. We can see many instances where shader heavy games favor ATI R7XX cards.
Right, so you can't solely rely on the FLOPs metric to make predictions about the prevalence of shaders in programs. That's the point.
EDIT: As a matter of fact, I KNOW G92 does not work the way you say. There was a limited run of 9600GSOs that could be flashed to 8800GTs. The BIOS on the card only deactivated one of the memory channels, making the card 192bit. After the flash you would have 256bit memory but still the same amount of shaders and ROPs. Someone else correct me if I'm wrong.
Originally posted by: Cookie Monster
I don't think nVIDIA chips use any sort of hub architecture.
Plus I don't think the ROPs are tied to the memory controllers on the RV770 either.
Originally posted by: Azn
Using fine tuned gddr5 memory interface does not mean you can raise ROP count. It is tied to the memory bit bus. I've shown pictures on both Nvidia and ATI cards. This has been continued from past to present. Nothing has changed.
How are you doing your math? 260mm2 chip becomes 140mm2 because it uses 30% less process? ATI is going to release the exact same card without more SP texture units to the chip? Why? More room means you can add more SP, TMU to the chip.
G80 is 480mm2. GT200 on 65nm is 577mm2. So a GT200 on 55nm is about the size of G80, which sounds about right with a die shrink. Nvidia did not redesign their chip. What Nvidia does is disable memory bus and ROP cluster.
Pad limiting? Considering the 2900xt, which has a 512bit memory bus, was only 420mm2 in size, I don't think it has anything to do with pad limits. I think you've got this pad limiting thing way out of proportion honestly. Too much Tom's hardware can do that to you.
Originally posted by: rjc
Originally posted by: Azn
Using fine tuned gddr5 memory interface does not mean you can raise ROP count. It is tied to the memory bit bus. I've shown pictures on both Nvidia and ATI cards. This has been continued from past to present. Nothing has changed.
Ok, working quickly from the wikipedia page, for memory bandwidth (GB/s):
RV670
3850 GDDR3 53
3870 GDDR3 57.6
3870 GDDR4 72
RV770 (which had ROPs with doubled power according to Rage3d)
4830 GDDR3 57.6
4850 GDDR3 63.6
4870 GDDR5 115.2
According to Nordic Hardware
RV870 GDDR5 up to 150
Surely if the new card goes with this memory they will have to increase the ROPs again or they will lose the balance the RV770 had and end up with a similar situation to the RV670?
The only things I can see stopping them are lack of time, the possibility they will still want to run the RV870 on GDDR3, or finally something in DX11 that changes things.
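A rough sketch of the balance argument, using the bandwidth figures listed above. It assumes 16 ROPs for every card (the "16 ROPs/256bit" configuration discussed in the thread) and ignores ROP clock speed and the reportedly doubled ROP power on RV770, so treat the output as illustration only. The names `ROPS` and `per_rop` are made up for this example.

```python
ROPS = 16  # assumption: the long-standing 16 ROPs / 256bit setup for every card

# Memory bandwidth in GB/s, as listed in the post (RV870 figure is a rumor)
bandwidth_gbs = {
    "3870 GDDR3": 57.6,
    "3870 GDDR4": 72.0,
    "4850 GDDR3": 63.6,
    "4870 GDDR5": 115.2,
    "RV870 GDDR5 (rumored)": 150.0,
}


def per_rop(bw_gbs: float, rops: int = ROPS) -> float:
    """Bandwidth available per ROP, in GB/s."""
    return bw_gbs / rops


for name, bw in bandwidth_gbs.items():
    print(f"{name:24s} {per_rop(bw):5.2f} GB/s per ROP")

# ROPs needed at 150 GB/s to keep the 4870's bandwidth-per-ROP ratio
rops_needed = bandwidth_gbs["RV870 GDDR5 (rumored)"] / per_rop(bandwidth_gbs["4870 GDDR5"])
print(f"ROPs to match the 4870 ratio: {rops_needed:.1f}")
```

Under these assumptions the rumored 150 GB/s would call for roughly 21 ROPs of the same strength to keep the 4870's ratio, which is the kind of imbalance the post is pointing at.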
How are you doing your math? 260mm2 chip becomes 140mm2 because it uses 30% less process? ATI is going to release the exact same card without more SP texture units to the chip? Why? More room means you can add more SP, TMU to the chip.
260mm2 x (40/55)^2 ~ 140mm2
Apparently, according to current rumors, the first chip on 40nm for ATI will be a die shrunk RV770 which, because of its size, will have a 128bit bus on GDDR5. The idea is to replace the 4850 in the lineup with this, the smaller chip + cheaper board offsetting the GDDR5 cost. After that follows the RV870 with extra stuff to fill it up to be big enough for a 256bit memory interface.
G80 is 480mm2. GT200 on 65nm is 577mm2. So a GT200 on 55nm is about the size of G80, which sounds about right with a die shrink. Nvidia did not redesign their chip. What Nvidia does is disable memory bus and ROP cluster.
Maybe, they did cut the memory interface when they went from G80->G92 and the GT200 is not selling at the same $ it was introduced at so they would want to be saving money wherever possible.
Pad limiting? Considering the 2900xt, which has a 512bit memory bus, was only 420mm2 in size, I don't think it has anything to do with pad limits. I think you've got this pad limiting thing way out of proportion honestly. Too much Tom's hardware can do that to you.
There are a lot more constraints now though: the GDDR5 interface requires 34 pins over GDDR3, plus the sideport thingy and extra power and ground pins. And most importantly, people don't look like they would be willing to pay enough $ for another R600 style chip at the moment.
Here's a beyond3d thread for consideration.