AMD Polaris Thread: Radeon RX 480, RX 470 & RX 460 launching June 29th

Page 129
Status
Not open for further replies.

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
32 ROPs is the new rumour?

ROPs and bandwidth are very important. You cannot downplay this unless AMD has made very, very large strides over Tonga.

Let me demonstrate how to cripple a card with ROPs and Bandwidth.

Tonga claims to have improved ROPs over earlier GCN as well as memory compression. Despite these GCN 1.2 improvements, the older GCN 1.1 Hawaii performs better relative to the 380X than its specs suggest, precisely because of ROPs and bandwidth.

ASUS 380X 1030MHz 32 CUs
290 reference 947MHz 40 CUs

This is only a 15% TFLOP advantage for the 290 over the factory OC 380X.

Yet the 290 is 27% faster:
https://tpucdn.com/reviews/ASUS/R9_380X_Strix/images/perfrel_2560_1440.png

Twice the ROPs and 75% more bandwidth matter. A lot. It really, really matters. This is a clear demonstration that it allows those Compute Units the room to work without bottlenecks. Remember, Tonga already has improvements over Hawaii.
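For anyone who wants to check the 15% figure, here is a minimal sketch of the TFLOP arithmetic. It assumes the usual GCN layout of 64 stream processors per CU and 2 FLOPs per shader per clock (one fused multiply-add), and it uses the clocks and CU counts listed above. The function and variable names are made up for illustration.

```python
# Assumption: each GCN compute unit has 64 stream processors, and each
# one retires a fused multiply-add (2 FLOPs) per clock.
SHADERS_PER_CU = 64
FLOPS_PER_SHADER_PER_CLOCK = 2


def peak_tflops(compute_units: int, clock_mhz: float) -> float:
    """Theoretical peak single-precision throughput in TFLOPs."""
    flops = compute_units * SHADERS_PER_CU * FLOPS_PER_SHADER_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


r9_380x = peak_tflops(32, 1030)  # ASUS 380X, factory OC
r9_290 = peak_tflops(40, 947)    # reference 290

# Relative paper advantage of the 290 over the 380X.
advantage = r9_290 / r9_380x - 1
print(f"380X: {r9_380x:.2f} TFLOPs, 290: {r9_290:.2f} TFLOPs, 290 ahead by {advantage:.1%}")
```

This only compares paper throughput; it says nothing about ROPs or bandwidth, which is the point of the comparison with the measured 27% gap.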

So, I post this to ponder if the improvements with Polaris 10 are enough. It has more bandwidth now, and the ROPs are higher clocked, but could these limitations be holding back the potential of AMD's new arch?

Considering this card is targeted at 1080p users, 32 ROPs should not be an issue. Memory bandwidth may be, however, so overclocking the memory may show some good gains.
 

flopper

Senior member
Dec 16, 2005
739
19
76
That looks a reasonable comparison for a new $200 GPU with pre-release drivers and unknown clocks against a $300 OC GPU with fine-tuned, stable drivers. Thanks for posting that.

Paid focus group is likely.



Trolling is not permitted here.
-Rvenger
 
Last edited by a moderator:

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
There's really no way of knowing whether those results were due to ROPs or not, since the Tonga drivers were quite immature at the time (and arguably still are), and thus results tend to be a bit all over the place.

For instance in your graph the 290 is 55% faster than the 285, but today it's only 45% faster. Assuming the 380X had a similar boost in performance with the new drivers (TPU doesn't include it in their newest tests sadly), then the 290 would only be about 19% faster, not 27%.

19% faster is only 4% more than we would expect by simply looking at the TFLOP numbers, which is well within the margin of error.
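The driver adjustment above can be reproduced in a few lines. This is only a sketch of that estimate: it takes the 27%, 55% and 45% figures quoted in the posts at face value and assumes, as the post does, that the 380X gained as much from newer drivers as the 285 did. The variable names are illustrative.

```python
gap_290_vs_380x_then = 1.27  # 290 vs 380X in the original review chart
gap_290_vs_285_then = 1.55   # 290 vs 285 in the same chart
gap_290_vs_285_now = 1.45    # 290 vs 285 in newer TPU results
tflop_gap = 1.15             # 290 vs 380X on paper

# Assumption: the 380X improved by the same factor as the 285.
tonga_driver_gain = gap_290_vs_285_then / gap_290_vs_285_now
adjusted_gap = gap_290_vs_380x_then / tonga_driver_gain

print(f"Assumed Tonga driver gain: {tonga_driver_gain - 1:.1%}")
print(f"Adjusted 290 lead over 380X: {adjusted_gap - 1:.1%}")
print(f"Lead beyond the TFLOP gap: {(adjusted_gap - tflop_gap) * 100:.1f} points")
```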

Well, that is a 380X review from November 2015, 14 months after the 285 launched, so Tonga was not new.

And it's harder to get exact percentages from TPU when one card is not at 100%, due to rounding. Add 1% to the 290 and subtract 1% from the 285 from your link and you get 53%. 45-53% from rounding.

But that being said, I think it's more down to bandwidth than ROPs, personally, thinking about it further. A 2.7% gain for a 15% bandwidth OC isn't spectacular, but that's still much less bandwidth than Tahiti has, let alone Hawaii.
https://www.techpowerup.com/reviews/ASUS/R9_380X_Strix/26.html

Who knows.

All I know is Tonga "IPC" is a good improvement over older GCN but it never seemed to shine compared to Hawaii.
http://www.hardware.fr/articles/926-24/tonga-vs-tahiti.html
 
Last edited:

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
That looks a reasonable comparison for a new $200 GPU with pre-release drivers and unknown clocks against a $300 OC GPU with fine-tuned, stable drivers. Thanks for posting that.

Also worth noting that Nvidia tends to have an advantage in GTA V with a stock 970 matching a 390X:

gtav_1920_1080.png


So a 970 at 1522 should be about 20% faster than a stock 390X.

Also the 970 is running at what looks like 70% extended distance scaling versus 100% on the RX 480 video. This would give the 970 a performance boost of about 14%.

Combine all of this and we should expect the 970 in the video to be ahead by about 35-40% if the performance of the RX 480 actually matches that of the 390X.
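As a quick sanity check on the 35-40% estimate, the two factors multiply rather than add. This is a sketch using only the 20% and 14% figures from this post; both are the poster's estimates, not measurements.

```python
oc_970_vs_stock_390x = 1.20   # estimated lead of a 970 at 1522 over a stock 390X in GTA V
distance_scaling_boost = 1.14 # estimated gain from 70% vs 100% extended distance scaling

# Independent speedups compound multiplicatively.
combined = oc_970_vs_stock_390x * distance_scaling_boost
print(f"Expected 970 lead: {combined - 1:.1%}")
```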

And it's harder to get exact percentages from TPU when one card is not at 100%, due to rounding. Add 1% to the 290 and subtract 1% from the 285 from your link and you get 53%. 45-53% from rounding.

First of all they shouldn't vary by more than 0.5% from their true value unless TPU screwed up their rounding somehow, so worst case scenario would be adding 0.5% to the 290 and subtracting 0.5% from the 285.

This problem also exists when one of the GPUs is at 100%, since the other GPU (the one not at 100%) will also be a rounded value (i.e. the 127% number in your graph may be anywhere from 126.5 to 127.5).

So in my link the true value could be anywhere from 41.8% to 49.2%, and in your link anywhere from 53.3% to 56.4%. So an improvement for the 285 of somewhere between 4.1 percentage points and 14.6 percentage points (2.7-10.3%).
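The rounding bounds can be worked out mechanically. The sketch below assumes each chart value was rounded to the nearest whole percent (so it is off by at most 0.5), and it assumes the older chart showed the 290 at 127% and the 285 at 82% with the 380X at 100%. The 82% reading is inferred from the 55% figure, not read off the chart. The helper name is hypothetical.

```python
def ratio_bounds(a_pct: float, b_pct: float, half_step: float = 0.5) -> tuple[float, float]:
    """Range of a/b when both values were rounded to the nearest whole percent."""
    low = (a_pct - half_step) / (b_pct + half_step)
    high = (a_pct + half_step) / (b_pct - half_step)
    return low, high


# Assumed chart readings: 290 at 127%, 285 at 82% (380X = 100%).
low, high = ratio_bounds(127, 82)
print(f"290 leads the 285 by {low - 1:.1%} to {high - 1:.1%}")
```

The lower bound pairs the smallest possible numerator with the largest possible denominator, and the upper bound does the reverse.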

All I know is Tonga "IPC" is a good improvement over older GCN but it never seemed to shine compared to Hawaii.
http://www.hardware.fr/articles/926-24/tonga-vs-tahiti.html

Tonga just seems to be extremely sensitive to the exact game being tested, with it performing like crap in some games and great in others.
 
Last edited:

Face2Face

Diamond Member
Jun 6, 2001
4,100
215
106
That looks a reasonable comparison for a new $200 GPU with pre-release drivers and unknown clocks against a $300 OC GPU with fine-tuned, stable drivers. Thanks for posting that.

Thanks. I did it per Sweeper's request, and now I've got AMD die-hards all over me... :\ My guess is with review drivers and a decent OC, these cards will be pretty amazing. I plan on getting one, I mean why not? It's $230.

Also the 970 is running at what looks like 70% extended distance scaling versus 100% on the RX 480 video. This would give the 970 a performance boost of about 14%.

Combine all of this and we should expect the 970 in the video to be ahead by about 35-40% if the performance of the RX 480 actually matches that of the 390X.

I tested the extended distance performance scaling myself, and the results are not that drastic. I tried it in the city and in the grassy areas. I turned it off completely, and then enabled it again to the percentage in the video. I only saw a 2 FPS difference. If you have the game I would urge you to try it out yourself.
 
Last edited:

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
Considering this card is targeted at 1080p users, 32 ROPs should not be an issue. Memory bandwidth may be, however, so overclocking the memory may show some good gains.

Technically AMD's marketing positions it for 1440p or VR, but you're right from a practical standpoint it's for 1080p.

Agreed that bandwidth may be more important. Seeing the Tonga memory OC gains, and how far its bandwidth still is from Tahiti, never mind Hawaii, it might very well make up the TFLOP performance deficit if it had as much bandwidth as the 290.

So I sort of take the post back. It's more a bandwidth limitation demonstration, I'd wager.

Too bad there's no real way to test ROP performance alone for games. There are fillrate tests that isolate it, but for actual gaming there is no way to test it, since AFAIK you cannot change the ROP clocks separately and there are no two cards with the same specs other than ROPs.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Thanks. I did it per Sweeper's request, and now I've got AMD die-hards all over me... :\ My guess is with review drivers and a decent OC, these cards will be pretty amazing. I plan on getting one, I mean why not? It's $230.



I tested the extended distance performance scaling myself, and the results are not that drastic. I tried it in the city and in the grassy areas. I turned it off completely, and then enabled it again to the percentage in the video. I only saw a 2 FPS difference. If you have the game I would urge you to try it out yourself.

Distance scaling is also CPU-dependent, so your beefy CPU may be playing a role here.
 

Abwx

Lifer
Apr 2, 2011
11,855
4,832
136
Thank you! Impressive showing for OCed Geforce GTX 970. :)

At 250W power draw one could expect it to match a card that is running at 100W. That's way more than 2x the perf/Watt for the 480, or is perf/Watt no longer relevant when it comes to Maxwell?
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
Technically AMD's marketing positions it for 1440p or VR, but you're right from a practical standpoint it's for 1080p.

And still resolution does not matter as long as you do not decrease quality options significantly. Decreasing quality would shift the load towards the ROPs to the point that you might get ROP limited. But then again, in this case it makes no sense to increase resolution and sacrifice quality.

As I mentioned before, with 32 ROPs and only 2304 shader units you are not going to be fill-rate limited in modern games at reasonably high settings. Hence increasing the number of ROPs has no impact.
 

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
No impact, as in literally 0.0%, or an opinionated minor impact?

Quite the difference.
 

Shivansps

Diamond Member
Sep 11, 2013
3,918
1,570
136
Technically AMD's marketing positions it for 1440p or VR, but you're right from a practical standpoint it's for 1080p.

Agreed that bandwidth may be more important. Seeing the Tonga memory OC gains, and how far its bandwidth still is from Tahiti, never mind Hawaii, it might very well make up the TFLOP performance deficit if it had as much bandwidth as the 290.

So I sort of take the post back. It's more a bandwidth limitation demonstration, I'd wager.

Too bad there's no real way to test ROP performance alone for games. There are fillrate tests that isolate it, but for actual gaming there is no way to test it, since AFAIK you cannot change the ROP clocks separately and there are no two cards with the same specs other than ROPs.

That's what I'm saying: compared to the 390s, the RX 480 is losing 130GB/s. They are betting on color compression to help with that, and it's going to help, but from what has been shown on Carrizo, it can't do magic.
 

sze5003

Lifer
Aug 18, 2012
14,304
675
126
Will any of these have DVI out? For some reason my monitor doesn't have HDMI. I do have a DisplayPort to DVI adapter though. Not sure if it's a DisplayPort 1.3 or 1.4 adapter.
 
Last edited:

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
I tested the extended distance performance scaling myself, and the results are not that drastic. I tried it in the city and in the grassy areas. I turned it off completely, and then enabled it again to the percentage in the video. I only saw a 2 FPS difference. If you have the game I would urge you to try it out yourself.

So if I'm understanding you correctly, you tested the difference between 70% (the highest you could enable on the 970) and 0%, and you did so in the actual game and not in the built-in benchmark?

If so then it's not really comparable to what was done in the RX 480 video. Different settings, different scenes, plus it's also worth noting that the built-in benchmark in GTA 5 can vary quite a bit (plus/minus 5 FPS per run according to Nvidia).

Also your results with extended distance performance definitely don't seem to match what others are experiencing, with some reporting a 10 FPS drop just going from 0% to 50% (HOCP reports a similar hit). Most people seem to report this setting as the number one performance killer and the first thing to turn down if you need better performance.

Lastly I do have the game but I only have a 2GB GPU (770) so I can't really test the extended distance at 100%.
 
Last edited:

Shivansps

Diamond Member
Sep 11, 2013
3,918
1,570
136
Will any of these have DVI out? For some reason my monitor doesn't have HDMI. I do have a DisplayPort to DVI adapter though. Not sure if it's a DisplayPort 1.3 or 1.4 adapter.

You can use a simple HDMI to DVI cable, or an HDMI to DVI passive adapter.
 

Concillian

Diamond Member
May 26, 2004
3,751
8
81
32 ROPs is the new rumour?

ROPs and bandwidth are very important. You cannot downplay this unless AMD has made very, very large strides over Tonga.

Let me demonstrate how to cripple a card with ROPs and Bandwidth.

[...]

This may be true, but keep in mind that ROPs scale with frequency as well, and the clocks are higher for everything on Polaris. The ratio of ROPs to CUs is important, and the ratio is not going to be much different from the 380X's. I doubt AMD will severely cripple the card with ROPs or memory bandwidth. Some, yes, of course, but it should be pretty minor.

This is what is done on midrange cards. They make cuts in the right places that trade a relatively small amount of performance for a die size (cost) reduction. It's not 'crippled' so much as optimized more with cost in mind (vs. the high end cards).

It's also why people shouldn't have crazy high expectations. 390x-ish speed ±a few percent is what I've been saying since the beginning of the thread, and I still believe that.
 

kraatus77

Senior member
Aug 26, 2015
266
59
101
The 380X has 63% of the 280X's bandwidth and performs the same; the RX 480 has 66% of the 390X's bandwidth plus more improvements, so I think it can match it. Don't forget Tonga already has more efficient ROPs.

78751.png
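The percentages in this post can be checked from bus width and memory data rate. This is a rough sketch: the specs below are the commonly published reference figures, not numbers taken from this thread, and the RX 480 entry assumes a 256-bit bus with 8 Gbps memory, which was not confirmed when this was posted. The helper name is made up.

```python
def bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin data rate / 8."""
    return bus_width_bits * data_rate_gbps / 8


# Assumed reference specs (bus width in bits, effective data rate in Gbps).
cards = {
    "R9 280X": bandwidth_gbs(384, 6.0),
    "R9 380X": bandwidth_gbs(256, 5.7),
    "R9 390X": bandwidth_gbs(512, 6.0),
    "RX 480": bandwidth_gbs(256, 8.0),  # assumption: 8 Gbps memory
}

tonga_vs_tahiti = cards["R9 380X"] / cards["R9 280X"]
polaris_vs_hawaii = cards["RX 480"] / cards["R9 390X"]
print(f"380X has {tonga_vs_tahiti:.0%} of the 280X's bandwidth")
print(f"RX 480 has {polaris_vs_hawaii:.0%} of the 390X's bandwidth")
```

Under these assumptions the newer card keeps roughly two thirds of the older card's bandwidth in both cases, which is about a third less rather than two thirds less.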
 

Face2Face

Diamond Member
Jun 6, 2001
4,100
215
106
So if I'm understanding you correctly, you tested the difference between 70% (the highest you could enable on the 970) and 0%, and you did so in the actual game and not in the built-in benchmark?

If so then it's not really comparable to what was done in the RX 480 video. Different settings, different scenes, plus it's also worth noting that the built-in benchmark in GTA 5 can vary quite a bit (plus/minus 5 FPS per run according to Nvidia).

Also your results with extended distance performance definitely don't seem to match what others are experiencing, with some reporting a 10 FPS drop just going from 0% to 50% (HOCP reports a similar hit). Most people seem to report this setting as the number one performance killer and the first thing to turn down if you need better performance.

Lastly I do have the game but I only have a 2GB GPU (770) so I can't really test the extended distance at 100%.


Correct, I did my testing in game. However, since you mentioned it, I went ahead and did the benchmark run with EDS disabled, and then set to 70%.

The result on the right is with it disabled, and the one on the left is enabled @ 70%.

k3EKtC4.png
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
The 380X has 63% of the 280X's bandwidth and performs the same; the RX 480 has 66% of the 390X's bandwidth plus more improvements, so I think it can match it. Don't forget Tonga already has more efficient ROPs.

We also have to remember that a lot of AMD cards have way more bandwidth than they can use. This was especially true for Tahiti, where a memory overclock gave almost no performance boost. Hawaii also has gobs of bandwidth. So sure Polaris has a fair amount less, but having less than a card that had more than it needed may result in not having much of an impact.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
No impact, as in literally 0.0%, or an opinionated minor impact?

Well, it is rather a minor impact. Part of the reason is that you do not design hardware for the worst case (at least not in the consumer space).

Given a certain compute performance, say a number of shaders, you choose the number of ROPs in such a way that you are not limited by ROP performance in 9x% of the cases in 9x% of actual games for 9x% of typical game settings. Keep in mind that the quotient shaders/ROP goes up over time, making fewer ROPs necessary per shader. After this you can determine bandwidth by looking at the worst-case ROP operation (e.g. a blend operation). Again you can assume that only a certain % of all ROP operations are blend operations. At this point you have the overall bandwidth. (This is somewhat simplified, as there are other bandwidth consumers like the TMUs, but ROPs should consume the majority of the bandwidth.)

Now if you designed for the worst case and applied the same methodology starting with the same number of shaders, you would get HW with c times the number of ROPs and 2c times the bandwidth. However, with this HW you would see negligible performance gain in the majority of cases.

What we can take away here is that there are two key metrics for a given number of shaders: ROP/shader (higher is better) and bandwidth/ROP (higher is better), but the performance impact of both metrics declines very sharply after some point (i.e. diminishing returns).
The second takeaway is that the point of diminishing returns for ROP/shader decreases as games get more complex.

disclaimer: I included stencil and z-buffer operations as part of ROP.
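The diminishing-returns argument above can be illustrated with a toy bottleneck model. This is a deliberately crude sketch and not how any real GPU is designed or measured: it assumes each stage (shaders, ROPs, memory bandwidth) has a throughput in the same arbitrary units and that the slowest stage sets the pace. All names and numbers are hypothetical.

```python
def throughput(shaders: float, rops: float, bandwidth: float) -> float:
    """Toy model: the slowest stage sets the pace (arbitrary units)."""
    return min(shaders, rops, bandwidth)


# Hypothetical workload: shaders and bandwidth are fixed, ROP capacity varies.
shaders, bandwidth = 100.0, 100.0
results = {rops: throughput(shaders, rops, bandwidth) for rops in (50, 75, 100, 150, 200)}

for rops, perf in results.items():
    print(f"ROP capacity {rops:>3}: throughput {perf:.0f}")
```

Below the balance point, extra ROP capacity helps one for one; above it, the gain is zero in this model. Real workloads mix many such cases, which would smooth the knee into a curve, but the shape of the argument is the same.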
 
Last edited: