videocardz: AMD Radeon R9 290X confirmed to feature 64 ROPs

Page 10 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.
No, but there is plenty of evidence that 384 bit bus controllers are capable of memory clocks beyond 7GHz. 352 GB/s bandwidth on a 512 bit bus only requires an effective memory clock of 5.5GHz. 5.5GHz memory clocks have been standard on the 7970 since its release in December 2011. Progressing to a larger bus size but reverting to slower memory isn't exactly a step forward.

Just for reference, the memory clocks on the GTX 280 and 285 were higher than on the smaller 448 bit bus GTX 260. In fact, nothing offered from ATI had faster memory clocks. It wasn't until a year and a half later, with the release of the 6950 and 6970, that there were higher memory clocks, but even then it was on a 256 bit bus using GDDR5 RAM.

Huh?

4870/4890 used GDDR5, GT200 used GDDR3, so RV770 had faster memory. The 5870 also had faster memory than the GTX480, and so did the 6970. Overall they had less bandwidth because of the smaller bus.
 

So looks like neither card was clocked to its average potential.

http://hwbot.org/hardware/videocard/radeon_hd_7970/


I think we need to look at other factors besides raw memory bandwidth.

What if the performance difference between 320GB/s and 448GB/s (5GHz vs 7GHz) is next to nothing? Why would AMD spend the money and TDP budget for no performance gain? What if the gain is only 3%?

Going with a wider bus and slower memory makes sense if Hawaii can perform to its fullest with 320GB/s of memory bandwidth. Especially if speculation is correct that the controller footprint will end up being smaller than Tahiti's.
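
For anyone checking the numbers in this exchange, here is a rough sketch of the arithmetic, assuming peak bandwidth is simply the bus width in bytes times the effective per-pin data rate. The function name `bandwidth_gbs` is made up for illustration and ignores any real-world overhead.

```python
def bandwidth_gbs(bus_width_bits, effective_gbps):
    """Theoretical peak bandwidth in GB/s: bytes per transfer times effective rate."""
    return bus_width_bits / 8 * effective_gbps

# Configurations mentioned in the thread
print(bandwidth_gbs(512, 5.0))  # 512 bit bus, 5GHz effective
print(bandwidth_gbs(512, 5.5))  # 512 bit bus, 5.5GHz effective
print(bandwidth_gbs(512, 7.0))  # 512 bit bus, 7GHz effective
print(bandwidth_gbs(384, 7.0))  # 384 bit bus, 7GHz effective
```

On these assumptions a 512 bit bus at 5GHz lands at 320GB/s, while a 384 bit bus needs 7GHz memory to edge past it.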
 
What is performance differential at the same clocks between GK104 and GK110?

SPs 1536->2688 +75%
bandwidth 192->288 +50%
TMUs 128->224 +75%
ROPs 32->48 +50%
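
The percentages in this list can be rechecked with a quick sketch. It assumes 1536 shaders for the GTX 680 (GK104) against 2688 for the Titan (GK110); `pct_increase` is only an illustrative helper.

```python
def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100

# (GK104 / GTX 680, GK110 / Titan) unit counts; bandwidth in GB/s
specs = {
    "SPs": (1536, 2688),
    "bandwidth": (192, 288),
    "TMUs": (128, 224),
    "ROPs": (32, 48),
}

for name, (gk104, gk110) in specs.items():
    print(f"{name}: +{pct_increase(gk104, gk110):.0f}%")
```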

25% between Titan and GTX680 at 1080p and 33% at 1600p.

Nope, 50%, rather more (GTX 680 likely clocked a bit higher than 1 GHz):
https://www.computerbase.de/artikel/grafikkarten/2013/test-nvidia-geforce-gtx-titan/8/

I benchmarked my Titan with the 680 of a friend, both GPUs ran at 900 MHz. The Titan was about 50-70% faster depending on the game and settings.
 
Although both NVIDIA's and AMD's 384 bit buses are generally paired with memory rated for 6Gbps.

As a matter of fact the 4870 had higher memory clocks since it used GDDR5, just not higher bandwidth.

The GTX480 had a 384 bit bus but used lower memory clocks: 3696MHz effective vs the 4800MHz of the 256 bit bus 5870.
 
And do you know what is the differential between R9 290X vs 7970?

At stock from both cards I see 28% and 32%.

The 20-30% claim (which isn't confirmed one way or another) is stock R9 290X vs stock 7970 GE.

Max clocks we will see.
 
I would expect ~35% better performance per clock, in some instances 40-45. The problem is that we don't know how bandwidth bottlenecked the 290X will be and how the better frontend will affect performance in a variety of cases.
 
Why are you expecting the R9 290X to be bandwidth bottlenecked?

I don't see indications that the 7970 is bandwidth bottlenecked.
 
Perhaps Slomo4shO is using the base clock instead of the effective clock.

GTX 285: 1242 MHz (2484 Mbps).
HD 4890: 975 MHz (3900 Mbps).
HD 6950: 1250 MHz (5000 Mbps).
 
I was using base clocks instead of effective clocks. It is the only practical way of comparing clock speeds between GDDR3 and GDDR5, since GDDR5's effective multiplier is twice that of GDDR3.
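
The base vs effective distinction can be sketched as below, assuming GDDR3 moves 2 bits per pin per base clock and GDDR5 moves 4, which matches the figures posted above. The names `MULTIPLIER` and `effective_mbps` are just illustrative.

```python
# Assumed transfers per base clock for each memory type
MULTIPLIER = {"GDDR3": 2, "GDDR5": 4}

def effective_mbps(base_mhz, memory_type):
    """Effective per-pin data rate in Mbps from the base memory clock in MHz."""
    return base_mhz * MULTIPLIER[memory_type]

cards = [
    ("GTX 285", 1242, "GDDR3"),
    ("HD 4890", 975, "GDDR5"),
    ("HD 6950", 1250, "GDDR5"),
]

for name, base_mhz, memory_type in cards:
    print(f"{name}: {base_mhz} MHz ({effective_mbps(base_mhz, memory_type)} Mbps)")
```

So the GTX 285 has the higher base clock, while the HD 4890 has the higher effective rate.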
 
GDDR5 and GDDR3 controllers have different quirks though.

And as shown, the 384 bit bus of the GTX480 used slower memory than the 256 bit bus of the 5870.
 
The 7970 probably isn't, but the 290(X) might be a little.
You'll add roughly 40% more compute power but only 11% more bandwidth. Thus Hawaii might shoot past the spot where it would begin to be bottlenecked.

A GTX 780@900 MHz gains a good 2% with 15% more bandwidth. At 1176 MHz (which yields roughly the same GFLOPs that I would expect from the 290X) it's 4-5%:
http://www.pcgameshardware.de/Grafi...rce-GTX-780-Taktskalierung-im-Test-1082208/6/
Of course that is quite a small gain (I would call it a "soft" bottleneck), but a gain nonetheless. How Hawaii will behave, we'll have to see. A lot depends on the cache system as well.
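
A rough way to see the imbalance described here, as a sketch only. It assumes 2048 shaders and 288GB/s for the 7970 GHz Edition (neither figure is stated in the thread) and uses the rumoured 2816 shaders and 320GB/s for the 290X. It compares shader counts only, so it ignores any clock speed difference.

```python
def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100

# Assumed figures, see the note above; not confirmed specifications
tahiti = {"shaders": 2048, "bandwidth_gbs": 288}
hawaii = {"shaders": 2816, "bandwidth_gbs": 320}

shader_gain = pct_increase(tahiti["shaders"], hawaii["shaders"])
bandwidth_gain = pct_increase(tahiti["bandwidth_gbs"], hawaii["bandwidth_gbs"])

# Bandwidth available per shader on Hawaii relative to Tahiti
per_shader_ratio = (hawaii["bandwidth_gbs"] / hawaii["shaders"]) / (
    tahiti["bandwidth_gbs"] / tahiti["shaders"]
)

print(f"shaders: +{shader_gain:.1f}%")
print(f"bandwidth: +{bandwidth_gain:.1f}%")
print(f"bandwidth per shader vs Tahiti: {per_shader_ratio:.3f}")
```

Under those assumptions each shader gets roughly a fifth less bandwidth than on Tahiti, which is the sense in which Hawaii could be a little more bandwidth limited.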
 
Obviously. The 285 still had the highest memory clocks of any GDDR3 card available at the time while supporting a 512 bit bus... The point is that GDDR5 controllers have matured since GDDR5 was adopted, and AMD is bringing a 512 bit bus controller with slower memory speeds than the 384 bit bus controller from two years ago. What is the point of increasing the bus size by 33% only to obtain marginally better memory bandwidth? Samsung and Hynix already have GDDR5 memory rated at 1.75GHz (7GHz effective), and such memory on a 384 bit bus yields total memory bandwidth of 336GB/s, which actually exceeds the projected 320GB/s bandwidth of the upcoming R9 290/290X.

It may be true that the larger bus improves latency, but will this actually translate into improved performance? Since the base memory clock is expected to be 1.25GHz (5.0GHz effective), the memory would need to be overclocked to 1.31GHz (5.25GHz effective) to match the bandwidth of a 384 bit bus using the 1.75GHz (7GHz effective) memory chips. I understand that this is only a 5% overclock, but a 33% increase in memory bus width shouldn't require an OC to match the bandwidth of a 384 bit card.
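
The overclock figure can be worked out with a short sketch, assuming theoretical peak bandwidth (bus width in bytes times effective rate) and a 4x GDDR5 multiplier from base to effective clock. The helper name `required_effective_ghz` is hypothetical.

```python
def required_effective_ghz(target_gbs, bus_width_bits):
    """Effective memory clock (GHz) needed to reach a target bandwidth in GB/s."""
    return target_gbs / (bus_width_bits / 8)

target_gbs = 384 / 8 * 7.0   # 384 bit bus with 7GHz effective memory
stock_effective_ghz = 5.0    # expected stock memory clock on the 512 bit card

needed = required_effective_ghz(target_gbs, 512)
overclock_pct = (needed / stock_effective_ghz - 1) * 100

print(f"target bandwidth: {target_gbs} GB/s")
print(f"needed: {needed} GHz effective ({needed / 4} GHz base)")
print(f"overclock: {overclock_pct:.1f}%")
```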
 
Maybe it's a power usage thing?
i.e. 512 bit bus with slower running RAM = less power used, for the same performance?

Only thing that makes sense, unless it's a cost thing...
i.e. cheaper RAM than those super fast ones?

Anyway, I'm sure it's not just done willy-nilly, there's probably a good reason for it.
 
The recent review of the Sapphire R9 280X Toxic does show performance improvements from a memory overclock alone at a 1100MHz core clock. The memory overclock (9.375%) from an effective 6.4GHz (307.2GB/s bandwidth) to 7.0GHz (336GB/s bandwidth, 5% more than the stock R9 290) yields 2-4% performance gains, which implies that there is some memory bottleneck (at least at 1440p).
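
One way to put a number on how "soft" such a bottleneck is: divide the performance gain by the bandwidth gain. This is only a sketch using the figures quoted in this thread; the `sensitivity` metric is an illustrative choice, not something taken from the reviews.

```python
def pct_gain(old, new):
    """Percentage gain going from old to new."""
    return (new - old) / old * 100

def sensitivity(perf_gain_pct, bandwidth_gain_pct):
    """Fraction of a bandwidth increase that shows up as performance (1.0 = fully bound)."""
    return perf_gain_pct / bandwidth_gain_pct

# 280X Toxic: 6.4GHz -> 7.0GHz effective memory, 2-4% faster
toxic_bw_gain = pct_gain(6.4, 7.0)
for perf in (2.0, 4.0):
    print(f"280X Toxic, +{perf}% perf: {sensitivity(perf, toxic_bw_gain):.2f}")

# GTX 780 at 900 MHz: about 2% faster with 15% more bandwidth
print(f"GTX 780 @ 900 MHz: {sensitivity(2.0, 15.0):.2f}")
```

Values well below 1.0 suggest bandwidth is a contributing limit rather than the main one.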
 
64 you mean? 64 is correct, yes. I thought after the slide leak that much was certain 🙂

No, I'm talking about 16x Render Back Ends. 64 is the Color ROPs. I guess they will still have the same RBE implementation as Tahiti, 4 Color + 8 Stencil ROPs each.

If we have 16x RBEs and 2816 SPs with 44 CUs, keeping a dual front end engine, that means there will be 16x groups. Can you confirm 16 groups of CUs (double Tahiti's)?
 
Neither one makes sense for a top tier GPU. It appears that this card may be deliberately throttled so that there may be a subsequent release with better performance in the event that Nvidia releases the Titan Ultra. I guess we shall have more details in a week.
 
The recent review of the Sapphire R9 290X Toxic does show performance improvements with just a memory overclock at 1100 core clock. The memory overclock(9.375% OC) from effective 6.4GHz (307.2GB/s bandwidth) to 7.0GHz(336GB/s bandwidth, 5% more bandwidth than the stock R9 290) does 2-4% performance gains which implies that there is some memory bottlenecks(at least at 1440P).

You mean 280X Toxic, right? (for the link)
 
The point is that GDDR5 controllers have matured since GDDR5 was adopted, and AMD is bringing a 512 bit bus controller with slower memory speeds than the 384 bit bus controller from two years ago. What is the point of increasing the bus size by 33% only to obtain marginally better memory bandwidth? Samsung and Hynix already have GDDR5 memory rated at 1.75GHz (7GHz effective), and such memory on a 384 bit bus yields total memory bandwidth of 336GB/s, which actually exceeds the projected 320GB/s bandwidth of the upcoming R9 290/290X.

A few things.

There is only a 512 bit bus GDDR5.

One can claim the 384 bit buses have matured, since we see the improvement from the GTX480 to today's Tahitis and GK110s. No doubt about that.

What problems are associated with a 512 bit bus with GDDR5, we don't know.

Second, the 512 bit bus is actually smaller in die size than the 384 bit bus of the Tahitis.

So the point is to save die space and save on power/cheaper memory, since 5Gbps GDDR5 chips give 320GB/s bandwidth.

6Gbps chips are cheaper than 7Gbps.

Additionally, a 512 bit bus allows for a 4GB configuration.

That is the stock value.

If the memory can be overclocked to just 6GHz the bandwidth will be 384GB/s.

By the way the improvements from overclocking the memory from 6.4 to 7 range from 0.7% to 3.5% for that Toxic R9 280X.
 
From my own testing with Crysis 3 @ 1080p, high settings and 2x SMAA, I saw a 14% increase going from 5GHz to 7.4GHz on the memory with the same core clock. That percentage might grow at a higher resolution? Not all games respond that way though.
 