AMD Kaveri Docs Reference Quad-Channel Memory Interface, GDDR5 Option

inf64

Welcome to last year :D
The functionality is fused off; GDDR5 support is not there any more :(
 

know of fence

AMD obviously went as far with Kaveri graphics as DDR3 would allow (and a bit farther with the 7850K). The R7 250 ($90) is the last DDR3 card in the AMD lineup, and those cards only have 1800 MHz memory.


Bioshock-FR.png
 

NTMBK

To make the most of quad channel, they'd need to get the GPU clocks up. Hopefully if the process improves over time they can release a refresh.
 

USER8000

To make the most of quad channel, they'd need to get the GPU clocks up. Hopefully if the process improves over time they can release a refresh.

The 384-shader IGPs are very close in performance to the 512-shader ones though, so I suspect even at current clock speeds an increase in bandwidth would yield noticeable gains.
 

SiliconWars

Kaveri's graphics would be ~25% faster than Richland at these clocks if neither were bandwidth bound.

7770 = 1GHz * 640 GCN shaders = 640,000
5770 = 850 * 800 VLIW shaders = 680,000

Kaveri's 512 GCN at 720 MHz = 368,640
Richland's 384 VLIW4 at 844 MHz = 324,096

Theoretically they should perform very closely, but from here - http://anandtech.com/bench/product/1079?vs=1078 you can see that the 7770 is about 25-30% faster, so Kaveri should be faster than Richland by around the same margin if it weren't for the bandwidth issue.
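
For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the "shaders x clock" proxy used in this post. The function and dictionary names are made up for illustration, and the proxy itself is only a rough guess at raw throughput: it ignores architecture differences and memory bandwidth entirely.

```python
# Back-of-the-envelope "shaders x clock" proxy from the post.
# It says nothing about architecture efficiency or memory bandwidth.

def shader_clock_product(shaders: int, clock_mhz: int) -> int:
    """Return shader count multiplied by core clock in MHz."""
    return shaders * clock_mhz

# (shader count, core clock in MHz), figures as quoted in the post
GPUS = {
    "7770 (GCN)": (640, 1000),
    "5770 (VLIW)": (800, 850),
    "Kaveri (GCN)": (512, 720),
    "Richland (VLIW4)": (384, 844),
}

PRODUCTS = {name: shader_clock_product(*spec) for name, spec in GPUS.items()}

def ratio(a: str, b: str) -> float:
    """Ratio of the raw products of two GPUs in the table."""
    return PRODUCTS[a] / PRODUCTS[b]

if __name__ == "__main__":
    for name, value in PRODUCTS.items():
        print(f"{name:18s} {value:>9,d}")
    print(f"7770 / 5770:       {ratio('7770 (GCN)', '5770 (VLIW)'):.3f}")
    print(f"Kaveri / Richland: {ratio('Kaveri (GCN)', 'Richland (VLIW4)'):.3f}")
```

On these numbers the 7770 is slightly behind the 5770 on paper while Kaveri is slightly ahead of Richland, so the benchmark gap between the discrete cards has to come from something other than this raw product.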
 

SammichPG

I'm underwhelmed by high-end Kaveri. A Pentium/i3 or Athlon with a discrete GPU makes more sense: adding high-clocked DDR3 gets expensive fast, and you're still outperformed by cheaper alternatives.

Software support for HSA will arrive in years (maybe), and it's not a real selling point at this time.

They need DDR4 with 4 channels or some form of cache ASAP, or their whole GPU tech superiority will be negated by Intel brute-forcing its way with its superior process.
 

el etro

AMD would really need to consider a triple- or quad-channel memory controller come the DDR4 Kaveri refresh if they want to unleash its full potential. :ninja:

For the average consumer (the main target of today's APUs)? It is not affordable.
AMD should improve their memory controller so that they don't have to resort to putting robust RAM channels in your system in order to unleash full CPU power... they are still playing Intel's game, but insisting on a less affordable and less practical system configuration to deliver competitive performance against Intel's offerings.
AMD's luck is that 1866 MHz modules are becoming more mainstream-priced nowadays.

Kaveri's graphics would be ~25% faster than Richland at these clocks if neither were bandwidth bound.

7770 = 1GHz * 640 GCN shaders = 640,000
5770 = 850 * 800 VLIW shaders = 680,000

Kaveri's 512 GCN at 720 MHz = 368,640
Richland's 384 VLIW4 at 844 MHz = 324,096

Theoretically they should perform very closely, but from here - http://anandtech.com/bench/product/1079?vs=1078 you can see that the 7770 is about 25-30% faster, so Kaveri should be faster than Richland by around the same margin if it weren't for the bandwidth issue.

The low clock frequency dragged down Kaveri's GPU performance, but I still want to believe they're using this 28nm process as a temporary workaround for GF's 22/20nm delays. I don't want them to wait for this outdated 28nm process to mature further just to make the Kaveri IGP clock higher.
AMD also needs to get Kaveri notebook parts to OEMs as soon as they can.
 

cbn

To make the most of quad channel, they'd need to get the GPU clocks up. Hopefully if the process improves over time they can release a refresh.

05-Memory-Clock-Rate.png


In the case above, even the A8-7600 is benefiting from more memory bandwidth: a 50% increase in FPS (26 --> 39) with a 50% increase in memory clock rate (1600 --> 2400).
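
A quick way to check that claim is to compare the two percentage gains directly. The sketch below is only an illustration (the helper names are my own, and it uses nothing but the two data points quoted in this post); a scaling factor of 1.0 means FPS rose in exact proportion to the memory clock.

```python
def pct_increase(old: float, new: float) -> float:
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0

def scaling_factor(fps_old: float, fps_new: float,
                   clock_old: float, clock_new: float) -> float:
    """FPS gain divided by memory clock gain (1.0 = perfectly proportional)."""
    return pct_increase(fps_old, fps_new) / pct_increase(clock_old, clock_new)

if __name__ == "__main__":
    # A8-7600 figures quoted in the post: 26 FPS at DDR3-1600, 39 FPS at DDR3-2400
    print(f"FPS gain:   {pct_increase(26, 39):.1f}%")
    print(f"Clock gain: {pct_increase(1600, 2400):.1f}%")
    print(f"Scaling:    {scaling_factor(26, 39, 1600, 2400):.2f}")
```

A factor this close to 1.0 is what you would expect when the GPU is almost entirely limited by memory bandwidth in this test.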
 

cbn

Some price analysis I did comparing 2 x 4GB DDR3-2400 1.65 volt kit vs. various 4 x 2GB DDR3 kits--> http://forums.anandtech.com/showpost.php?p=35948687&postcount=784

At this time for users of 8GB RAM, it appears the DDR3-1600, DDR3-2133 4 x 2GB kits would have been worth it.

Not sure what is going to happen in the future though. I expect the 4GB DDR3 DIMMs to drop in price much more than the 2GB DDR3 DIMMs over time, thus increasing the price gap and lowering the value of 4 x 2GB as a cheap means of increasing memory bandwidth.
 

Paul98

Was Kaveri originally planned for faster ram speeds? If we had faster ram it would really bump up benchmarks for it. Or maybe it was testing out the chip before faster ram comes out to get full benefit from the APU?
 

monstercameron

Was Kaveri originally planned for faster ram speeds? If we had faster ram it would really bump up benchmarks for it. Or maybe it was testing out the chip before faster ram comes out to get full benefit from the APU?

Kits above DDR3-2400 get crazy high pricing.
 

el etro

Was Kaveri originally planned for faster ram speeds? If we had faster ram it would really bump up benchmarks for it. Or maybe it was testing out the chip before faster ram comes out to get full benefit from the APU?

It was tested by AMD engineers before mass production with fast-as-hell memory setups (with four channels). I don't know why they cut this feature; maybe for integration reasons (a focus on smaller PC form factors).
 

SiliconWars

05-Memory-Clock-Rate.png


In the case above, even the A8-7600 is benefiting from more memory bandwidth: a 50% increase in FPS (26 --> 39) with a 50% increase in memory clock rate (1600 --> 2400).

That's just absolutely crazy. The graphics are so ridiculously bandwidth bound - I'm surprised they didn't go with even lower (core) clocks just to save power frankly.
 

know of fence


"1920*1800" Resolution :eek:, BTW is this graph from the review? I couldn't find it.

I believe there is a way to misinterpret that graph, if you scrap the highest bar. It's basically a poster child for diminishing returns: 8 -> 6 -> 4 -> 3 FPS gained every 267 MHz. We are used to seeing no improvement at all from a simple system RAM OC with a dGPU, but I'm fairly certain diminishing returns is what you'd see once you vary the memory frequency on any dGPU.

The engineering goal was to strike a balance between computing cores and bandwidth, not to reach the maximum bandwidth to saturate x amount of "cores". I don't know this for a fact, it's just common sense speaking, please correct me if I'm wrong.
 

cbn

"1920*1800" Resolution :eek:, BTW is this graph from the review? I couldn't find it.

http://translate.google.com/transla...eview-test,testberichte-241474-9.html&act=url

(It's from Tomshardware.de)

P.S. The resolution must be a mistake. (I'm sure they meant 1920 x 1080, not 1920 x 1800)

I believe there is a way to misinterpret that graph, if you scrap the highest bar. It's basically a poster child for diminishing returns: 8 -> 6 -> 4 -> 3 FPS gained every 267 MHz.

Even at the DDR3-2133 (36 FPS) to DDR3-2400 (39 FPS) step the gains are still quite good. (Jumping from DDR3-2133 to DDR3-2400 is a 12.5% increase in bandwidth that yields an 8.3% increase in FPS. That is pretty amazing IMO.)
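
To put the diminishing-returns question in numbers, here is a rough Python sketch that looks at each step separately instead of the whole 1600 to 2400 range. It only uses the three data points quoted in this thread (26, 36 and 39 FPS); the names are made up for illustration, and the other bars in the chart are left out because their exact values are not quoted here.

```python
# (memory speed in MT/s, FPS) for the A8-7600, as quoted in this thread
POINTS = [(1600, 26), (2133, 36), (2400, 39)]

def pct_increase(old: float, new: float) -> float:
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0

def step_scaling(points):
    """Return (bandwidth gain %, FPS gain %, FPS gain / bandwidth gain) per step."""
    steps = []
    for (mem_a, fps_a), (mem_b, fps_b) in zip(points, points[1:]):
        bw_gain = pct_increase(mem_a, mem_b)
        fps_gain = pct_increase(fps_a, fps_b)
        steps.append((bw_gain, fps_gain, fps_gain / bw_gain))
    return steps

if __name__ == "__main__":
    for (mem_a, _), (mem_b, _), (bw, fps, factor) in zip(
            POINTS, POINTS[1:], step_scaling(POINTS)):
        print(f"DDR3-{mem_a} -> DDR3-{mem_b}: "
              f"+{bw:.1f}% bandwidth, +{fps:.1f}% FPS, factor {factor:.2f}")
```

On these three points the last step still returns about two thirds of the added bandwidth as FPS, which is lower than the earlier step but far from flat.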
 

nismotigerwvu

Even at the DDR3-2133 (36 FPS) to DDR3-2400 (39 FPS) step the gains are still quite good. (Jumping from DDR3-2133 to DDR3-2400 is a 12.5% increase in bandwidth that yields an 8.3% increase in FPS. That is pretty amazing IMO.)

Agreed, and they very well might have had to dial back timings and whatnot to get the clocks up that high.
 

know of fence

Thanks for providing the link. I get it, your name I mean. The user is the biggest 'Computer Bottleneck'. But let me reiterate, a bottleneck isn't a good analogy for the relationship between graphics and bandwidth.

The R7 240 is a 30 W TDP, $70 card with the same 128-bit bus as a dual-channel Kaveri. So 128 bit * 1.800 GHz / 8 bit/byte = 28.8 GB/s. That's about a tenth of what the R9 290X has, but they have almost the same shader processor to bandwidth ratio.
http://en.wikipedia.org/wiki/Volcanic_Islands_(GPU_family)
Midrange cards, on the other hand, tend to be shifted slightly towards more bandwidth, so they are less bandwidth constrained.
Anyways, if you tried to place a quad-channel 2166 MHz chip into the Wiki table we'd be at 69.3 GB/s and still not within reach of GDDR5 discrete cards. Not to mention that basically mandatory 4 DIMMs don't really jibe with the whole HTPC idea.
That said, the longer I think about it the more appealing the idea of quad channel becomes (quad DDR3-1333), but I'm not sure what that would do to power consumption.
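
The bandwidth figures in this post all come from the same formula: bus width times effective transfer rate, divided by 8 bits per byte. A small Python sketch of it follows (the function name is my own, and these are theoretical peak numbers, not measured throughput). Note that the 69.3 GB/s figure corresponds to a 2166 MT/s rate; the standard DDR3-2133 speed grade comes out slightly lower.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    bus_width_bits:    total bus width (128 for dual-channel DDR3, 256 for quad)
    transfer_rate_mts: effective transfer rate in MT/s (e.g. 1800 for DDR3-1800)
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000.0

if __name__ == "__main__":
    print(f"128-bit @ 1800 MT/s: {peak_bandwidth_gbs(128, 1800):.1f} GB/s")
    print(f"256-bit @ 2166 MT/s: {peak_bandwidth_gbs(256, 2166):.1f} GB/s")
    print(f"256-bit @ 2133 MT/s: {peak_bandwidth_gbs(256, 2133):.1f} GB/s")
    print(f"256-bit @ 1333 MT/s: {peak_bandwidth_gbs(256, 1333):.1f} GB/s")
```

Even the modest quad DDR3-1333 case lands well above what dual-channel DDR3-2400 can offer on paper, which is why the idea gets more appealing the longer you look at it.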

Maybe with Mantle, hUMA and HSA we will see some savings in redundant copying.
 

cbn

The R7 240 is a 30 W TDP, $70 card with the same 128-bit bus as a dual-channel Kaveri. So 128 bit * 1.800 GHz / 8 bit/byte = 28.8 GB/s. That's about a tenth of what the R9 290X has, but they have almost the same shader processor to bandwidth ratio.
http://en.wikipedia.org/wiki/Volcanic_Islands_(GPU_family)

Yes. In fact, when adjusted for GPU core speed I would say the ratio is almost identical:

R7 240 DDR3: 320 stream processors @ 730/780 MHz, 28.8 GB/s memory bandwidth
R9 290X: 2816 stream processors @ up to 1000 MHz, 320 GB/s memory bandwidth
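
Here is one way to work out that ratio, as a hedged Python sketch: shader count times core clock in GHz, divided by memory bandwidth in GB/s. The function name and the choice of units are my own; the inputs are the two spec lines above.

```python
def shader_ghz_per_gbs(shaders: int, clock_mhz: float, bandwidth_gbs: float) -> float:
    """Shader count x core clock (GHz) per GB/s of memory bandwidth."""
    return shaders * (clock_mhz / 1000.0) / bandwidth_gbs

if __name__ == "__main__":
    # R7 240 DDR3 at its base (730 MHz) and boost (780 MHz) clocks
    print(f"R7 240 @ 730 MHz:  {shader_ghz_per_gbs(320, 730, 28.8):.1f}")
    print(f"R7 240 @ 780 MHz:  {shader_ghz_per_gbs(320, 780, 28.8):.1f}")
    # R9 290X at its "up to" clock
    print(f"R9 290X @ 1000 MHz: {shader_ghz_per_gbs(2816, 1000, 320):.1f}")
```

At the boost clock the R7 240 comes out at roughly 8.7 against 8.8 for the R9 290X, so the two cards really do sit at nearly the same compute-to-bandwidth balance.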
 

cbn

Looking back at those 2 x 4GB DDR3-2400 1.65 volt kits, I am noticing they also have much looser timings than the 4 x 2GB DDR3-1600 1.5 volt kits.

So if Kaveri had come with a quad-channel DDR3 memory controller, then for around $10 more (by Newegg prices) a person would have gotten 51.2 GB/s of bandwidth vs. 38.4 GB/s, tighter timings of 9-9-9-24 vs. 11-13-13-35, and lower voltage at 1.5 V vs. 1.65 V by going with 4 x 2GB DDR3-1600 over 2 x 4GB DDR3-2400.
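
As a rough cross-check of the two kits, the Python sketch below computes peak bandwidth from the channel count (assuming the usual 64-bit DDR3 channel) and also converts the CAS latency from clock cycles into nanoseconds. The function names are made up for illustration, and the latency conversion is an addition of mine, not something from the price analysis.

```python
DDR3_CHANNEL_BITS = 64  # width of one DDR3 channel

def kit_bandwidth_gbs(channels: int, rate_mts: float) -> float:
    """Theoretical peak bandwidth in GB/s for a given channel count and MT/s rate."""
    return channels * DDR3_CHANNEL_BITS / 8 * rate_mts / 1000.0

def cas_latency_ns(cas_cycles: int, rate_mts: float) -> float:
    """CAS latency in nanoseconds; the memory clock is half the MT/s rate."""
    return cas_cycles * 2000.0 / rate_mts

if __name__ == "__main__":
    print(f"4 x DDR3-1600 CL9:  {kit_bandwidth_gbs(4, 1600):.1f} GB/s, "
          f"{cas_latency_ns(9, 1600):.2f} ns CAS")
    print(f"2 x DDR3-2400 CL11: {kit_bandwidth_gbs(2, 2400):.1f} GB/s, "
          f"{cas_latency_ns(11, 2400):.2f} ns CAS")
```

The bandwidth advantage of the hypothetical quad-channel setup holds up, but the timings are "tighter" only in cycles: in nanoseconds the DDR3-2400 CL11 kit actually has the shorter CAS latency.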
 

know of fence

Yes. In fact, when adjusted for GPU core speed I would say the ratio is almost identical: 8.7 to 8.8

I'm just going to pretend that it's exactly what I meant. :cool:

Also, to put things in perspective: Kaveri has 6 to 8 GPU compute cores, while the Xbox has 14 and the PS4 has 20, with the Xbox running slightly higher clocks.
Only the Radeon R9 270 is comfortably above parity with the consoles.

Even though AMD has to market Kaveri as 1080p capable, I suspect the large number of 16:10 monitors and HD-ready (1366x768) TVs out there is where the best experience is to be had. I suspect bandwidth dependency will decrease along with the resolution.

The Tech Report called Kaveri the gateway drug to gaming. I see it as a great 'buy now, hand down to relatives later' PC, until some worthy console ports arrive or Intel starts selling bare-die CPUs for the desktop again.
 

Erenhardt

Looking back at those 2 x 4GB DDR3-2400 1.65 volt kits, I am noticing they also have much looser timings than the 4 x 2GB DDR3-1600 1.5 volt kits.

So if Kaveri had come with a quad-channel DDR3 memory controller, then for around $10 more (by Newegg prices) a person would have gotten 51.2 GB/s of bandwidth vs. 38.4 GB/s, tighter timings of 9-9-9-24 vs. 11-13-13-35, and lower voltage at 1.5 V vs. 1.65 V by going with 4 x 2GB DDR3-1600 over 2 x 4GB DDR3-2400.

Wouldn't 4 x 2GB DDR3-2400 be better in quad channel as well? 70+ GB/s is around what 128-bit GDDR5 provides.

AMD may want to try the memory compression solution from the next generation of NVIDIA cards, if it proves its worth. But that is looking far into the future.