APU13- AMD Kaveri details- 856Gflops, 3.7Ghz CPU,720Mhz GPU

Status
Not open for further replies.

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
Fair enough.
The way you wrote that seemed like two connected thoughts.

yep I should try to be more clear...

any thoughts on it though? going from javascript -> opencl

i know you can go c/c++ to javascript via emscripten, but what about the other way around? [not webcl of course]
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
Now my memory may be faulty, but I am pretty sure a few months ago I expressed extreme skepticism that Kaveri's IGP would be faster than an ATI HD5770, and there were a few regular posters here who claimed that I was wrong and that Kaveri would easily beat an HD5770. :whiste:

From the BF4 chart I posted earlier:
7750:21FPS
5770:15FPS

So when bandwidth is not a bottleneck, 512 GCN cores @800MHz are considerably faster than 5770.

So, I wouldn't dance the victory dance quite yet ;)
 

SPBHM

Diamond Member
Sep 12, 2012
5,068
423
126
From the BF4 chart I posted earlier:
7750:21FPS
5770:15FPS

So when bandwidth is not a bottleneck, 512 GCN cores @800MHz are considerably faster than 5770.

So, I wouldn't dance the victory dance quite yet ;)

"when bandwidth is not a bottleneck"

it's going to be for most games with this GPU and 128bit DDR3.
http://www.hardware.fr/focus/76/amd-radeon-hd-7750-ddr3-test-cape-verde-etouffe.html
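
To put rough numbers on the bandwidth point, here is a minimal sketch of theoretical peak memory bandwidth (bus width times transfer rate). The DDR3 speed grades and the 4800 MT/s effective GDDR5 rate for the HD 5770 are assumptions taken from public specs, not from this thread, and real-world throughput will be lower than these peaks.

```python
def peak_bandwidth_gbs(bus_width_bits: int, mega_transfers_per_sec: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9


# Dual-channel DDR3 is a 128-bit interface (2 x 64-bit channels).
# These speed grades are illustrative assumptions.
ddr3_speeds = {"DDR3-1600": 1600, "DDR3-1866": 1866, "DDR3-2133": 2133}
ddr3_results = {
    name: peak_bandwidth_gbs(128, mts) for name, mts in ddr3_speeds.items()
}
for name, gbs in ddr3_results.items():
    print(f"{name} dual channel: {gbs:.1f} GB/s")

# HD 5770: 128-bit GDDR5, assumed 4800 MT/s effective.
hd5770_gbs = peak_bandwidth_gbs(128, 4800)
print(f"HD 5770 GDDR5: {hd5770_gbs:.1f} GB/s")
```

Under these assumptions, even DDR3-2133 leaves the APU with well under half of the discrete card's peak bandwidth, and the APU has to share it with the CPU cores.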

also this is a single game; the 5770 is as fast as the 7750 GDDR5 in many others (http://www.techpowerup.com/reviews/ASUS/HD_7750/26.html), although with the low tessellation performance of the 5000 series, and software optimization that is probably no longer comparable, I would expect the 7750 GDDR5 to be looking better now.

overall I would expect the 5770 to be ahead of Kaveri, but under some conditions it could be behind...

I think the closest competition is going to be the GT 640 (DDR3)
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
It depends mostly on the settings. The 5770 will be faster than Kaveri at FHD and higher resolutions, where bandwidth matters most.

But when it comes to core speed, the 5770 is about the same as the 7750.

We can't say much about memory yet. We don't know what changes they made or how they will affect performance. But it would take a miracle to match the 5770's 76GB/s with DDR3 sticks.

Here is the difference between 512 GCN (819 GFLOPS) and 480 VLIW (624 GFLOPS) with a gddr3 bottleneck, which translates (not exactly) into the Kaveri vs Richland performance difference.
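
For anyone wondering where GFLOPS figures like these come from, a rough sketch: peak single-precision throughput is usually quoted as shaders x 2 FLOPs per clock (one fused multiply-add) x clock speed. The 2 FLOPs/clock factor is the usual convention and an assumption here, and the 650 MHz clock for the 480-shader part is only what the 624 GFLOPS figure implies, not something stated in the post.

```python
def shader_peak_gflops(shaders: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Peak single-precision GFLOPS, assuming one FMA (2 FLOPs) per shader per clock."""
    return shaders * flops_per_clock * clock_mhz / 1000.0


def implied_clock_mhz(shaders: int, gflops: float, flops_per_clock: int = 2) -> float:
    """Clock speed that a quoted peak GFLOPS figure implies for a given shader count."""
    return gflops * 1000.0 / (shaders * flops_per_clock)


gcn_gflops = shader_peak_gflops(512, 800)   # 512 GCN cores at 800 MHz
vliw_clock = implied_clock_mhz(480, 624)    # clock implied by 624 GFLOPS
print(f"512 GCN @ 800 MHz: {gcn_gflops:.1f} GFLOPS")
print(f"480 VLIW @ 624 GFLOPS implies {vliw_clock:.0f} MHz")
```

Peak FLOPS says nothing about utilisation, so this is only an upper bound for comparing the two architectures.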
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,086
3,929
136
It's called making a prediction.

Not so dissimilar to inf64 predicting that Bulldozer would be the fastest desktop processor money could buy, except that I will be on the money, unlike him. :cool:

GCN has far better utilisation than VLIW, they also made major caching changes on GCN, and we don't know how big the L2 is on the GPU. Add in HSA to reduce off-chip memory writes between CPU and GPU and I wouldn't be anywhere near as cocky as you're being. Just look at games that use compute shaders to see how much better "flop per flop" GCN is than VLIW.
 
Aug 11, 2008
10,451
642
126
The comparison to me is the HD7750 GDDR5, which itself is already 2 years old. The problem I see is still the same one that APUs have always had: better GPU performance than the casual user needs, but inferior to a low-end CPU paired with AMD's own low-end discrete card for gaming or other GPU-intensive use at a very similar price.
 
Nov 2, 2013
105
2
81
yep I should try to be more clear...

any thoughts on it though? going from javascript -> opencl

i know you can go c/c++ to javascript via emscripten, but what about the other way around? [not webcl of course]

Not a clue.

Probably a question better asked over at /r/programming or some such place.

A cursory google led only to WebCL.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
The reduction from estimated Gflops of 1050 makes me think one of the following may be true:

More TDP was dedicated to the CPU side than originally planned because either Steamroller is a bit weaker than desired or the GF 28nm process is less than ideal.

OR

The top SKUs will see a reduction in TDP from 100W to something like ~80-85W.

OR

The memory controller is not much improved over Richland and pushing the iGP further is generally wasteful even if it results in higher theoretical flops.
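
As a sanity check on the headline number, here is one way the 856 GFLOPS figure could be reproduced from the clocks in the thread title. This is a sketch under assumptions: 512 GCN shaders at 2 FLOPs per clock, plus 4 CPU cores at 8 single-precision FLOPs per cycle each. The CPU core count and per-cycle figure are assumptions, not something stated in this thread, and the earlier 1050 estimate may have been calculated differently.

```python
# Figures from the thread title.
GPU_CLOCK_GHZ = 0.720
CPU_CLOCK_GHZ = 3.7

# Assumptions (not confirmed in this thread).
GPU_SHADERS = 512               # GCN shader count discussed in the thread
GPU_FLOPS_PER_CLOCK = 2         # one fused multiply-add per shader per clock
CPU_CORES = 4                   # assumed core count
CPU_FLOPS_PER_CYCLE = 8         # assumed single-precision FLOPs per core per cycle

gpu_gflops = GPU_SHADERS * GPU_FLOPS_PER_CLOCK * GPU_CLOCK_GHZ
cpu_gflops = CPU_CORES * CPU_FLOPS_PER_CYCLE * CPU_CLOCK_GHZ
total_gflops = gpu_gflops + cpu_gflops

print(f"GPU:   {gpu_gflops:.2f} GFLOPS")
print(f"CPU:   {cpu_gflops:.2f} GFLOPS")
print(f"Total: {total_gflops:.2f} GFLOPS")

# GPU clock that a 1050 GFLOPS total would need with the same CPU contribution.
implied_gpu_clock_ghz = (1050 - cpu_gflops) / (GPU_SHADERS * GPU_FLOPS_PER_CLOCK)
print(f"GPU clock implied by 1050 GFLOPS: {implied_gpu_clock_ghz * 1000:.0f} MHz")
```

If these assumptions hold, the gap between 1050 and 856 is almost entirely a GPU clock difference of roughly 190 MHz, which is consistent with any of the explanations above.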
 

Meekers

Member
Aug 4, 2012
156
1
76
I don't think this is accurate. What "games" were you playing at 720p, and what SKUs of Llano or even Richland?

I used the top part both times.

Llano: a8-3850
Richland: a10-6800k

Games I tried with Llano that it really struggled with at low settings were GW2 and Civ5. It could play the games, but in GW2, if too much was going on, frames maxed at 30. Civ5 turn times were so long you could go to the bathroom and come back and it was still waiting. It did manage to run less intensive games like Orcs Must Die and Hellgate London just fine.

Also these computers are at my business so I did not really have time to use them for extended periods. I will say I have been very happy with how the Richland computer has performed when I have got a chance to use it. I also used 1600 ram for both builds as gaming was not their focus.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
The reduction from estimated Gflops of 1050 makes me think one of the following may be true:

More TDP was dedicated to the CPU side than originally planned because either Steamroller is a bit weaker than desired or the GF 28nm process is less than ideal.

OR

The top SKUs will see a reduction in TDP from 100W to something like ~80-85W.

OR

The memory controller is not much improved over Richland and pushing the iGP further is generally wasteful even if it results in higher theoretical flops.

OR

It was all they could bin them for within 100W TDP.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
OR

It was all they could bin them for within 100W TDP.

More TDP was dedicated to the CPU side than originally planned because either Steamroller is a bit weaker than desired or the GF 28nm process is less than ideal.

Pretty much covers being reined in by process or lower than expected performance at 100W TDP.
 

CHADBOGA

Platinum Member
Mar 31, 2009
2,135
833
136
GCN has far better utilisation than VLIW, they also made major caching changes on GCN, and we don't know how big the L2 is on the GPU. Add in HSA to reduce off-chip memory writes between CPU and GPU and I wouldn't be anywhere near as cocky as you're being. Just look at games that use compute shaders to see how much better "flop per flop" GCN is than VLIW.

Hmmmmmmmmmmmm, HSA. :awe:
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
What I find interesting is that two key points from previous leaks that were labeled either fake or a big downgrade ( TrueAudio being "fake" and not in Kaveri and Kaveri running at much lower clock speeds than BD/PD) are busted :D.

Facts:
Kaveri has TrueAudio block on-die. It's not fake, it's the same stuff AMD put in the Hawaii GPUs.
AMD's claim that they will maintain a "high frequency" engine is also true. That one SKU we saw in the footnote (which IMO might not even be the "top A10" model ;) ) is running at a 3.7Ghz base clock, which is 100Mhz lower than the 5800K, launched Oct 1 2012 (a year ago). The pipeline obviously can take the clock to the ~4+ Ghz range and it's nice AMD has some room to push for more performance, especially on the GPU side.

It looks good for AMD. Kaveri will be a good improvement on both the x86 and GPU fronts, keeping them afloat in the mainstream desktop segment. If Kaveri can be paired up with GCN cards and maintain good frame latency with the newest Catalyst drivers in Hybrid CF, then AMD will just rule the mainstream gaming segment. They can even bundle Kaveri, an FM2+ board, their own branded memory and a GCN card and make great deals on newegg or similar places.

And then in 2015: new socket, new DDR standard support, new core (just in time for Broadwell) and new GPU on-die with even more SPs and more HSA capability. If we are lucky we might get 768 SPs on the iGPU. Also by that time Carrizo, based on the Excavator core, will most likely be strong enough even with just 2 modules (which will be considerably stronger IPC-wise than BD/PD) to push even CF of the highest-end GFX cards without any issues. Fun times ahead :)
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
Hmmmmmmmmmmmm, HSA. :awe:


AMD Announces New Unified SDK, Tools and Accelerated Libraries for Heterogeneous Computing Developers


also,

AMD Enables Server APU Software to Reimagine the Server

Project Sumatra – a joint Oracle and AMD project done in open source that enables developers to code in Java and take advantage of GPU compute;
GCC/HSA Project – an AMD and SUSE project to enable the popular open source Linux compiler, GCC, to support HSA, targeting OpenMP APIs;
PGI Accelerator™ Compiler – a beta version is available that enables developers to add OpenACC directives that support AMD APUs and discrete GPUs to Windows and Linux Fortran, C and C++ programs;
clMath – AMD OpenCL math libraries that were contributed to open source in August enable developers to accelerate common scientific and engineering computations on AMD APUs and discrete GPUs;
ArrayFire 2.0 for OpenCL – a fast math library by AccelerEyes that utilizes clMath for GPU computing and offers an easy-to-use API for Windows or Linux developers;
CodeXL 1.3 – AMD’s comprehensive developer tool suite for Windows and Linux that features remote debugging and profiling to enable server application developers.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
What I find interesting is that two key points from previous leaks that were labeled either fake or a big downgrade ( TrueAudio being "fake" and not in Kaveri and Kaveri running at much lower clock speeds than BD/PD) are busted :D.

Facts:
Kaveri has TrueAudio block on-die. It's not fake, it's the same stuff AMD put in the Hawaii GPUs.
AMD's claim that they will maintain a "high frequency" engine is also true. That one SKU we saw in the footnote (which IMO might not even be the "top A10" model ;) ) is running at a 3.7Ghz base clock, which is 100Mhz lower than the 5800K, launched Oct 1 2012 (a year ago). The pipeline obviously can take the clock to the ~4+ Ghz range and it's nice AMD has some room to push for more performance, especially on the GPU side.

So now we pretend the 6800K doesn't exist? You know, with a 4.1Ghz base clock. But what's a 400Mhz drop between friends.

So let's see:
No 3M/6T models.
No quadchannel.
Slower IGP clock than expected.
Slower CPU clock than expected.

Did I miss anything? Besides the usual nonsense about how the next AMD uarch will really beat Intel this time?

Or are we already over in the "next product will really perform" era?
http://semiaccurate.com/forums/showpost.php?p=194466&postcount=384
 

Gloomy

Golden Member
Oct 12, 2010
1,469
21
81
I'm baffled by its performance with BF4 @ 1080p. That's twice what I expect from 512GCN cores running on DDR3.

It is running on DDR3, right?
 

Unoid

Senior member
Dec 20, 2012
461
0
76
So now we pretend the 6800K doesn't exist? You know, with a 4.1Ghz base clock. But what's a 400Mhz drop between friends.

So let's see:
No 3M/6T models.
No quadchannel.
Slower IGP clock than expected.
Slower CPU clock than expected.

Did I miss anything? Besides the usual nonsense about how the next AMD uarch will really beat Intel this time?

Or are we already over in the "next product will really perform" era?
http://semiaccurate.com/forums/showpost.php?p=194466&postcount=384

Am I missing something? I'm here for APU13 related Kaveri news.

Not biased opinionated talking heads intending to flamebait. Mods please clean up these tech forums. We're here for tech news, not a banter battleground over how much cache a chip has. :$
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
@Shintai

Richland for desktop arrived on the market in June this year, just a heads-up since you missed it. That's very recent. AMD had 2.5 years to work on the 32nm node to get to a 4.1Ghz base clock on the APU product line. Kaveri is just 10% slower clock-wise; that's peanuts considering they are working with a new microarchitecture with big core changes, done on a new node (no big core from AMD had been done on this node before SR ;) ). It's a testament to their design team: they have managed 90% of the super-tweaked PD frequency range straight at launch while being on a new node.


3M was cancelled early on (1.5 years ago). It was in the 1st version of SOG for SR.
Quad channel for GDDR5 (4x32bit) is there, taking die space ;). This won't go away as it's part of the IMC. It's not a huge die area that is wasted tho, so no biggie :).
Slower IGP clock running GFLops measurement? Hawaii has up to 1Ghz clock range and it usually runs games on clocks that sit between 900 and 950Mhz ;). AMD has a new TurboCore version that is even better now in deciding what needs more performance and when. 720Mhz can become 800+ Mhz in games and 3.7Ghz on CPU may become 4Ghz in non-prime/powervirus type of software while GPU is not stressed too much (so power budget can be redirected to CPU side).
CPU base clock is 90% of Richland milked to the max ;). That is great. Room to grow also, AMD can crank it up and get "free" 10-15% performance if need be (akin to 5800K->6800K).

You missed a lot it seems. Keep reading :)
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
Am I missing something? I'm here for APU13 related Kaveri news.

Not biased opinionated talking heads intending to flamebait. Mods please clean up these tech forums. We're here for tech news, not a banter battleground over how much cache a chip has. :$
Let me get this straight: we're supposed to just post news, and not discuss it? That's a pretty odd view to hold, given that this is a forum, and all.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
If you lower settings enough...

Battlefield 4 1080P at Medium according to the AMD presenter. Which falls into line with what the standalone 7750 DDR3 can do:

http://www.hardware.fr/focus/76/amd-radeon-hd-7750-ddr3-test-cape-verde-etouffe.html

24FPS Battlefield 3 1080P Medium Radeon 7750 DDR3 + stock i7-3770K

I'd imagine the card is bandwidth starved at Medium, so doesn't reveal anything regarding the CPU side of Kaveri. Especially since they didn't show a combat scene.
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
@Shintai

snip


3M was cancelled early on (1.5 years ago). It was in the 1st version of SOG for SR.
Quad channel for GDDR5 (4x32bit) is there, taking die space ;). This won't go away as it's part of the IMC. It's not a huge die area that is wasted tho, so no biggie :).

snip

what is your source on the gddr5 controller?
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
AMD claimed 1050 Gflops for Kaveri months ago. 856 Gflops is much lower than expected.

Yeah, I'm guessing the bandwidth just isn't there to push the iGPU faster. Other than that o_O.

Pretty happy to see they are able to hit 3.7 GHz. AMD will need a solid 15% gain in IPC to best Richland in raw ST performance though (I think MP performance will easily be better - finally).
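
A back-of-the-envelope way to frame the IPC question, assuming single-threaded performance scales as IPC x clock and comparing base clocks only (3.7 GHz for the Kaveri SKU, 4.1 GHz for the 6800K, both figures from this thread). Turbo behaviour is not modelled, so treat the result as a rough break-even point rather than a prediction.

```python
def ipc_gain_to_match(old_clock_ghz: float, new_clock_ghz: float) -> float:
    """Fractional IPC gain needed for the new chip to match the old one,
    assuming performance = IPC x clock and nothing else changes."""
    return old_clock_ghz / new_clock_ghz - 1.0


def clock_deficit(old_clock_ghz: float, new_clock_ghz: float) -> float:
    """Fraction by which the new clock is lower than the old clock."""
    return 1.0 - new_clock_ghz / old_clock_ghz


RICHLAND_BASE_GHZ = 4.1   # 6800K base clock, as quoted in the thread
KAVERI_BASE_GHZ = 3.7     # Kaveri SKU base clock from the thread title

break_even = ipc_gain_to_match(RICHLAND_BASE_GHZ, KAVERI_BASE_GHZ)
deficit = clock_deficit(RICHLAND_BASE_GHZ, KAVERI_BASE_GHZ)

print(f"Clock deficit vs Richland: {deficit:.1%}")
print(f"IPC gain needed to break even: {break_even:.1%}")
```

On base clocks alone the break-even is a bit under 11%, so a 15% IPC gain would put Kaveri a few percent ahead in single-threaded work under this simple model.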
 