Digital Foundry: next-gen PlayStation and Xbox to use AMD's 8-core CPU and Radeon HD

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
How is the gaming workload for FP/integer vs. normal office use? I would guess it's more FP heavy, but that's just an assumption.

The question is also, why have a 128-bit FP pipeline in Jaguar, as it's quite a shift from Bobcat? (Still, the core is 3.1mm² sans L2, but isn't the FPU power expensive too?)

PC gaming is pretty much integer dependent, that's why AVX2 is much more important to our segment than AVX could ever be (which was worthless).

I wonder if the cores in these consoles will support it, that would really help drive code implementation and design for PCs. :hmm:
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,055
3,862
136
No, I mean Jaguar. Bobcat was a decent amount behind Athlon 64, let alone Core 2.

http://www.planet3dnow.de/cgi-bin/newspub/viewnews.cgi?id=1361486916

Based on the performance and power figures, I think if we can normalize things it'd have similar perf/watt to the Samsung Exynos "Octa".

I don't read German, but if those are true then Jaguar has the same IPC as Piledriver in Cinebench 11.5 single-thread and higher in multi-thread. Which, from what I can see, is just a touch under Core 2 IPC for Cinebench 11.5 (a Q6600 scores around 4 @ 3.8GHz).
 

NTMBK

Lifer
Nov 14, 2011
10,423
5,727
136
PC gaming is pretty much integer dependent, that's why AVX2 is much more important to our segment than AVX could ever be (which was worthless).

I wonder if the cores in these consoles will support it, that would really help drive code implementation and design for PCs. :hmm:

Doubtful- standard Jaguar only supports as high as AVX. (And it's a 128 bit vector pipeline, so it won't get any real benefit from using AVX, like SSE on an Athlon 64.)
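To illustrate the 128-bit point, here is a rough Python sketch (not real SIMD code) of why a 256-bit AVX add on a 128-bit datapath costs about the same as two SSE adds. The function name and the pass-counting model are made up for illustration; real hardware scheduling is more involved.

```python
# Toy model: a vector add is executed in chunks as wide as the datapath.
LANE_BITS = 32  # one single-precision float per lane


def vector_add_passes(a, b, datapath_bits=128):
    """Add two equal-length float vectors and count datapath passes."""
    lanes_per_pass = datapath_bits // LANE_BITS
    result, passes = [], 0
    for start in range(0, len(a), lanes_per_pass):
        chunk_a = a[start:start + lanes_per_pass]
        chunk_b = b[start:start + lanes_per_pass]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        passes += 1
    return result, passes


a = [1.0] * 8  # eight floats = one 256-bit AVX register
b = [2.0] * 8

avx_result, avx_passes = vector_add_passes(a, b)          # one 256-bit op
sse_result, sse_passes = vector_add_passes(a[:4], b[:4])  # one 128-bit op
print(avx_passes, sse_passes)
```

In this model the 256-bit operation needs two passes through the 128-bit unit, so the wider instruction saves decode work at best, not execution throughput.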
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,055
3,862
136
Two things: 1. Games are FP heavy as well. How do you expect any physics with just int code? 2. Who said you can't use a float for int code :whistle:.

Doubtful- standard Jaguar only supports as high as AVX. (And it's a 128 bit vector pipeline, so it won't get any real benefit from using AVX, like SSE on an Athlon 64.)

Says who? If you have an instruction that isn't in SSE but is in AVX then there is a direct benefit.
 

NTMBK

Lifer
Nov 14, 2011
10,423
5,727
136
Says who? If you have an instruction that isn't in SSE but is in AVX then there is a direct benefit.

That's true, yes, but the number of instructions which aren't just widened versions of existing SSE ones is fairly small. It'll be a fairly narrow use case, it's not like it has AVX2's gather operations or anything.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,691
136
Nice catch on the p3dnow news :). Jaguar is a whopping 32% faster than Bobcat in the ST C11.5 subtest. When we went from K8 to K10 (128-bit FP pipes) we only saw a ~15% performance jump in Cinebench. PD has the same ST performance, but due to the module sharing penalty it has a 20% lower MT score, which ain't that bad considering it has only one FP unit which is shared between two cores (making its 1 FP pipeline 62% more efficient than Jaguar's 1 FP pipeline: PD's 2 128-bit SSE pipelines scoring 1.14pts vs JG's 4 128-bit SSE pipelines scoring 1.39).
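A quick sanity check of that per-pipeline arithmetic, using only the two MT scores and pipeline counts quoted above (the variable names are just for illustration). With these rounded inputs it comes out at roughly 64% and 18%, in the same ballpark as the 62% and 20% figures, which may have been based on unrounded scores.

```python
# Cinebench 11.5 MT scores and FP pipeline counts as quoted in the post.
pd_score, pd_pipes = 1.14, 2   # Piledriver: 2 x 128-bit pipelines
jg_score, jg_pipes = 1.39, 4   # Jaguar: 4 x 128-bit pipelines

pd_per_pipe = pd_score / pd_pipes
jg_per_pipe = jg_score / jg_pipes

# How much more each Piledriver pipeline delivers than each Jaguar pipeline.
per_pipe_advantage = pd_per_pipe / jg_per_pipe - 1

# How far Piledriver's MT score trails Jaguar's.
mt_deficit = 1 - pd_score / jg_score

print(f"per-pipeline advantage: {per_pipe_advantage:.1%}")
print(f"MT deficit: {mt_deficit:.1%}")
```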
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,055
3,862
136
That's true, yes, but the number of instructions which aren't just widened versions of existing SSE ones is fairly small. It'll be a fairly narrow use case, it's not like it has AVX2's gather operations or anything.

Yes, but there is also the ability for a+b=c, whereas in SSE (correct me if I'm wrong) it has to be a+b=a|b. Also GPUs have been doing gather since like forever, so with a single coherent memory space and a fast interconnect you could just do them on the GPU.
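To show what I mean by a+b=c versus a+b=a|b, here is a toy register machine in Python. The opcode names `add2` and `add3` are invented for this sketch and only model the destructive two-operand form versus the non-destructive three-operand form; they are not real mnemonics.

```python
def run(program, regs):
    """Execute (op, dst, *srcs) tuples against a dict of registers."""
    for op, dst, *srcs in program:
        if op == "mov":      # dst = src
            regs[dst] = regs[srcs[0]]
        elif op == "add2":   # two-operand: dst = dst + src, old dst is lost
            regs[dst] = regs[dst] + regs[srcs[0]]
        elif op == "add3":   # three-operand: dst = src1 + src2, sources kept
            regs[dst] = regs[srcs[0]] + regs[srcs[1]]
    return regs


# Goal: xmm2 = xmm0 + xmm1 while keeping both inputs alive.
sse_program = [("mov", "xmm2", "xmm0"),    # extra copy to protect xmm0
               ("add2", "xmm2", "xmm1")]
avx_program = [("add3", "xmm2", "xmm0", "xmm1")]

sse_regs = run(sse_program, {"xmm0": 1.0, "xmm1": 2.0, "xmm2": 0.0})
avx_regs = run(avx_program, {"xmm0": 1.0, "xmm1": 2.0, "xmm2": 0.0})
print(len(sse_program), len(avx_program))
```

Both versions end with the same register contents, but the two-operand version spends an extra move whenever both inputs are still needed afterwards.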

Nice catch on the p3dnow news :). Jaguar is a whopping 32% faster than Bobcat in the ST C11.5 subtest. When we went from K8 to K10 (128-bit FP pipes) we only saw a ~15% performance jump in Cinebench. PD has the same ST performance, but due to the module sharing penalty it has a 20% lower MT score, which ain't that bad considering it has only one FP unit which is shared between two cores (making its 1 FP pipeline 62% more efficient than Jaguar's 1 FP pipeline: PD's 2 128-bit SSE pipelines scoring 1.14pts vs JG's 4 128-bit SSE pipelines scoring 1.39).

You will find that's all from the OoOE side of things: Jaguar has something like a 16-entry-deep FP scheduler, Bulldozer has 60. Bulldozer likely has a better (more power hungry) L/S system as well.


edit: remember from a power perspective executing operations is cheap, moving data is expensive.
 

NTMBK

Lifer
Nov 14, 2011
10,423
5,727
136
Yes, but there is also the ability for a+b=c, whereas in SSE (correct me if I'm wrong) it has to be a+b=a|b. Also GPUs have been doing gather since like forever, so with a single coherent memory space and a fast interconnect you could just do them on the GPU.

Ahh, of course, I had forgotten that AVX-128 has 3 operand operations. That should help a bit, although it has to be balanced with register usage.

As for doing gather on the GPU- yes, GPUs can do gather, but it doesn't really help the CPU. The point of gather is letting you perform vector ops on chunks of memory that aren't arranged nicely - feasibly the GPU could gather memory, write it back as a coherent lump and then let the CPU work on it, but the latencies involved would render it worthless.
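For anyone unsure what gather actually does, here is a scalar Python emulation. The `gather` helper and the sample data are hypothetical; a real gather instruction does the indexed loads in hardware and fills a vector register in one go.

```python
def gather(memory, indices):
    """Build one vector from scattered memory locations."""
    return [memory[i] for i in indices]


# Data that is not laid out contiguously in the order we need it.
memory = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
indices = [6, 1, 4, 3]

vec = gather(memory, indices)      # one 4-wide vector from 4 scattered loads
doubled = [2.0 * x for x in vec]   # the vector op that follows the gather
print(vec, doubled)
```

The vector op itself is cheap; the cost sits in the scattered loads, which is why bouncing that step to the GPU and back over a long-latency path would not pay off.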
 

beginner99

Diamond Member
Jun 2, 2009
5,315
1,760
136
i still think that 1 piledriver module + 4 jaguar cores is the best...

2 big fat cores, for game and rendering
4 jaguar for OS, audio and etc...

similar to the ARM approach

exactly. I agree. a jaguar "Module" is 4 cores. Maybe the 2 modules used have different clocks, eg 1 fast for games, one slower for background stuff. but I highly doubt that.

And I think it was pretty obvious I meant that RTS games are not suited for multi-threading.

Importance of single-thread performance has diminished on PCs, too

Only for niche segments like "real" gaming (e.g. not FarmVille and such) and encoding (the most common ones). I would say at max. 10% of computers (laptop and desktop) are used for that regularly. The rest are better off with high single-threaded IPC + fast IO (= SSD).

Why should a FPS multi-player be better suited for multi-threading than a RTS?
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
Nice catch on the p3dnow news :). Jaguar is a whopping 32% faster than Bobcat in the ST C11.5 subtest. When we went from K8 to K10 (128-bit FP pipes) we only saw a ~15% performance jump in Cinebench. PD has the same ST performance, but due to the module sharing penalty it has a 20% lower MT score, which ain't that bad considering it has only one FP unit which is shared between two cores (making its 1 FP pipeline 62% more efficient than Jaguar's 1 FP pipeline: PD's 2 128-bit SSE pipelines scoring 1.14pts vs JG's 4 128-bit SSE pipelines scoring 1.39).

I think it's incredible that AMD has a 3.1mm² core with the same ST Cinebench score as PD with its beefed-up frontend, if true. Both products are made by the same company, where BD/PD probably took at least 10-20 times the resources of Bobcat/Jaguar to develop. I know there is frequency also and more to it, but still!

But hey, our own daily work is probably about the same; a few tasks are valued far more than the rest of the work we do.
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
exactly. I agree. a jaguar "Module" is 4 cores. Maybe the 2 modules used have different clocks, eg 1 fast for games, one slower for background stuff. but I highly doubt that.

hehe...i was thinking about 1 module of piledriver, aka is 2 cores + 1fpu + shared stuffs :p

but your idea seems better, if possible :p....jaguar is not designed for fast clocks
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,055
3,862
136
As for doing gather on the GPU- yes, GPUs can do gather, but it doesn't really help the CPU. The point of gather is letting you perform vector ops on chunks of memory that aren't arranged nicely - feasibly the GPU could gather memory, write it back as a coherent lump and then let the CPU work on it, but the latencies involved would render it worthless.

I meant just do the whole operation on the GPU. Given that the GPU will still have its own LDS/L1 and L2, I don't really see the point of trying to get the GPU to accelerate the CPU. On the other hand, the CPU accelerating complex parts of GPU shader code could allow some really cool stuff we just won't see on high-end PCs.
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
For those people saying PS4 should have used piledriver:
Jaguar is synthesizable, piledriver is not. That means Jaguar can be easily adapted to custom or semi custom designs, while getting anything other than a standard piledriver chip would be a massive design effort. Jaguar is the core that AMD was always intending to sell in this type of situation, Piledriver is a performance focused core intended for a different audience.
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
For those people saying PS4 should have used piledriver:
Jaguar is synthesizable, piledriver is not. That means Jaguar can be easily adapted to custom or semi custom designs, while getting anything other than a standard piledriver chip would be a massive design effort. Jaguar is the core that AMD was always intending to sell in this type of situation, Piledriver is a performance focused core intended for a different audience.

yeah... i really forgot that :|

at the end of the day, PS4 is looking like a really cheap console (if we look at PS3)
 

NTMBK

Lifer
Nov 14, 2011
10,423
5,727
136
For those people saying PS4 should have used piledriver:
Jaguar is synthesizable, piledriver is not. That means Jaguar can be easily adapted to custom or semi custom designs, while getting anything other than a standard piledriver chip would be a massive design effort. Jaguar is the core that AMD was always intending to sell in this type of situation, Piledriver is a performance focused core intended for a different audience.

Not to mention, Jaguar should be easier to port to new process nodes (i.e. shrink the die), which is always a priority for consoles.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,055
3,862
136
yeah... i really forgot that :|

at the end of the day, PS4 is looking like a really cheap console (if we look at PS3)


Limits have changed, that is all. When the PS3 launched we hadn't hit the power wall. But at the same time memory was very expensive, thus we only got 512MB. We also got a CPU that was incredibly anaemic; you're talking 5-6 times the IPC with a Jaguar core, and because they are CISC not RISC instructions they are far more complex. It also doesn't have any of the insane register read/write/copy limitations etc.

I really don't get the gloom of some people. The PS3 drew around 200 watts at launch; looks like the PS4 will do about the same.
 

slayernine

Senior member
Jul 23, 2007
894
0
71
slayernine.com
I definitely look forward to increased memory and processors with more modern extensions in the new consoles.

My only concern right now is that PC games with horrible memory leaks will just become more prevalent with the rise of 8GB consoles.
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
Limits have changed, that is all. When the PS3 launched we hadn't hit the power wall. But at the same time memory was very expensive, thus we only got 512MB. We also got a CPU that was incredibly anaemic; you're talking 5-6 times the IPC with a Jaguar core, and because they are CISC not RISC instructions they are far more complex. It also doesn't have any of the insane register read/write/copy limitations etc.

I really don't get the gloom of some people. The PS3 drew around 200 watts at launch; looks like the PS4 will do about the same.

cheap to produce, i meant :p

the only expensive part here is the 8GB @ 5.5GHz GDDR5... but i suspect that it is not THAT much more costly than XDR + DDR3....
sony just have to deal with one company, instead of 2...and a lesser complexity
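For a sense of what that 5.5GHz (effective, i.e. 5.5 Gbit/s per pin) GDDR5 buys, here is the peak bandwidth arithmetic in Python. The 256-bit bus width is an assumption on my part, it is not stated in this thread, and the names are just for illustration.

```python
# Peak bandwidth = data rate per pin * bus width / 8 bits per byte.
data_rate_gbps_per_pin = 5.5   # effective GDDR5 rate quoted above
bus_width_bits = 256           # ASSUMPTION: bus width is not given in the thread

peak_bandwidth_gb_s = data_rate_gbps_per_pin * bus_width_bits / 8
print(f"{peak_bandwidth_gb_s:.0f} GB/s peak")
```

This is a theoretical peak only; sustained bandwidth in real workloads would be lower.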
 

2is

Diamond Member
Apr 8, 2012
4,281
131
106
if a PS3 that's 8 years old can play Crysis 3 and look that good, yes a 7970. In 8 years (2020) can a 7970 play a state of the art game? Probably just barely if at all, hence why i'd say a PS4 is as powerful as a 7950/7970. Think about it, a Geforce 7900GTX or a Radeon X1900 XTX were state of the art in 2006, but no way can they play Crysis 3 @ the same level of detail as a PS3.

We'll have to agree to disagree. Obviously an 8 year old PC can't play Crysis 3 because on PC you have to have a DX11 card. I think comparing an APU to a 7970 is more laughable than the quote in my sig, personally. Your comparisons are flawed: one reason I just mentioned, the other I mentioned previously, they aren't being rendered at the same resolutions. The other is viewing distance. Sit as close as you do when you're gaming on a PC and you'll see a whole lot more LACK of detail. Another reason is the PS4 is x86 based, meaning games ported to PC will be far more efficient than the ones ported to PC currently.

But hey, if you want to think optimization will make an APU just as powerful as a GPU that's an order of magnitude more powerful, go for it.
 

2is

Diamond Member
Apr 8, 2012
4,281
131
106
Poor choice of words. So only twice as fast? Point stands, it's nowhere near a 7970.
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
Poor choice of words. So only twice as fast? Point stands, it's nowhere near a 7970.

the funny part is that a 7850 is faster than many popular cards today
it's actually faster than 70% of all steam users :p
 

blackfallen

Junior Member
Apr 1, 2011
16
0
0
Alright I can see that it's faster than a 7970.. we all knew the power draw was too much but think about this. The GPU in the PS4 is slightly faster than the 7850 BUT the 7850 is faster than the HD 5870, HD 6970... I don't understand why people are complaining here, we thought we were getting nothing faster than a HD 7670.

Let alone the HD 5870, HD 6970 are leaps and bounds faster than the PS3, so be happy we are getting something faster!
 

2is

Diamond Member
Apr 8, 2012
4,281
131
106
Alright I can see that it's faster than a 7970.. we all knew the power draw was too much but think about this. The GPU in the PS4 is slightly faster than the 7850 BUT the 7850 is faster than the HD 5870, HD 6970... I don't understand why people are complaining here, we thought we were getting nothing faster than a HD 7670.

Let alone the HD 5870, HD 6970 are leaps and bounds faster than the PS3, so be happy we are getting something faster!

I'm not complaining at all. I'm actually impressed by the power they were able to pack into that APU. And while I agree that consoles are able to do more with less compared to PCs due to lower overhead and optimizations, I still think some people are OVER-estimating its potential.