[benchmarks] Need for Speed 2016


thesmokingman

Platinum Member
May 6, 2010
2,307
231
106
Don't start jumping to conclusions. They optimize the memory to compensate for poor memory usage by game engines at high resolution. How much, how far, and what limitations there are to that are unknown.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
The reason Fiji takes more work is that so many games go nuts with memory usage: GDDR5 isn't fast enough, so they just grab more room instead. Look at memory usage in Tomb Raider; it's nuts, and that's a game that does it better than others. HBM can take advantage of system memory to store all the junk that games try to load but don't actually need, and can swap it in/out as needed.

I mean Crysis ran on cards with only 1GB of ram or less, yet games these days take 4GB+...

Only at the high 2560x1600 resolution is the 512MB card experiencing a slight performance penalty

Developers stopped optimizing and instead we just throw better/bigger hardware at it.
 
Feb 19, 2009
10,457
10
76
Don't start jumping to conclusions. They optimize the memory to compensate for poor memory usage by game engines at high resolution. How much, how far, and what limitations there are to that are unknown.

It's happened way too many times to be a coincidence, both in DX11 and DX12 games.

Fiji does need optimized drivers to perform well, while older GCN SKUs seem to run well out of the box.
 

thesmokingman

Platinum Member
May 6, 2010
2,307
231
106
It's happened way too many times to be a coincidence, both in DX11 and DX12 games.

Fiji does need optimized drivers to perform well, while older GCN SKUs seem to run well out of the box.


You don't know what's what, nor do I or anyone else for that matter. We do, however, know what they've said they've done, which, as Bacon wrote, is to manage the WASTE better, which is something no one does atm. I think after DX12 has had more time to grow it will not be an issue, and ppl will forget, if they even remembered.
 
Feb 19, 2009
10,457
10
76
You don't know what's what, nor do I or anyone else for that matter. We do, however, know what they've said they've done, which, as Bacon wrote, is to manage the WASTE better, which is something no one does atm. I think after DX12 has had more time to grow it will not be an issue, and ppl will forget, if they even remembered.

Going from what AMD have said, then yes, HBM requires optimized memory usage and so Fiji requires game ready drivers to perform.

That is a concern, because AMD is often not game ready until post release. Hopefully they can sort out the kinks so Polaris won't suffer without game specific optimized drivers.
 

thesmokingman

Platinum Member
May 6, 2010
2,307
231
106
Going from what AMD have said, then yes, HBM requires optimized memory usage and so Fiji requires game ready drivers to perform.

That is a concern, because AMD is often not game ready until post release. Hopefully they can sort out the kinks so Polaris won't suffer without game specific optimized drivers.


Great, make stuff up. Now this is going to be the next troll by the group of you-know-whos.
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
It is probably a "feature" of HBM, and so Polaris or Pascal HBM2 SKU will likely need this optimization too.

I doubt that there's anything specific about HBM that requires it. More likely, what we're seeing with Fiji is a result of its highly unbalanced design.

From TechReport:

In other respects, including peak triangle throughput for rasterization and pixel fill rates, Fiji is simply no more capable in theory than Hawaii. As a result, Fiji offers a very different mix of resources than its predecessor. There's tons more shader and computing power on tap, and the Fury X can access memory via its texturing units and HBM interfaces at much higher rates than the R9 290X.


In situations where a game's performance is limited primarily by shader effects processing, texturing, or memory bandwidth, the Fury X should easily outpace the 290X. On the other hand, if gaming performance is gated by any sort of ROP throughput—including raw pixel-pushing power, blending rates for multisampled anti-aliasing, or effects based on depth and stencil like shadowing—the Fury X has little to offer beyond the R9 290X. The same is true for geometry throughput.

Fiji really should have more ROPs. But they ran out of room on the die and couldn't fit it. That isn't just speculation, Raja Koduri said so:

The reason why Fiji isn't any larger, he said, is that AMD was up against a size limitation: the interposer that sits beneath the GPU and the DRAM stacks is fabricated just like a chip, and as a result, the interposer can only be as large as the reticle used in the photolithography process. (Larger interposers might be possible with multiple exposures, but they'd likely not be cost-effective.) In an HBM solution, the GPU has to be small enough to allow space on the interposer for the HBM stacks. Koduri explained that Fiji is very close to its maximum possible size, within something like four square millimeters.

Hawaii is effective right off the bat without game-specific optimizations because it is one of AMD's most well-balanced chips.

Tahiti and Tonga also suffer from having too few ROPs per shader (in fact, the ratio is the same as in Fiji). I don't understand why they didn't fix this in Tonga. They could have cut out the useless extra 64 bits of memory controller that was never used on any shipping configuration, and used the space to enhance the ROP count from 32 to 48. (Or is there something about GCN that requires them to be added in powers of 2? Nvidia doesn't have that limitation, since GM200 has 96 ROPs.) That would have made a big difference.

Once we get 14nm Polaris GPUs, I don't think we'll see these oddball problems because they'll have plenty of room to fit a decent front end.
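
To put numbers on the ROPs-per-shader point, here is a minimal sketch that computes shaders per ROP for the four GCN chips named above. The 32-ROP figure for Tonga comes from this post; the shader counts and the other ROP counts are not stated in the thread and are assumed here from the public full-chip specifications.

```python
# Shaders per ROP for the GCN chips discussed above.
# ASSUMPTION: shader counts and the Tahiti/Hawaii/Fiji ROP counts are taken
# from public full-chip spec sheets, not from this thread.
CHIPS = {
    "Tahiti": (2048, 32),  # (stream processors, ROPs)
    "Tonga": (2048, 32),
    "Hawaii": (2816, 64),
    "Fiji": (4096, 64),
}


def shaders_per_rop(shaders: int, rops: int) -> float:
    """Return how many shaders share each ROP (higher = less ROP per shader)."""
    return shaders / rops


for name, (shaders, rops) in CHIPS.items():
    print(f"{name}: {shaders_per_rop(shaders, rops):.0f} shaders per ROP")
```

Under those assumed specs, Tahiti, Tonga and Fiji all land on the same ratio, while Hawaii has noticeably more ROP throughput per shader, which is consistent with the "well-balanced" description.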
 
Feb 19, 2009
10,457
10
76
Once we get 14nm Polaris GPUs, I don't think we'll see these oddball problems because they'll have plenty of room to fit a decent front end.

It's all speculation right now (as to why), but Fiji looks awful without game ready drivers from AMD. Call it what it is, but if you were a Fiji owner, you would not be impressed either. ;)

I hope this is a Fiji issue and not a HBM issue, we'll just have to wait and see!
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
It's all speculation right now (as to why), but Fiji looks awful without game ready drivers from AMD. Call it what it is, but if you were a Fiji owner, you would not be impressed either. ;)

Agreed. I see Fiji as a tech demo that was released simply because AMD had nothing else new to offer. Things would have been very different if 20nm hadn't shit the bed.
 
Feb 19, 2009
10,457
10
76
LOL ain't that the truth. Look at that poor 780ti flagship. Can't even follow in the shadows of the 970 with any dignity at all. Pathetic. The entire rest of the Kepler line up is so terrible they aren't even worth listing lol. Those are basically R9 290 vs 780ti results in those charts. If anyone would have known back then what we know now, no one would have touched Kepler. They would have rotted on the store shelves.

When you put it like that...

[Image: tGub8uH.jpg]


That's basically a non-reference 290 at stock clocks vs a factory OC 780Ti. o_O

How the mighty Kepler has fallen: in every single AAA title in 2016 so far, it's been taking a dirt nap.

Hawaii, made to compete against 780, Titan and 780Ti, now bests Maxwell 970/980 in modern games, and on the heels of the big GM200 in DX12.

Hawaii chip => Legendary status earnt!

If it keeps performing this well in new titles, is there any incentive for users to upgrade to Polaris? Those are more than playable even at 1440p.
 

tential

Diamond Member
May 13, 2008
7,355
642
121
Now Hawaii is competing with Fiji. Great. It's like it never stops.
Next thing you know, Hawaii will be optimized to beat Polaris and Pascal :(
Obviously I'm just kidding by the way...
 
Feb 19, 2009
10,457
10
76
Now Hawaii is competing with Fiji. Great. It's like it never stops.
Next thing you know, Hawaii will be optimized to beat Polaris and Pascal :(
Obviously I'm just kidding by the way...

Kid or not, Fiji is pretty crap without special drivers for it. I'm just glad I didn't waste $ on it hehe.

Come next gen, since I'm not at 4K, why do I even need to upgrade to Polaris? My R9 290X OCs to 1.2GHz to get faster than 390X performance, and that's still enough for 1440p. -_-

Guess it's 4K time?
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
I doubt that there's anything specific about HBM that requires it. More likely, what we're seeing with Fiji is a result of its highly unbalanced design.

From TechReport:



Fiji really should have more ROPs. But they ran out of room on the die and couldn't fit it. That isn't just speculation, Raja Koduri said so:



Hawaii is effective right off the bat without game-specific optimizations because it is one of AMD's most well-balanced chips.

Tahiti and Tonga also suffer from having too few ROPs per shader (in fact, the ratio is the same as in Fiji). I don't understand why they didn't fix this in Tonga. They could have cut out the useless extra 64 bits of memory controller that was never used on any shipping configuration, and used the space to enhance the ROP count from 32 to 48. (Or is there something about GCN that requires them to be added in powers of 2? Nvidia doesn't have that limitation, since GM200 has 96 ROPs.) That would have made a big difference.

Once we get 14nm Polaris GPUs, I don't think we'll see these oddball problems because they'll have plenty of room to fit a decent front end.

GCN 1.1 and 1.2 have a maximum of:

4 ROPs per RBE, 4 RBEs per Shader Engine and 4 Shader Engines.

4 x (4 x 4) = 64 ROPs.

For Fiji, AMD would have had to add more shader engines which would in turn require more L2 cache in order to maintain the 1:1 performance scaling ratio.

This wasn't possible on 28nm for the reasons Raja mentioned.

For Tonga, AMD could have moved to 3 RBEs per shader engine.

4 x (3 x 4) = 48 ROPs.

They probably didn't do this due to the die size and price point the GPUs were aimed at.

As for rasterizers, there is 1 rasterizer per shader engine, so 4 triangles per clock. For Fiji that's 4.2 Gtris/s.

Something tells me that AMD will be moving to 6 shader engines at some point in the near future.
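
The arithmetic in this post can be sketched in a few lines. This is only an illustration of the counting described above, not anything from AMD: the function names are made up, and the 1050 MHz core clock used for Fiji is an assumption (the Fury X reference clock), since the post gives only the 4.2 Gtris/s result.

```python
def rop_count(shader_engines: int, rbes_per_engine: int, rops_per_rbe: int = 4) -> int:
    """Total ROPs = shader engines x RBEs per engine x ROPs per RBE."""
    return shader_engines * rbes_per_engine * rops_per_rbe


def peak_triangle_rate_gtris(shader_engines: int, core_clock_mhz: float) -> float:
    """Peak triangle rate, assuming one rasterizer per shader engine
    and one triangle per rasterizer per clock."""
    return shader_engines * core_clock_mhz / 1000.0


print(rop_count(4, 4))                    # GCN 1.1/1.2 maximum
print(rop_count(4, 3))                    # hypothetical Tonga with 3 RBEs per engine
print(peak_triangle_rate_gtris(4, 1050))  # ASSUMPTION: Fiji at 1050 MHz
```

With 4 shader engines this gives 64 and 48 ROPs for the two configurations, and 4.2 Gtris/s at the assumed clock, matching the figures in the post.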
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Going from what AMD have said, then yes, HBM requires optimized memory usage and so Fiji requires game ready drivers to perform.

That is a concern, because AMD is often not game ready until post release. Hopefully they can sort out the kinks so Polaris won't suffer without game specific optimized drivers.

AMD said they need to optimize for the 4GB not to be a limitation (2 engineers working on this). It has absolutely nothing to do with HBM. It could just as well have been GDDR5, DDR4 or whatever.
 
Feb 19, 2009
10,457
10
76
AMD said they need to optimize for the 4GB not to be a limitation (2 engineers working on this). It has absolutely nothing to do with HBM. It could just as well have been GDDR5, DDR4 or whatever.

Sorry but 4GB is not a reason for under-performance when the R290/X performs so well and it's 4GB. Plus these are at resolutions where 4GB vram isn't a limitation, some of these games don't even saturate it.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Sorry but 4GB is not a reason for under-performance when the R290/X performs so well and it's 4GB. Plus these are at resolutions where 4GB vram isn't a limitation, some of these games don't even saturate it.

It's already been mentioned. Fiji is an unbalanced chip. It has nothing to do with HBM. If you are ROP limited, then Fiji is really just a Hawaii.

The GTX 980 isn't beating it because it's faster overall, but because in this case it has faster-clocked ROPs.
 

caswow

Senior member
Sep 18, 2013
525
136
116
I actually think Fiji with HBM is more of a proof of concept and some kind of pipe cleaner. The only one worth it is the Nano; the rest are nice to have, nothing more.
 

xthetenth

Golden Member
Oct 14, 2014
1,800
529
106
It's already been mentioned. Fiji is an unbalanced chip. It has nothing to do with HBM. If you are ROP limited, then Fiji is really just a Hawaii.

The GTX 980 isn't beating it because it's faster overall, but because in this case it has faster-clocked ROPs.

That accounts for the cases where Fiji starts slow and stays slow, but there are also cases where it starts slow and gets a driver that improves its speed.
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
My take on Fiji is this: Fiji is like a GCN phenomenon inside of a GCN phenomenon.
GCN is maturing. It took time for console optimizations to benefit cards that share the architecture.
Fiji has unique memory that needs specific treatment to give its full benefit. My guess is that, when the time comes, Fiji with HBM will get a second wind. But for that, we need more HBM-based cards, so that developers have a userbase to target the optimizations for.
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,330
4,917
136
Tied in 3440x1440. Point?

You can buy a Radeon 390 for $275 USD today (and just about every other week, it seems). A GTX 980 costs $440 on sale.

A 390 is supposed to be a tier and a half below the 980, but in recent games appears to come close, match, or even beat a 980 in performance.
 

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
36% lead for the Fury X over the 290X at 1440p, but I'm fairly certain that's a reference 290X and a 390X would be close to 10% faster than that. At least the Nano is a decent step over the 290X unlike pcgameshardware where it's 1FPS faster than a 390.
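
For what that adjustment works out to, here is a quick sketch of the arithmetic. The 36% and 10% figures are the ones quoted in this post; the function name is just for illustration, and it assumes both percentages are measured against the same reference 290X baseline.

```python
def remaining_lead(lead_over_baseline: float, rival_gain_over_baseline: float) -> float:
    """Lead of card A over card B, when both are expressed relative to
    the same baseline card (here a reference 290X)."""
    return (1 + lead_over_baseline) / (1 + rival_gain_over_baseline) - 1


# Fury X is 36% ahead of the 290X; a 390X is assumed ~10% ahead of the 290X.
lead = remaining_lead(0.36, 0.10)
print(f"Fury X lead over a 390X: {lead:.1%}")
```

So if the 390X really is about 10% faster than the reference 290X, the Fury X lead shrinks from 36% to roughly 24%.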