[benchmarks] Need for Speed 2016

Don't start jumping to conclusions. They optimize memory management to compensate for poor memory usage by game engines at high resolution. How much, how far, and what the limitations are is unknown.
 
The reason Fiji takes more work is that so many games go nuts with memory usage: GDDR5 isn't fast enough, so they just pack in more data instead. Look at memory usage in Tomb Raider. It's nuts, and it's a game that does it better than others. HBM can take advantage of system memory to store all the junk that games try to load but don't actually need, and can swap it in and out as needed.

I mean, Crysis ran on cards with only 1GB of RAM or less, yet games these days take 4GB+...

Only at the high 2560x1600 resolution is the 512MB card experiencing a slight performance penalty

Developers stopped optimizing and instead we just throw better/bigger hardware at it.
 
It's happened way too many times to be a coincidence, both in DX11 and DX12 games.

Fiji does need optimized drivers to perform well, while older GCN SKUs seem to run well out of the box.
 

You don't know what's what, nor do I or anyone else for that matter. We do, however, know what they've said they've done, which, as bacon wrote, is to manage the WASTE better, something no one else does atm. I think after DX12 has had more time to grow it will not be an issue, and ppl will forget, if they even remembered.
 
Going from what AMD have said, then yes, HBM requires optimized memory usage and so Fiji requires game ready drivers to perform.

That is a concern, because AMD is often not game ready until after release. Hopefully they can sort out the kinks so Polaris won't suffer without game-specific optimized drivers.
 

Great, make stuff up. Now this is going to be the next troll by the group of you-know-whos.
 
It is probably a "feature" of HBM, and so Polaris or Pascal HBM2 SKU will likely need this optimization too.

I doubt that there's anything specific about HBM that requires it. More likely, what we're seeing with Fiji is a result of its highly unbalanced design.

From TechReport:

In other respects, including peak triangle throughput for rasterization and pixel fill rates, Fiji is simply no more capable in theory than Hawaii. As a result, Fiji offers a very different mix of resources than its predecessor. There's tons more shader and computing power on tap, and the Fury X can access memory via its texturing units and HBM interfaces at much higher rates than the R9 290X.


In situations where a game's performance is limited primarily by shader effects processing, texturing, or memory bandwidth, the Fury X should easily outpace the 290X. On the other hand, if gaming performance is gated by any sort of ROP throughput—including raw pixel-pushing power, blending rates for multisampled anti-aliasing, or effects based on depth and stencil like shadowing—the Fury X has little to offer beyond the R9 290X. The same is true for geometry throughput.

Fiji really should have more ROPs, but they ran out of room on the die and couldn't fit them. That isn't just speculation; Raja Koduri said so:

The reason why Fiji isn't any larger, he said, is that AMD was up against a size limitation: the interposer that sits beneath the GPU and the DRAM stacks is fabricated just like a chip, and as a result, the interposer can only be as large as the reticle used in the photolithography process. (Larger interposers might be possible with multiple exposures, but they'd likely not be cost-effective.) In an HBM solution, the GPU has to be small enough to allow space on the interposer for the HBM stacks. Koduri explained that Fiji is very close to its maximum possible size, within something like four square millimeters.

Hawaii is effective right off the bat without game-specific optimizations because it is one of AMD's most well-balanced chips.

Tahiti and Tonga also suffer from having too few ROPs per shader (in fact, the ratio is the same as in Fiji). I don't understand why they didn't fix this in Tonga. They could have cut out the useless extra 64 bits of memory controller that was never used on any shipping configuration, and used the space to enhance the ROP count from 32 to 48. (Or is there something about GCN that requires them to be added in powers of 2? Nvidia doesn't have that limitation, since GM200 has 96 ROPs.) That would have made a big difference.

Once we get 14nm Polaris GPUs, I don't think we'll see these oddball problems because they'll have plenty of room to fit a decent front end.
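
To put rough numbers on the ROPs-per-shader point, here is a minimal Python sketch. The shader and ROP counts are the commonly published specs for the full version of each chip, not figures taken from this thread, so treat them as assumptions. Comparing AMD stream processors with Nvidia CUDA cores is also not apples to apples, so the GM200 row is only a loose reference.

```python
# Shader and ROP counts for the full chips.
# Assumption: values come from commonly published spec sheets,
# not from the posts in this thread.
CHIPS = {
    "Tahiti": (2048, 32),
    "Tonga": (2048, 32),
    "Hawaii": (2816, 64),
    "Fiji": (4096, 64),
    "GM200": (3072, 96),
}


def shaders_per_rop(shaders: int, rops: int) -> float:
    """Return how many shaders share one ROP (higher = more ROP-starved)."""
    return shaders / rops


for name, (shaders, rops) in CHIPS.items():
    print(f"{name}: {shaders_per_rop(shaders, rops):.1f} shaders per ROP")

# Hypothetical Tonga with 48 ROPs, as suggested in the post above.
print(f"Tonga with 48 ROPs: {shaders_per_rop(2048, 48):.1f} shaders per ROP")
```

Under those assumed specs, Tahiti, Tonga and Fiji all land on the same ratio, Hawaii sits noticeably lower, and a 48-ROP Tonga would have ended up close to Hawaii.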
 
It's all speculation right now (as to why), but Fiji looks awful without game ready drivers from AMD. Call it what it is, but if you were a Fiji owner, you would not be impressed either. 😉

I hope this is a Fiji issue and not an HBM issue. We'll just have to wait and see!
 
Agreed. I see Fiji as a tech demo that was released simply because AMD had nothing else new to offer. Things would have been very different if 20nm hadn't shit the bed.
 
LOL ain't that the truth. Look at that poor 780ti flagship. Can't even follow in the shadows of the 970 with any dignity at all. Pathetic. The entire rest of the Kepler line up is so terrible they aren't even worth listing lol. Those are basically R9 290 vs 780ti results in those charts. If anyone would have known back then what we know now, no one would have touched Kepler. They would have rotted on the store shelves.

When you put it like that...

[Image: tGub8uH.jpg]


That's basically a non-reference 290 at stock clocks vs a factory OC 780Ti. 😵

How the mighty Kepler has fallen: in every single AAA title in 2016 so far, it's been taking a dirt nap.

Hawaii, made to compete against 780, Titan and 780Ti, now bests Maxwell 970/980 in modern games, and on the heels of the big GM200 in DX12.

Hawaii chip => Legendary status earnt!

If it keeps performing this well in new titles, is there any incentive for users to upgrade to Polaris? Those are more than playable even at 1440p.
 
Now Hawaii is competing with Fiji. Great. It's like it never stops.
Next thing you know, Hawaii will be optimized to beat Polaris and Pascal 🙁
Obviously I'm just kidding by the way...
 
Kid or not, Fiji is pretty crap without special drivers for it. I'm just glad I didn't waste $ on it hehe.

Come next gen, since I'm not at 4K, why do I even need to upgrade to Polaris? My R9 290X OCs to 1.2 GHz for faster-than-390X performance, and that's still enough for 1440p. -_-

Guess it's 4K time?
 
GCN 1.1 and 1.2 have a maximum of:

4 ROPs per RBE, 4 RBEs per Shader Engine and 4 Shader Engines.

4 x (4 x 4) = 64 ROPs.

For Fiji, AMD would have had to add more shader engines which would in turn require more L2 cache in order to maintain the 1:1 performance scaling ratio.

This wasn't possible on 28nm for the reasons Raja mentioned.

For Tonga, AMD could have moved to 3 RBEs per shader engine.

4 x (3 x 4) = 48 ROPs.

They probably didn't do this due to the die size and price point the GPUs were aimed at.

As for Rasterizers, 1 Rasterizer per Shader Engine so 4 Triangle Ops per clock. For Fiji that's 4.2 Gtris/s.

Something tells me that AMD will be moving to 6 shader engines at some point in the near future.
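
The arithmetic above can be sketched in a few lines of Python. The function names are made up for illustration, and the 1050 MHz core clock for Fiji is an assumption: it is the clock that the 4.2 Gtris/s figure implies, but it isn't stated in the post.

```python
def rop_count(rops_per_rbe: int, rbes_per_se: int, shader_engines: int) -> int:
    """Total ROPs = ROPs per RBE x RBEs per shader engine x shader engines."""
    return rops_per_rbe * rbes_per_se * shader_engines


def peak_tri_rate_gtris(shader_engines: int, clock_mhz: float) -> float:
    """Peak triangle rate in Gtris/s, assuming one rasterizer per shader
    engine and one triangle per rasterizer per clock."""
    return shader_engines * clock_mhz / 1000.0


# GCN 1.1/1.2 maximum layout described above: 4 ROPs x 4 RBEs x 4 SEs.
print(rop_count(4, 4, 4))

# The hypothetical Tonga layout with 3 RBEs per shader engine.
print(rop_count(4, 3, 4))

# Fiji: 4 shader engines at an assumed 1050 MHz core clock.
print(peak_tri_rate_gtris(4, 1050))
```

The same helper makes the 6-shader-engine guess easy to play with: at 4 RBEs per engine that layout would give 96 ROPs and 6 triangles per clock.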
 
AMD said they need to optimize for the 4GB not to be a limitation (2 engineers working on this). It has absolutely nothing to do with HBM. It could just as well have been GDDR5, DDR4 or whatever.
 
Sorry, but 4GB is not a reason for under-performance when the R9 290/290X performs so well and it's 4GB too. Plus these are at resolutions where 4GB of VRAM isn't a limitation; some of these games don't even saturate it.
 
It's already been mentioned before. Fiji is an unbalanced chip. It's got nothing to do with HBM. If you are ROP limited, then Fiji is really just a Hawaii.

The GTX 980 isn't beating it because it's faster overall, but because in this case it has faster-clocked ROPs.
 
I actually think Fiji with HBM is more of a proof of concept and some kind of pipe cleaner. The only one worth it is the Nano; the rest are nice to have, nothing more.
 
That accounts for the cases where Fiji starts slow and stays slow, but there are also cases where it starts slow and gets a driver that improves its speed.
 
My take on Fiji is this: Fiji is like a GCN phenomenon inside of a GCN phenomenon.
GCN is maturing. It took time for console optimizations to benefit cards that share the architecture.
Fiji has unique memory that needs specific treatment to give its full benefit. My guess is, when the time comes, Fiji with HBM will get a second wind. But for that, we need more HBM-based cards, so that developers have a userbase to target the optimizations for.
 
Tied in 3440x1440. Point?

You can buy a Radeon 390 for $275 USD today (and just about every other week, it seems). A GTX 980 costs $440 on sale.

A 390 is supposed to be a tier and a half below the 980, but in recent games appears to come close, match, or even beat a 980 in performance.
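
As a rough sketch of that value argument in Python: the prices are the ones quoted above, while the assumption that the two cards perform equally is only the "match" scenario from this post, not a measured result.

```python
def price_premium(price_a: float, price_b: float) -> float:
    """Fraction by which card A costs more than card B."""
    return price_a / price_b - 1.0


def perf_per_dollar_ratio(perf_a: float, price_a: float,
                          perf_b: float, price_b: float) -> float:
    """How much more performance per dollar card B delivers than card A."""
    return (perf_b / price_b) / (perf_a / price_a)


GTX_980_PRICE = 440.0  # sale price quoted in the post
R9_390_PRICE = 275.0   # sale price quoted in the post

# The 980's price premium over the 390.
print(f"{price_premium(GTX_980_PRICE, R9_390_PRICE):.0%}")

# Assumption: equal performance (1.0 vs 1.0), i.e. the "match" case.
print(f"{perf_per_dollar_ratio(1.0, GTX_980_PRICE, 1.0, R9_390_PRICE):.2f}x")
```

Under that equal-performance assumption, the 980 costs 60% more and the 390 delivers 1.6 times the performance per dollar.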
 
36% lead for the Fury X over the 290X at 1440p, but I'm fairly certain that's a reference 290X, and a 390X would be close to 10% faster than that. At least the Nano is a decent step up over the 290X, unlike at pcgameshardware, where it's 1 FPS faster than a 390.
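
For what it's worth, the two percentages above don't simply subtract. A quick Python sketch, taking the 36% lead and the roughly 10% figure for the 390X at face value (both are the poster's numbers, and the 10% is itself an estimate):

```python
def relative_lead(lead_over_base: float, rival_over_base: float) -> float:
    """Lead of card A over card C, given A's lead and C's lead over
    the same baseline card B (all as fractions, e.g. 0.36 for 36%)."""
    return (1.0 + lead_over_base) / (1.0 + rival_over_base) - 1.0


# Fury X is 36% ahead of a reference 290X; a 390X is assumed ~10% ahead of it.
fury_x_over_390x = relative_lead(0.36, 0.10)
print(f"{fury_x_over_390x:.1%}")
```

So the Fury X would be roughly 24% ahead of a 390X under those assumptions, not 26%.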
 