Discussion in 'Video Cards and Graphics' started by Bacon1, Apr 15, 2016.
No major Async compute (r)evolution till Volta I bet D:
Pretty much what I said in a recent video. Async is good, but it's hype right now and requires optimization on a card-per-card basis, which few devs will be arsed with.
Going forward though we can easily imagine a case where it won't need that level of tweaking, or any at all.
Crap, I guess I should take my 980ti's out of the trash bin.
Too late, already using them and about to add them to my signature.
It's like I've said though, a 980ti owner should not care. They'll upgrade before this matters. Midrange users who are bigger gamers don't upgrade. They hold cards for 3-5 years.
I consider people who buy mid range cards regularly to still be enthusiasts by the way.
And yet, we have this.
Async Compute in particular has received a lot of attention from PC enthusiasts, specifically in regard to NVIDIA GPUs lacking hardware support for it. However, in the GDC 2016 talk you said that even AMD cards only got a 5-10% boost and, furthermore, you described Async Compute as super hard to tune because too much work can make it a penalty. Is it fair to say that the importance of Async Compute has perhaps been overstated in comparison to other factors that determine performance? Do you think NVIDIA may be in trouble if Pascal doesn't implement a hardware solution for Async Compute?
The main reason it's hard is that every GPU ideally needs custom tweaking: the bandwidth-to-compute ratio is different for each GPU, ideally requiring tuning the amount of async work for each one. I don't think it's overstated, but obviously YMMV (your mileage may vary). In its current state, Async Compute is a nice & easy performance win. In the long run it will be interesting to see if GPUs get better at running parallel work, since we could potentially get even better wins.
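The tuning problem the dev describes can be sketched with a toy model (all numbers are illustrative, not measured hardware figures): the async win depends on how much idle ALU time the graphics pass leaves, and overfilling those gaps turns the win into a penalty.

```python
# Toy model of per-GPU async tuning. A graphics pass that is
# bandwidth-bound leaves some fraction of the ALUs idle; async compute
# work that fits into those idle slots is "free", and anything beyond
# that contends for the hardware and is effectively serialized.
# The spare_alu_fraction stands in for each GPU's bandwidth-to-compute
# ratio, which is why the sweet spot differs per card.

def frame_time(gfx_ms, async_ms, spare_alu_fraction):
    """Estimated frame time when async_ms of compute work is overlapped
    with a gfx_ms graphics pass."""
    free_budget = gfx_ms * spare_alu_fraction   # hideable compute work
    hidden = min(async_ms, free_budget)
    leftover = async_ms - hidden                # spills -> serialized
    return gfx_ms + leftover

baseline = frame_time(16.0, 0.0, 0.25)   # no async: 16 ms
tuned    = frame_time(16.0, 3.0, 0.25)   # fits in the idle slots: still 16 ms
overdone = frame_time(16.0, 8.0, 0.25)   # 4 ms spills over: 20 ms
```

On a GPU with a different spare fraction the same `async_ms` flips from a win to a penalty, which is the "custom tweaking per GPU" point.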
Yes but 1% is a performance win as well. Hitman shows around 3% on average I believe, not 5-10%.
AMD has a lot bigger advantages than async but they've marketed it well, which makes a change for them.
^ Yes, but for console ports it's most likely going to be a case of developers squeezing the last ounce of performance from the underpowered XB1/PS4. If games are originally designed for those consoles to take advantage of Async Compute, that directly translates into a win for PC hardware that supports it. OTOH, Steam data shows that most GPUs (NV) don't support this functionality. So if an XB1/PS4 game doesn't have Async Compute, the developer has to spend extra resources incorporating it into the PC version. How likely are they to do this when AMD's market share is 20-21% right now? It's up to AMD to work with developers then, or the rest seems to be dictated by the nature of GCN-optimized XB1/PS4 ports.
If Pascal barely improves Async Compute, add another 2+ years of software delay, because it would mean that, hardware-wise, Async Compute still won't be mainstream. It's a shame really, since AC is a performance-boosting feature on more advanced GPU architectures. Who doesn't want another 10-30% boost in performance from hardware that allows parallel processing? I guess the answer to that is obvious...
The key here is standardization. AMD messed around a lot with the first implementation of async but now it looks like they've got it settled. No change in Polaris either.
Async will really start to count when async on the console = async on the PC exactly - with either utterly trivial optimization or none at all, likely involving the new patent that's being discussed as well.
All AMD has to do is provide the standardization. They've got plans going way beyond async though.
Hitman is up to 10%. For Hawaii and Fury it's ~10% while for Tahiti it's like 3% or so.
But AFAIK, they only used it for SSAO, shadows for "free" where there's enough ACEs.
In Ashes, most get 10-20%.
And it's because they put more compute in the Async queue, for all their unit lighting.
Want to see something mind blowing?
Heavy usage of Async Compute in QB, but not touching a single SP/ALU, all on the DMA engines in GCN.
Don't mistake DX12 for async. Only Ashes shows a big difference that is clear and can be toggled.
It's real as soon as there are a bunch of benchmarks showing similar results with async on and off. Until then, Ashes is an AMD poster child.
The consistent wins that AMD is getting without async is by far the more important DX12 story.
Sure, but many people misunderstood the purpose of Async Compute. They often attribute it to better shader utilization and that's just one part of it.
The example of AC in QB is very striking: heavy use of Copy Queues to get their 4-frame temporal reconstruction rendering to work. In DX12, Copy Queues are a subset of Compute Queues, which are a subset of Graphics Queues.
Copy Queues don't even touch shaders, yet GCN is able to accelerate performance far above NV in QB, just because they support DX12's Multi-Engine Rendering and "Async Compute".
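The queue hierarchy mentioned above can be sketched as capability sets: in D3D12's multi-engine model, a Compute queue accepts everything a Copy queue does, and a Direct (graphics) queue accepts everything a Compute queue does. The command names below are informal labels for illustration, not real API identifiers.

```python
# D3D12 multi-engine capability hierarchy, modeled as the sets of command
# kinds each queue type accepts. Informal names, not API identifiers.
COPY_QUEUE    = {"copy"}
COMPUTE_QUEUE = COPY_QUEUE | {"dispatch"}       # compute ⊇ copy
DIRECT_QUEUE  = COMPUTE_QUEUE | {"draw"}        # graphics ⊇ compute ⊇ copy

def can_run(queue_caps, command):
    """A queue can record a command only if it's in its capability set."""
    return command in queue_caps

# The QB trick per the post above: the temporal-reconstruction copies go
# on Copy queues (GCN's DMA engines), so they never occupy a shader ALU.
assert can_run(COPY_QUEUE, "copy") and not can_run(COPY_QUEUE, "dispatch")
```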
If it's not Async Compute, why would GCN cards perform much better under DX12? Lower CPU overhead which allows the GPU to become better utilized?
Could it have more to do with latest games becoming so advanced for XB1/PS4 that developers are forced to squeeze/optimize for every last ounce of GCN?
I am inclined to believe this explanation more. How else can we explain Hitman performing so much faster on 290X/Fury X under DX11?
It's mind-blowing how much faster R9 290X is over 7970/280X or over 780Ti/970. Console effect imo.
The easy explanation is that AMD sponsored Hitman, and so IO/Square gimps NV GPU performance. The same for Ashes of the Singularity.
But we're seeing even more gimpage in neutral non-sponsored titles so I don't agree that is the cause.
It's a combination of console effect and NV's gimped DX12 hardware/drivers all adding up to a storm with Polaris v Pascal.
A few weeks ago I had major doubts a 2,560 SP Polaris 10 could rival a 2,560 SP GP104 that's a much bigger die... but really, as review sites ditch older games and add new ones, NV GPUs need to brute force a lot to catch up.
That 980 Ti pownz this whole freakin list...
Don't be so sure. P100 boosts to 1480MHz. Based on the more conservative GPU clocks of NV's larger-die products over the years vs. their mid-range and lower-end offerings, I am inclined to believe that the 1080 GP104 (980 replacement) will be clocked higher than 1480MHz. Add in after-market versions and I wouldn't be shocked if GP104 can overclock to 1700MHz. For leaked Polaris clocks, we are seeing 850-1050MHz. That means I expect a 40-60% clock speed disadvantage for Polaris, but who knows. Remember, before the 670/680 launched, early leaks were showing them with 700MHz or so GPU clocks. Either way, since the HD 5870, AMD went from 850MHz to 1050MHz with Fury X. And now look how high NV's cards clock today. Even on 28nm, Maxwell overclocks to 1500-1550MHz.
Base clock is meaningless when they can power gate down and turbo boost individual units within a SIMD with Polaris.
So basically expect Hawaii to get most of the benefit.
The first game I noticed it in DX11 was SW:BF. That DICE would be using GCN cards makes complete sense of course, so yes I think we're simply looking at the fact that more devs are just using GCN to start with. If you think about it, AMD probably never had this in their history, yet still stayed pretty close to Nvidia in most cases...
I first noticed it in Shadow of Mordor.
On Anandtech's bench a reference R9 290X was keeping up with, or running slightly faster than, a 980. That was very unexpected, given older titles had a 15% gap.
Basically without GameWorks, NV can't compete when modern games come GCN optimized.
Examine The Division, NV sponsored, a lot of GameWorks tech, but as soon as you disable those GW features (PCSS & HBAO+)... GCN just powers ahead.
This is repeated for Far Cry 4, Dying Light, Rainbow Six, JC3 and other NV sponsored titles. Disable GW, bam, GCN goes ahead at each segment.
What about games where NV don't sponsor? Best example is Far Cry Primal, where a 390 is 30% faster than the 970.
NV actually need to sponsor and get involved with all the games, else GCN just runs too good. And this is in DX11 where GCN is running crippled.
Shadow of Mordor could have been an edge case due to it being so heavy on memory and bandwidth, but yes that one ran surprisingly well on AMD too.
just need something to drive my 3440x1440 monitor at full res... 980ti isn't enough currently... maybe the radeon pro duo...
I've often said in the past that AMD won't be able to truly compete in the 3D workstation market until they can pry the Quadros out of Autodesk's workstations. When the app is developed on a particular IHV's products they are going to have an inherent advantage. I never thought about the game devs, but it makes sense there too.
Yes, and here is why there is such a lag in optimizations for GCN even though we are well into the console lifecycle:
Devs had little faith in the PS4 and Xbone. Nobody expected the amazing sales the consoles have had. Games were not truly developed for the next gen until it became obvious how big the new generation had become.
Now that they actually try, we see the console effect at play in the shifting PC GPU landscape.