[bitsandchips]: Pascal to not have improved Async Compute over Maxwell

Page 7

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
They've stated that they send builds with source code access to nVidia (and AMD). Why shouldn't having it be relevant?

Between "sending source code" and "changing source code" there is a huge difference. Oxide hasn't done anything to improve performance on nVidia cards. The engine is an unoptimized mess and uses compute just for the fun of it.

Here is a presentation of Async Compute in this game: http://www.hardware.fr/news/14555/gdc-async-compute-aots-details.html

"Shadows" are simple and always one frame behind. Latency is increased by 33%. They tried so hard to make AMD's cards shine that they forgot to optimize their engine for all players.
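
One plausible way to read that 33% figure (my own back-of-the-envelope model, not from the hardware.fr article): if the baseline pipeline is about three frames deep and the shadow pass runs one frame behind, the shadow data you see is one extra frame old. All numbers below are assumptions for illustration.

```python
# Toy latency model for shadows that run one frame behind.
# Pipeline depth and frame time are assumed, not measured.
frame_ms = 16.7                      # ~60 fps frame time
baseline_frames = 3                  # assumed depth: simulate, render, display
shadow_frames = baseline_frames + 1  # shadow result consumed one frame later

baseline_ms = baseline_frames * frame_ms
shadow_ms = shadow_frames * frame_ms
increase = shadow_ms / baseline_ms - 1
print(f"{baseline_ms:.1f} ms -> {shadow_ms:.1f} ms (+{increase:.0%})")
# prints: 50.1 ms -> 66.8 ms (+33%)
```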
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Between "sending source code" and "changing source code" there is a huge difference. Oxide hasn't done anything to improve performance on nVidia cards. The engine is an unoptimized mess and uses compute just for the fun of it.

Here is a presentation of Async Compute in this game: http://www.hardware.fr/news/14555/gdc-async-compute-aots-details.html

"Shadows" are simple and always one frame behind. Latency is increased by 33%. They tried so hard to make AMD's cards shine that they forgot to optimize their engine for all players.

So you've forgotten when they said this, "Personally, I think one could just as easily make the claim that we were biased toward Nvidia as the only 'vendor' specific code is for Nvidia"?
 

Spanners

Senior member
Mar 16, 2014
325
1
0
Between "sending source code" and "changing source code" there is a huge difference. Oxide hasn't done anything to improve performance on nVidia cards. The engine is an unoptimized mess and uses compute just for the fun of it.

"For example, when Nvidia noticed that a specific shader was taking a particularly long time on their hardware, they offered an optimized shader that made things faster which we integrated into our code." Link.

Are you saying Oxide are openly lying or are you just incorrect? I can't see a third option.
 

Magee_MC

Senior member
Jan 18, 2010
217
13
81
Between "sending source code" and "changing source code" there is a huge difference. Oxide hasn't done anything to improve performance on nVidia cards. The engine is an unoptimized mess and uses compute just for the fun of it.

Here is a presentation of Async Compute in this game: http://www.hardware.fr/news/14555/gdc-async-compute-aots-details.html

"Shadows" are simple and always one frame behind. Latency is increased by 33%. They tried so hard to make AMD's cards shine that they forgot to optimize their engine for all players.

Oxide sends both AMD and NV the source code and they are allowed to submit changes. As long as the change doesn't harm performance on the other vendor's cards they will include it.

Additionally, NV cards are currently running NV optimized shaders in the game. That sounds to me like they have done something to improve performance on NV cards.

You are flat out wrong on the facts.
 

Magee_MC

Senior member
Jan 18, 2010
217
13
81
"For example, when Nvidia noticed that a specific shader was taking a particularly long time on their hardware, they offered an optimized shader that made things faster which we integrated into our code." Link.

Are you saying Oxide are openly lying or are you just incorrect? I can't see a third option.

I can see a third option. He is saying that Oxide are openly lying AND he is just incorrect.
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
He's just stating a fact, not making a marketing presentation.

His attitude was vehemently unwelcoming ...

And it tanks performance, massively.

That's not looking good for a "feature" when performance drops off a cliff.

It's like GameWorks VXAO, Maxwell only, enjoy your major performance loss for almost nil visual gains.

Async Compute is "Multi-Threading for GPUs" as the media often say, it's a performance enhancement. Devs can use compute to make cool features and have it run for "free".
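
The "free" part is just overlap arithmetic; here is a toy sketch (the per-pass costs below are made-up numbers, not measurements from any GPU):

```python
# Toy model: an independent compute pass either runs after the graphics
# pass (serial) or hides inside its idle time (async). Costs are assumed.
def frame_time(graphics_ms, compute_ms, use_async):
    if use_async:
        return max(graphics_ms, compute_ms)  # compute hides under graphics
    return graphics_ms + compute_ms          # compute extends the frame

g, c = 14.0, 2.0   # assumed per-frame costs in milliseconds
serial = frame_time(g, c, use_async=False)      # 16.0 ms
overlapped = frame_time(g, c, use_async=True)   # 14.0 ms
print(f"serial {serial} ms, async {overlapped} ms")
```

The model also shows the limit of the trick: once the compute work no longer fits in the graphics pass's idle time, the gain shrinks or reverses.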

If Pascal is gimped on this DX12/Vulkan feature, it means there could be a weird situation. DX12 PC ports from consoles may come with Async Compute by default. If NV doesn't like that, they need to sponsor the port and rip it out, or even push it back to DX11, though studios may object to that later this year as more AAA DX12 titles make it a cool thing to have.

I know you don't like GameWorks, but even I don't deny that it brings some of the highest-quality effects in real time, like HFTS for the highest-quality shadows (for years developers were cursed to live with shadow maps, which brought nothing but aliasing, acne, and the dreaded light bleeding) and VXAO, which delivers view-independent occlusion ...
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
Was there something prior to this that I missed?



Just seems like he's stating that Intel has superior support.

I don't deny that Intel has superior support, but his subsequent posts came off as overly confrontational, and he refused to accept a fact on top of it ...
 

nsavop

Member
Aug 14, 2011
91
0
66
I'm kind of confused. Since when did DX12 performance hinge on async compute?

Does Maxwell do poorly in DX12 games? I haven't seen anything like that, unless I missed something? Feel free to direct me to these benchmarks.

As long as both IHVs can produce high, playable fps in future DX12 games, regardless of how they get it done (there is simply no one "right" way to do things in any aspect of technology), it's fine by me. This stuff is just pure nonsense until we see the benches and the products themselves. If AMD has the upper hand when I need to upgrade, then it's AMD for my next upgrade, and vice versa.

Until then, the talk of async compute, DX12 performance and whatnot doesn't have any real substance. Both seem to perform relatively well, but we know there aren't native DX12 games out there yet (or are there? I haven't really followed games in a while) to fully test the API's capabilities. By that time, these new products will be out with higher performance.

It's the same AMD hype train. Async is a nice bonus for AMD no doubt but not some killer feature that by itself will drastically change how future games run.

Asynchronous compute granted a gain of 5-10% in performance on AMD cards, and unfortunately no gain on Nvidia cards, but the studio is working with the manufacturer to fix that. They’ll keep on trying.

The downside of using asynchronous compute is that it’s “super-hard to tune,” and putting too much workload on it can cause a loss in performance.
http://www.dualshockers.com/2016/03/15/directx-12-compared-against-directx-11-in-hitman-advanced-visual-effects-showcased/
 

Raising

Member
Mar 12, 2016
120
0
16
Weren't increased draw calls the most significant thing about DX12? Why all the fuss about async when it's just one of many DX12 features?
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
Weren't increased draw calls the most significant thing about DX12? Why all the fuss about async when it's just one of many DX12 features?

Because, according to some questionable posters who like to post things they think they know, AC = DX12, and if one doesn't support it in hardware it's the end of that particular company? Sorry for my ignorance, but this is the vibe I get from VC&G these days.

Yet when I look at benchmarks and at financials/marketshare, I keep seeing the opposite happening. From my small knowledge of GPUs, it's a little more complicated than that, and async compute is one of many features in the DX12 API, just like increased draw calls, decreased overhead, etc.

Regardless of how each IHV does it (or doesn't) in their GPUs, as long as the performance is there it's literally meaningless.
 

Bryf50

Golden Member
Nov 11, 2006
1,429
51
91
It's one of the few features of DX12 that is a clear performance advantage. Is it the end of the world not having it? No. But it does mean that Nvidia has to work harder for that 5%-10% "free" performance from Async Compute. Just imagine if the Fury X was 5-10% faster overall. It would be game changing to the perception of the card at release.
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
Weren't increased draw calls the most significant thing about DX12? Why all the fuss about async when it's just one of many DX12 features?

Both are very important. But everyone has accepted that AMD benefits a heck of a lot more from increased draw calls than nVidia does. It is because of differences between their DX11 drivers. It is closely tied to CPU performance rather than GPU, so it is more of a CPU-section thing.

Async compute can give lots of performance (10% and more). By video cards and graphics forum standards that means blowing out of the water/kicking ass/making something obsolete/beating/spanking/trashing/ and all sorts of other funny phrases. It is a big thing, apparently, and if nVidia is going crippled yet another generation, it is big news.

What nv can do is to offer more hardware for the same price to keep up with GCN, which will benefit their customers.

Or they can go nvidia about it and block AC in new games by GW partnership, which will hurt all gamers (but amd side more :cool:)

Nv not supporting AC is a big thing.
 

coercitiv

Diamond Member
Jan 24, 2014
7,226
16,986
136
Because, according to some questionable posters who like to post things they think they know, AC = DX12, and if one doesn't support it in hardware it's the end of that particular company? Sorry for my ignorance, but this is the vibe I get from VC&G these days.

Yet when I look at benchmarks and at financials/marketshare, I keep seeing the opposite happening. From my small knowledge of GPUs, it's a little more complicated than that, and async compute is one of many features in the DX12 API, just like increased draw calls, decreased overhead, etc.
Look at it the other way around: how come this feature with relatively low impact on performance managed to get such an allergic reaction from certain fans on the forum? People filled entire forum pages trying to prove AC is supported and will bring performance benefits to Maxwell as well. Only when it became painfully obvious that wasn't going to happen did the new rhetoric start - AC is a minor performance benefit that can be overcome with raw power.

All this talk is not about who will take the performance crown in 2016-2017, it's about admitting AMD has even a slight tiny narrow nano advantage over the competition. It's about each team trying to put their flag on the top of the mountain before the DX12 craze begins, even if months later, when the camera pans out, it will turn out to be a pathetic molehill.
 

thesmokingman

Platinum Member
May 6, 2010
2,302
231
106
Was there something prior to this that I missed?



Just seems like he's stating that Intel has superior support.


I did exactly that: stated a simple point, to which he replied with a diatribe about how irrelevant Intel is. It's like he's baiting me into "hey, look over here"... lol. If you have to state how unbiased you are, chances are you're trying to prove something. At the end of the day, NV's CR support gives the same type of results that AC hardware support is claimed to show. I'm not sure that's much to brag about.



Intel is irrelevant with the pathetically low primitive rate their iGPUs push ...

If people think tessellation is bad on AMD wait till they see how much more poorly it scales on Intel GPUs ...

Fixed-function units aren't exactly Intel's strength ...
 

TheRyuu

Diamond Member
Dec 3, 2005
5,479
14
81
It's one of the few features of DX12 that is a clear performance advantage. Is it the end of the world not having it? No. But it does mean that Nvidia has to work harder for that 5%-10% "free" performance from Async Compute. Just imagine if the Fury X was 5-10% faster overall. It would be game changing to the perception of the card at release.

It may be 5-10% for the Fury X, but I'm not sure you'd see the same gains on Nvidia cards even if they had hardware support for async compute. We have to keep in mind that, because of how the cards are designed, you see those async compute gains on AMD hardware because (IIRC) it's harder for AMD cards to reach their maximum throughput, so they have the required headroom to see a gain from async compute (I can't find the source for this, but I believe it's been linked in one of these async threads).

As others have said, async compute is not the only feature that DX12 offers to developers. Furthermore, you can still see gains from using compute shaders even with the non-optimal implementation for Maxwell, so long as they are not interleaved and you properly batch the jobs[1].
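
The batching point from [1] can be illustrated with a toy cost model (the switch and job costs below are assumptions of mine, not Maxwell measurements):

```python
# Toy model: hardware that pays a flush penalty every time execution
# alternates between graphics ('g') and compute ('c') work.
SWITCH_MS = 0.25  # assumed cost of one context switch
JOB_MS = 0.5      # assumed cost of one job of either type

def total_time(schedule):
    """Sum job costs, adding a switch penalty at every type change."""
    time, prev = 0.0, None
    for job in schedule:
        if prev is not None and job != prev:
            time += SWITCH_MS
        time += JOB_MS
        prev = job
    return time

interleaved = total_time(["g", "c"] * 8)     # g c g c ... : 15 switches
batched = total_time(["g"] * 8 + ["c"] * 8)  # all g, then all c: 1 switch
print(interleaved, batched)   # 11.75 vs 8.25
```

Same work either way; the interleaved schedule loses purely to switch overhead, which is the intuition behind batching compute submissions on such hardware.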

I look at it this way: it's not so much that Maxwell sucks at DX12 as that it's harder to see the gains from using it, because it's harder to program for due to design decisions Nvidia made wrt efficiency. Couple that with the fact that the best case for AMD is essentially the worst case for Nvidia[1], and the result is likely what's been observed so far in DX12 titles. I still don't think the sample size is large enough to draw conclusions about Nvidia's architecture in the DX12 era, other than that it's definitely more awkward to program for.

It may be worth noting that Maxwell2 (second gen maxwell) does support async compute although in a more limited fashion than what's required by the DX12 specs. Mahigan explained it well in another async compute thread[2] (the Doom one):
CUDA applications support Asynchronous compute via Hyper-Q and since PhysX is CUDA based, it supports Asynchronous compute + graphics on Maxwell (GM20x).

Hyper-Q isn't compatible with DX12 barriers and or fences (we're not sure which one) which is why GM20x doesn't support Async compute + graphics under DX12.

Hyper-Q bypasses the Command Processor in GM20x and is handled by a dedicated ARM processor on the GM20x die. This dedicated ARM processor can feed both 3D jobs and compute jobs concurrently and in parallel to GM20x.

It is more than likely that NVIDIA were caught by a minor DX12 API spec.

As for context switches, they occur in two stages.
1. During the execution of work loads.
2. Within the SMMs themselves.

The first can be alleviated by use of Hyper-Q in CUDA applications but the second cannot due to the shared L1 texture/compute caches within an SMM. Basically, an SMM cannot be performing both compute and texture jobs at once due to shared logic. A full flush is required to switch from one context to another within an SMM.

I do wonder if async compute could be supported without the need for the first context switch in the list above with modifications to just the ARM processor to support the barriers/fences or if a more invasive approach is required.

We can speculate that if Pascal shares similarities with Maxwell that it may indeed not be possible to alleviate the issue of hardware async but this is just speculation and I don't think it should be considered a deal breaker even if it's true. We just have to wait and see at this point.

[1] http://ext3h.makegames.de/DX12_Compute.html
[2] http://forums.anandtech.com/showthread.php?p=38115150#post38115150
 

iiiankiii

Senior member
Apr 4, 2008
759
47
91
The reason AC is important is that it's in the consoles. That's it. Consoles will be using it. As a consequence, AMD will benefit indirectly from it. Nvidia, if it lacks AC hardware acceleration, won't benefit from it. It's pretty simple. Nvidia, like always, will find a way to stay competitive. They will leverage GameWorks so that Pascal remains competitive. All other GPUs (Maxwell, Kepler and GCN) will suffer because of it. Believe that.
 

thesmokingman

Platinum Member
May 6, 2010
2,302
231
106
The reason AC is important is that it's in the consoles. That's it. Consoles will be using it. As a consequence, AMD will benefit indirectly from it. Nvidia, if it lacks AC hardware acceleration, won't benefit from it. It's pretty simple. Nvidia, like always, will find a way to stay competitive. They will leverage GameWorks so that Pascal remains competitive. All other GPUs (Maxwell, Kepler and GCN) will suffer because of it. Believe that.


We don't have to believe it, because they already do that at the consumers' expense.
 

stuff_me_good

Senior member
Nov 2, 2013
206
35
91
I just don't get this async-or-not-async hate and blame war between both camps. AMD has async and nVidia doesn't; so what? Pascal will be powerful without it, and not all games are going to use it anyway, so why waste your time here trying to brainwash everybody?
 

zlatan

Senior member
Mar 15, 2011
580
291
136
I don't have much time now to write a novel, but a lot of people oversimplify this async compute thing. In reality there is no such thing in D3D12. The feature is called multi-engine, and it is mandatory to support it, even if the hardware isn't able to execute the selected pipelines in parallel. And the number of dedicated compute command engines is not useful data without knowing the underlying architecture design.
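
That distinction, mandatory API support versus optional parallel execution, can be sketched in a toy model (my own pseudostructure, not actual D3D12 code; the queue names mirror D3D12's direct/compute/copy engine types):

```python
# Toy model of D3D12 multi-engine: every conformant device must accept
# DIRECT, COMPUTE and COPY queues, but the API never promises that
# submissions on different queues actually execute in parallel.
class ToyDevice:
    QUEUE_TYPES = {"DIRECT", "COMPUTE", "COPY"}

    def __init__(self, parallel_compute):
        self.parallel_compute = parallel_compute  # hardware trait, not API trait
        self.queues = []

    def create_queue(self, kind):
        # Mandatory: queue creation succeeds on every conformant device.
        if kind not in self.QUEUE_TYPES:
            raise ValueError(kind)
        self.queues.append(kind)
        return kind

    def execute(self, graphics_ms, compute_ms):
        # Optional: overlap if the hardware can, otherwise serialize.
        # Both behaviors are conformant; only the timing differs.
        if self.parallel_compute:
            return max(graphics_ms, compute_ms)
        return graphics_ms + compute_ms

for dev in (ToyDevice(parallel_compute=True), ToyDevice(parallel_compute=False)):
    dev.create_queue("DIRECT")
    dev.create_queue("COMPUTE")
    print(dev.execute(10.0, 4.0))   # 10.0, then 14.0
```

Which is why "does it support async compute?" is the wrong question; both devices above "support" multi-engine, and only a benchmark reveals the difference.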
 

dacostafilipe

Senior member
Oct 10, 2013
797
298
136
I do think that AC is a nice feature, not only because of the possible performance gains, but also from a technical view.

But until we know if "real" Multi-engine support (aka overlapping shaders) would result in a performance increase on Maxwell, I would not call it a "fail" on nVidia.

Man, AMD's marketing did an nVidia here with the AC story and it worked xD
 

showb1z

Senior member
Dec 30, 2010
462
53
91
I do think that AC is a nice feature, not only because of the possible performance gains, but also from a technical view.

But until we know if "real" Multi-engine support (aka overlapping shaders) would result in a performance increase on Maxwell, I would not call it a "fail" on nVidia.

Man, AMD's marketing did an nVidia here with the AC story and it worked xD

They brought it on themselves, no? Claiming they support AC, telling Oxide seven months ago that they would fully implement it through a driver.
Lying to developers and customers seems like a pretty big fail to me.
 

airfathaaaaa

Senior member
Feb 12, 2016
692
12
81
Because, according to some questionable posters who like to post things they think they know, AC = DX12, and if one doesn't support it in hardware it's the end of that particular company? Sorry for my ignorance, but this is the vibe I get from VC&G these days.

Yet when I look at benchmarks and at financials/marketshare, I keep seeing the opposite happening. From my small knowledge of GPUs, it's a little more complicated than that, and async compute is one of many features in the DX12 API, just like increased draw calls, decreased overhead, etc.

Regardless of how each IHV does it (or doesn't) in their GPUs, as long as the performance is there it's literally meaningless.
Problem is, most of the features DX12 incorporates don't seem to help Maxwell cards at all, not in the way they've been bragging about since 2014... http://www.legitreviews.com/nvidia-highlights-directx-12-strengths-amd_138178
Fact is, since we saw the revised roadmap that included Pascal in early 2015 (Pascal didn't exist on any roadmap before that), we already knew they weren't going to have anything ready.