It's one of the few DX12 features that offers a clear performance advantage. Is it the end of the world not having it? No. But it does mean that Nvidia has to work harder for that 5%-10% "free" performance from async compute. Just imagine if the Fury X was 5-10% faster overall. It would have been game-changing for the perception of the card at release.
It may be 5-10% for Fury X, but I'm not sure you'd see the same gains on Nvidia cards even if they had hardware support for async compute. We have to keep in mind that, because of how the cards are designed, you see those async compute gains on AMD hardware because (IIRC) it's harder for AMD cards to reach their maximum throughput, so they have the idle execution resources required to see a gain from async compute (I can't find the source for this, but I believe it's been linked in one of these async threads).
As others have said, async compute is not the only feature that DX12 offers to developers. Furthermore, you can still see gains from using compute shaders even with the non-optimal implementation on Maxwell, so long as they are not interleaved with graphics work and you properly batch the jobs[1].
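To make the batching point concrete, here's a toy model (not real scheduling code; the job labels and the one-flush-per-transition cost are illustrative assumptions based on [1]): if each graphics/compute transition on a Maxwell-style SMM costs a full flush, interleaving jobs multiplies flushes, while batching keeps them to a minimum.

```python
# Toy model: assume each graphics<->compute transition costs a full
# context flush (per the SMM shared-cache argument in [1]).
# 'G' = graphics job, 'C' = compute job; names are illustrative.

def count_flushes(jobs):
    """Count context switches in a stream of 'G'/'C' jobs."""
    return sum(1 for a, b in zip(jobs, jobs[1:]) if a != b)

interleaved = ['G', 'C', 'G', 'C', 'G', 'C', 'G', 'C']
batched     = ['G', 'G', 'G', 'G', 'C', 'C', 'C', 'C']

print(count_flushes(interleaved))  # 7 transitions -> 7 flushes
print(count_flushes(batched))      # 1 transition  -> 1 flush
```

Same work in both streams; only the submission order changes, which is why batching compute jobs is the recommended pattern on Maxwell in [1].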
I look at it this way: it's not so much that Maxwell sucks at DX12 as that the gains are harder to realize, because the architecture is harder to program for as a result of design decisions Nvidia made for efficiency. Couple that with the fact that the best case for AMD is essentially the worst case for Nvidia[1], and the result is likely what's been observed so far in DX12 titles. I still don't think the sample size is large enough to draw conclusions about Nvidia's architecture in the DX12 era, other than that it's definitely more awkward to program for.
It may be worth noting that Maxwell 2 (second-generation Maxwell) does support async compute, although in a more limited fashion than what's required by the DX12 spec. Mahigan explained it well in another async compute thread[2] (the Doom one):
CUDA applications support Asynchronous compute via Hyper-Q and since PhysX is CUDA based, it supports Asynchronous compute + graphics on Maxwell (GM20x).
Hyper-Q isn't compatible with DX12 barriers and/or fences (we're not sure which), which is why GM20x doesn't support async compute + graphics under DX12.
Hyper-Q bypasses the Command Processor in GM20x and is handled by a dedicated ARM processor on the GM20x die. This dedicated ARM processor can feed both 3D jobs and compute jobs concurrently and in parallel to GM20x.
It is more than likely that NVIDIA were caught out by a minor detail of the DX12 API spec.
As for context switches, they occur in two stages.
1. During the execution of work loads.
2. Within the SMMs themselves.
The first can be alleviated by use of Hyper-Q in CUDA applications but the second cannot due to the shared L1 texture/compute caches within an SMM. Basically, an SMM cannot be performing both compute and texture jobs at once due to shared logic. A full flush is required to switch from one context to another within an SMM.
I do wonder whether async compute could be supported without the first context switch in the list above through modifications to just the ARM processor (to support the barriers/fences), or whether a more invasive approach is required.
We can speculate that, if Pascal shares these similarities with Maxwell, it may indeed not be possible to fix hardware async compute, but this is just speculation, and I don't think it should be considered a deal breaker even if it's true. We just have to wait and see at this point.
[1] http://ext3h.makegames.de/DX12_Compute.html
[2] http://forums.anandtech.com/showthread.php?p=38115150#post38115150