Async Compute - Pascal launch demo (videos)

Status
Not open for further replies.

DamZe

Member
May 18, 2016
188
84
101
It doesn't seem to be implemented the way it is on GCN, nor does it seem as efficient. The preemption looks like a band-aid approach (pause one task to work on another) rather than moving all the tasks into separate queues for faster processing. Does it bring the same performance gains as seen on the AMD cards? Sure, the 1080 is fast in the few DX12 titles benchmarked, but how do we know it isn't just the very high clocks that are carrying it?
 

csbin

Senior member
Feb 4, 2013
904
605
136
81664.png



81658.png
 

Det0x

Golden Member
Sep 11, 2014
1,455
4,948
136
rDsebQe.png


http://www.bitsandchips.it/52-engli...scal-in-trouble-with-asyncronous-compute-code

Seems like they were semi correct 2 months ago :)

Pascal in trouble with Asynchronous Compute code

According to our sources, NVIDIA's next GPU microarchitecture, Pascal, will be in trouble if it has to make heavy use of Asynchronous Compute code in video games.

Broadly speaking, Pascal will be an improved version of Maxwell, especially in FP64 performance, but not in Asynchronous Compute performance. NVIDIA will bet on raw power instead of Asynchronous Compute abilities. This means that Pascal cards will be highly dependent on driver optimizations and game developers' kindness. So GameWorks optimizations will play a fundamental role in the company's strategy.

*edit*

I can't say too much on this topic, but Pascal will be an improvement over Maxwell, especially at this feature. But no, it won't have GCN-like capabilities. It will be close to GCN 1.0, but nothing more.
There are 2 ACE units in the 7970 and 8 units in the 290, as I recall (?)

Seems like he knew what he was talking about :)
 

Azix

Golden Member
Apr 18, 2014
1,438
67
91
It sounds like all they are doing is changing the assigned task on the fly. So instead of 50% stuck on compute and 50% stuck on graphics, they can switch 25% of the compute to graphics if the compute finishes early.

Better, but it's probably just going to fix their loss in performance.

Is this purely driver side? Is this the "driver" we were waiting for? Sounds like it could be hardware.
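
For what it's worth, the difference between a fixed split and on-the-fly reassignment can be put into a toy model. This is only a sketch under loud assumptions: work is perfectly divisible, reassigning units costs nothing, and the function names and all numbers are made up for illustration. None of it is taken from NVIDIA's presentation or from any real driver.

```python
def static_partition_time(graphics_work, compute_work, units, graphics_share=0.5):
    """Time to finish both workloads when the unit split never changes.

    Work is in unit-milliseconds (hypothetical measure); units that finish
    early simply sit idle until the other side is done.
    """
    g_units = units * graphics_share
    c_units = units - g_units
    return max(graphics_work / g_units, compute_work / c_units)


def dynamic_balance_time(graphics_work, compute_work, units, graphics_share=0.5):
    """Same starting split, but idle units join the other workload.

    Assumption: reassignment is free and work divides evenly across units.
    """
    g_units = units * graphics_share
    c_units = units - g_units
    t_graphics = graphics_work / g_units
    t_compute = compute_work / c_units
    first_done = min(t_graphics, t_compute)
    # Work still left on the slower side when the faster side finishes.
    if t_graphics > t_compute:
        remaining = graphics_work - first_done * g_units
    else:
        remaining = compute_work - first_done * c_units
    # From here on, every unit works on what is left.
    return first_done + remaining / units


# Made-up frame: 300 unit-ms of graphics, 100 unit-ms of compute, 20 units.
print(static_partition_time(300, 100, 20))  # 30.0
print(dynamic_balance_time(300, 100, 20))   # 20.0
```

In this model the dynamic case only recovers time that the static split would have wasted as idle units, which lines up with the "fix their loss in performance" reading above; it says nothing about how real hardware behaves.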
 
Mar 10, 2006
11,715
2,012
126
It sounds like all they are doing is changing the assigned task on the fly. So instead of 50% stuck on compute and 50% stuck on graphics, they can switch 25% of the compute to graphics if the compute finishes early.

Better, but it's probably just going to fix their loss in performance.

Is this purely driver side? Is this the "driver" we were waiting for?

Not driver side, this is in hardware.
 

thesmokingman

Platinum Member
May 6, 2010
2,302
231
106
So the new hotness is already architecturally dated?


If things pan out the way AMD is playing their cards, yeah. The brand new hotness is barely faster in true DX12 games than the competitor's last gen. It could be a problem when the vegans attack.
 

Udgnim

Diamond Member
Apr 16, 2008
3,680
124
106
Looks like DirectX 12 is a performance hit. Just like Maxwell on titles with async.

http://arstechnica.com/gadgets/2016/05/nvidia-gtx-1080-review/2/

Instead of asynchronous shaders, Pascal uses a technique called pre-emption. Effectively, this enables the GPU to prioritise one set of more complex tasks over another (for example, preferencing compute tasks like physics over graphics). The trouble is, long-running compute jobs can end up monopolising the GPU. This was a particular issue for Maxwell, where the GPU could only pre-empt tasks at the end of each command. That means extra time spent waiting for the command to end, increasing latency.

Pascal implements pixel level pre-emption, allowing the GPU to pause smaller tasks at any point in order to save the status of them to memory while bigger tasks complete. It's an interesting solution, but it still doesn't replace the performance of hardware-based asynchronous shaders. Fortunately for Nvidia, even with the increasing number of DX12 games being released, few of them take full advantage of asynchronous shaders. Fewer still have shown any real improvement in performance over DX11.

That will change over time (spoiler: it does a little here too), but there's more work required on the developer side to support the low-level hardware features of DX12. Right now, most simply aren't bothering. That's not to mention that despite its lack of async, Nvidia has one very big advantage over the competition: clock speed.

TLDR: Pascal does not have hardware async support. It has pre-emption which allows it to better focus its time between compute & graphics processing, but it can't handle both simultaneously.
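
To make the granularity point from the quoted article concrete, here is a minimal sketch of how long a high-priority request would wait if the GPU can only switch at fixed boundaries. The function name and every number are hypothetical, and the model ignores the cost of saving and restoring state, so treat it as an illustration of the idea rather than a description of Maxwell or Pascal.

```python
def preemption_wait_us(arrival_us, boundary_interval_us):
    """Microseconds a request waits for the next preemption boundary.

    Assumes boundaries fall at exact multiples of boundary_interval_us,
    measured from the start of the running task. Integer math only.
    """
    # Round the arrival time up to the next boundary.
    boundaries_passed = (arrival_us + boundary_interval_us - 1) // boundary_interval_us
    next_boundary_us = boundaries_passed * boundary_interval_us
    return next_boundary_us - arrival_us


# Hypothetical case: a request arrives 3250 us into a long-running job.
coarse = preemption_wait_us(3250, 8000)  # can only switch when an 8 ms command ends
fine = preemption_wait_us(3250, 100)     # can switch every 100 us (made-up figure)
print(coarse, fine)  # 4750 50
```

Finer boundaries cut the wait before the switch, but the two jobs still take turns in this model; nothing here runs graphics and compute at the same time.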
 

renderstate

Senior member
Apr 23, 2016
237
0
0
It's quite entertaining to see a bunch of fanboys who until yesterday didn't even know what preemption is *completely* misunderstanding (or pretending to?) NVIDIA's presentation just to advance their agenda, after cherry-picking some numbers here and there, just for confirmation bias.

There is no point having technical discussions on this forum. Beyond3D here I come.

Moderators: please close this thread. It's just a steaming pile of FUD and rubbish.
 

airfathaaaaa

Senior member
Feb 12, 2016
692
12
81
Typo :hmm:

OR are you alluding to the dawn of walking dead being upon us :'(

Might as well pack much more meat err I mean GPU horsepower as much as possible :D

"now you have a chance to win a vega 10 with every lettuce you buy"
D: :D
 

thesmokingman

Platinum Member
May 6, 2010
2,302
231
106
Vegans have been on my mind since my wife turned Vegan. Damn vegans, they ruin my meatatarian diet needs!
 

USER8000

Golden Member
Jun 23, 2012
1,542
780
136
It's quite entertaining to see a bunch of fanboys who until yesterday didn't even know what preemption is *completely* misunderstanding (or pretending to?) NVIDIA's presentation just to advance their agenda, after cherry-picking some numbers here and there, just for confirmation bias.

There is no point having technical discussions on this forum. Beyond3D here I come.

Moderators: please close this thread. It's just a steaming pile of FUD and rubbish.

Why are you raging? That's also a pretty naive excuse. Do you really think fanboys don't exist on every forum? On your beloved Beyond3D, have you gone and checked that every poster is 100% neutral?

Have you even asked somebody whether you have confirmation bias too? Nobody can judge that about themselves; it takes a third party.

This is the discussion on Beyond3D:

Isn't async compute simply the fact that a GPU can run compute shaders independently and asynchronously with graphics workloads?
If so, doing it inter-shader instead of intra-shader should be sufficient to meet that definition.
Nobody said that it has to be the most efficient or fastest implementation in existence. Similarly, nobody said that enabling async compute has to be faster than not enabling it: if a particular implementation is such that it can't find inefficiencies to exploit, then so be it.
Yes, cooperative scheduling is perfectly sufficient to fulfill the specification. Maxwell did that already, and for that matter you can do that on any hardware.

But the problem with Maxwell was that it would essentially flush the entire graphics pipeline, all SMMs and stall the command processors, in order to reconfigure the hardware for compute. That made the switch extremely expensive, as the GPU utilization suffers while the remaining draw calls complete, and the GPC isn't allowed to dispatch anything new.

The specs said nowhere that you had to gain anything from Async Compute, but that penalty should not have happened either.
Fine. That's Maxwell. So with Pascal, they're able to avoid this flush and reassign the SMs dynamically? That's a major improvement, right? So why the complaints? It's not perfect, it doesn't have the granularity of AMD. It's not the first time that there have been features that worked better for one vendor than the other.
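
The flush penalty described in the Beyond3D posts above can be sketched with a toy timeline. Assumptions, all hypothetical: each unit finishes its last draw call at a known time, compute work divides evenly across whatever units are free, and switching a unit over costs nothing. The function names and figures are invented for illustration and are not measurements of any GPU.

```python
def compute_done_with_flush(draw_finish_ms, compute_work):
    """Compute starts only after every unit has drained its draw calls."""
    drain_end = max(draw_finish_ms)
    return drain_end + compute_work / len(draw_finish_ms)


def compute_done_with_reassignment(draw_finish_ms, compute_work):
    """Each unit picks up compute work the moment its own draws finish."""
    times = sorted(draw_finish_ms)
    remaining = compute_work
    level = times[0]
    for free_units in range(1, len(times) + 1):
        # free_units units are available from `level` until the next one frees up.
        next_level = times[free_units] if free_units < len(times) else float("inf")
        capacity = (next_level - level) * free_units
        if remaining <= capacity:
            return level + remaining / free_units
        remaining -= capacity
        level = next_level


# Four units finish their draws at 2, 4, 6 and 10 ms; 20 unit-ms of compute follows.
finish = [2, 4, 6, 10]
print(compute_done_with_flush(finish, 20))         # 15.0
print(compute_done_with_reassignment(finish, 20))  # 10.5
```

The gap between the two results is exactly the unit-time left idle during the drain (18 unit-ms spread over 4 units), which is the utilization loss the posts attribute to the full flush.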
So basically they are agreeing with some of what is being said here: Nvidia does not really gain from async shaders with Pascal yet, but neither does it lose performance.

It also fits with what Zlatan and Fottemberg were saying before, and even people on Beyond3D agree AMD is still stronger with regard to this ONE feature, unless you think they are all pro-AMD.

But they also say it is only one feature that needs to be optimised for, as multiple devs have stated, and there are other ways to gain performance too. It's not the be-all and end-all of DX12 features.

AMD still does not support conservative rasterization, for example.
 

airfathaaaaa

Senior member
Feb 12, 2016
692
12
81
But the problem remains: at 4K it does.

I guess the card runs "out of steam" at that stage.
 

Despoiler

Golden Member
Nov 10, 2007
1,968
773
136
It's quite entertaining to see a bunch of fanboys who until yesterday didn't even know what preemption is *completely* misunderstanding (or pretending to?) NVIDIA's presentation just to advance their agenda, after cherry-picking some numbers here and there, just for confirmation bias.

There is no point having technical discussions on this forum. Beyond3D here I come.

Moderators: please close this thread. It's just a steaming pile of FUD and rubbish.

Pascal Dynamic Load Balancing is driver based. The presenter says so in the async portion of the presentation. He also said the async compute graphics demo was done in DX11. Ahh, wtf? Pascal preemption is hardware based.

One person on the forum confuses preemption with async, but now it's all of us? Ok guy!
 