Nvidia's Performance Under Vulkan API Explored

Page 5

dogen1

Senior member
Oct 14, 2014
739
40
91
predictability might be the main thing. But

http://www.overclock.net/t/1606224/...-less-compute-parallelism-than-doom-aots-also
http://www.overclock.net/t/1606224/...llelism-than-doom-aots-also/180#post_25365469

What stands out to me is that the compute tasks wait for graphics tasks in the chart Futuremark provided.

I doubt it will transfer to actual games.

I don't consider myself knowledgeable enough to understand the implications of what we see in GPUView screenshots (nor are most people on this board, and especially not on overclock.net). I do know that it's technically a D3D tool, and results from a Vulkan application might not even be accurate in the first place.
 

Unreal123

Senior member
Jul 27, 2016
223
71
101
Doom OpenGL vs. Vulkan on a GTX 980 Ti, Ultra settings at 1080p.

20160906232748_1.jpg

DOOMx64_2016_09_06_23_21_18_184.jpg
 

Azix

Golden Member
Apr 18, 2014
1,438
67
91
Comparisons should be made during action. I think stills with nothing going on are more likely to be CPU-limited, even with high-end CPUs.
 

MajinCry

Platinum Member
Jul 28, 2015
2,495
571
136
Guess the diehards are going to cling to NVidia having better drivers, even when they have demonstrably worse 'uns. Meh.

And nae, the driver can use shader intrinsics in lieu of the shipped shaders themselves. Emphasis on most GPU-based driver optimizations being developer-made replacements of game shaders, and that shader intrinsics are, well, developer-specific shader functions.

Wee gander: https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Nothing stopping NVidia from replacing the shader code, seeing as how they're dealing with actual high-level code, and not undocumented bytecode a la AMD and Gameworks.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Doom OpenGL vs. Vulkan on a GTX 980 Ti, Ultra settings at 1080p.

That's the area that I benchmarked as well and got around a 150% increase. You'll see a much bigger difference between OpenGL and Vulkan if you point the camera towards the tower. The tower is a massive structure, so it requires lots of draw calls.

With OpenGL I was getting around 48-53 FPS at 1440p max quality, and with Vulkan I was getting around 123-127 FPS or thereabouts.
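For anyone who wants to check the "around 150%" figure against those frame rates, here is a minimal sketch of the arithmetic. The only inputs are the two ranges quoted above; the helper names are made up for illustration.

```python
def midpoint(low: float, high: float) -> float:
    """Middle of a reported frame-rate range."""
    return (low + high) / 2.0


def percent_increase(before: float, after: float) -> float:
    """Relative gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Frame-rate ranges as quoted in the post (1440p, max quality).
opengl_fps = (48.0, 53.0)
vulkan_fps = (123.0, 127.0)

# Typical case: compare the middle of each range.
typical = percent_increase(midpoint(*opengl_fps), midpoint(*vulkan_fps))

# Bounds: best OpenGL vs. worst Vulkan, and worst OpenGL vs. best Vulkan.
smallest = percent_increase(opengl_fps[1], vulkan_fps[0])
largest = percent_increase(opengl_fps[0], vulkan_fps[1])

print(f"typical gain: {typical:.1f}%")
print(f"range of gains: {smallest:.1f}% to {largest:.1f}%")
```

The midpoints work out to a bit under 150%, so the quoted figure is consistent with the reported ranges.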

I don't know if the OpenGL renderer is just inherently weak or poorly optimized, because the frame rate drop in that area seemed to be caused by my CPU downclocking due to being too idle. It was downclocking to desktop speeds, which means 1.2 GHz and very low voltage.

Before Vulkan was available, I would get around this issue by making my CPU run at 4.4 GHz constantly by selecting the performance option in Windows.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Whilst there technically isn't any specific defined way to implement Async, and thus no right or wrong as far as the current implementations are concerned, there is certainly a better or worse implementation, with AMD obviously belonging to the former and Nvidia to the latter.

The truth is that neither of us has anywhere near the depth of knowledge to provide an accurate answer as to whether AMD's or NVidia's asynchronous compute solution is better. This would require advanced knowledge of GPU architectures, as well as access to patented and confidential information.

Yes, AMD's implementation results in a larger performance increase, but how much die space and wattage did they sacrifice for these hardware ACEs compared to NVidia's implementation? If you want to be unbiased, you have to attempt to look at this from the perspective of an engineer, i.e. in terms of tradeoffs.

And I don't see how you can possibly conclude that the lack of performance boost for Pascal from async is ID's fault and not Nvidia's.

I haven't concluded anything, as ID haven't even released the update to enable AC on Pascal yet. I actually have faith in ID as they have a history of top notch software engineering.

But since Time Spy does give about a 7.5% boost on a GTX 1080 with AC enabled, I would at least expect something similar with Vulkan.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
predictability might be the main thing. But

http://www.overclock.net/t/1606224/...-less-compute-parallelism-than-doom-aots-also
http://www.overclock.net/t/1606224/...llelism-than-doom-aots-also/180#post_25365469

What stands out to me is that the compute tasks wait for graphics tasks in the chart Futuremark provided.

I doubt it will transfer to actual games.

Right, as if anyone in that thread can even read that graph... :rolleyes:

People just want to believe what they want to believe. AMD fans were so against the notion that Pascal actually had concurrent asynchronous compute capability that they came up with all sorts of conjecture that NVidia must be lying again. And then Time Spy was released, and all of a sudden we had a method of gauging performance with asynchronous compute enabled or disabled for Pascal, with Pascal evidently getting an increase with AC enabled, as you would expect.

So now, predictably, the benchmark itself is false because for some reason only AMD is allowed to have concurrent asynchronous compute capability, let alone actually benefit from it :D
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
I haven't concluded anything, as ID haven't even released the update to enable AC on Pascal yet. I actually have faith in ID as they have a history of top notch software engineering.

The 1080 does appear to do work using async compute in Doom: http://imgur.com/a/hbX8t

So there goes that theory. I guess the missing async compute was driver related after all, along with the rest of the Vulkan performance issues. id had done the work and we were just waiting for Nvidia to release drivers...

Called it back in early August - https://forums.anandtech.com/thread...er-vulkan-api-explored.2482691/#post-38415169

Sounds like you guys all still owe id an apology for claiming they failed to support Nvidia cards; it was Nvidia themselves holding back their cards with lacking drivers.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
^^ Nope!

Asynchronous compute is not enabled for Pascal in Doom. The developer already said as much on their website (I linked it a few pages back), that it was a work in progress which would eventually be implemented.

And none of the recent patches have mentioned AC for Pascal being turned on. So either you're reading the graph incorrectly, or the graph itself isn't showing accurate information.
 

zlatan

Senior member
Mar 15, 2011
580
291
136
In Doom, async compute is vendor neutral. It will run if the COMPUTE_BIT is present in an independent queue.
But this won't help Pascal, because the post-process overlap is a low-latency implementation, and it calls present from the COMPUTE_BIT queue. Pascal doesn't support this, so NV needs a sync point first in the main queue with the GRAPHICS_BIT flag, and then they can do the present there. This will add latency and minimize the performance gain. It isn't really useful in the end, because you may gain 2-3% performance, but you get a lot of additional latency with it.
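
A minimal sketch of that check, written in Python rather than against the real API so it runs on its own. The three flag values match VkQueueFlagBits in the Vulkan headers; everything else (the QueueFamily class, the can_present field, the two function names and the device layouts) is made up for illustration and says nothing about what any particular GPU or driver actually reports. In a real application the flags would come from vkGetPhysicalDeviceQueueFamilyProperties and the present support from vkGetPhysicalDeviceSurfaceSupportKHR.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Bit values of VkQueueFlagBits from the Vulkan headers.
VK_QUEUE_GRAPHICS_BIT = 0x1
VK_QUEUE_COMPUTE_BIT = 0x2
VK_QUEUE_TRANSFER_BIT = 0x4


@dataclass(frozen=True)
class QueueFamily:
    """Hypothetical stand-in for one queue family reported by a driver."""
    flags: int         # bitmask of VK_QUEUE_* bits
    can_present: bool  # assumed: whether this family can present to the surface


def find_async_compute_family(families: Sequence[QueueFamily]) -> Optional[int]:
    """Index of the first family with COMPUTE_BIT but no GRAPHICS_BIT, else None."""
    for index, family in enumerate(families):
        has_compute = bool(family.flags & VK_QUEUE_COMPUTE_BIT)
        has_graphics = bool(family.flags & VK_QUEUE_GRAPHICS_BIT)
        if has_compute and not has_graphics:
            return index
    return None


def plan_present(families: Sequence[QueueFamily]) -> str:
    """Describe where present would be issued under this toy model."""
    compute = find_async_compute_family(families)
    if compute is None:
        return "no separate compute queue: everything runs on the graphics queue"
    if families[compute].can_present:
        return "overlap on compute queue, present from compute queue"
    return "overlap on compute queue, sync to graphics queue, then present"


ALL = VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT
COMPUTE_ONLY = VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT

# Invented layouts, not measurements from real hardware.
layout_a = [QueueFamily(ALL, True), QueueFamily(COMPUTE_ONLY, True)]
layout_b = [QueueFamily(ALL, True), QueueFamily(COMPUTE_ONLY, False)]
layout_c = [QueueFamily(ALL, True), QueueFamily(VK_QUEUE_TRANSFER_BIT, False)]

for name, layout in (("a", layout_a), ("b", layout_b), ("c", layout_c)):
    print(name, "->", plan_present(layout))
```

Layout b is the case described above: the separate compute queue exists, so the overlap can run, but the extra sync back to the graphics queue before present is where the added latency would come from.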
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
^^ Nope!

Asynchronous compute is not enabled for Pascal in Doom. The developer already said as much on their website (I linked it a few pages back), that it was a work in progress which would eventually be implemented.

If it wasn't enabled, the "Hardware Queue Compute_1" queue wouldn't exist at all.

The developer never said that it wasn't programmed for Pascal, just that it wasn't enabled yet.

Currently asynchronous compute is only supported on AMD GPUs and requires DOOM Vulkan supported drivers to run. We are working with NVIDIA to enable asynchronous compute in Vulkan on NVIDIA GPUs.

That could simply mean they are helping Nvidia with their drivers. We've seen that their drivers were lacking and now the compute queue shows up and has work in it.

I've previously shown that async compute is completely vendor neutral with code snippets from DX12. I haven't done any Vulkan work, but I don't see why it would be vendor specific there.

And none of the recent patches have mentioned AC for Pascal being turned on

Once again, it doesn't necessarily need a patch, just a driver that says it supports async compute, which is what appears to have happened.
 