Pascal now supports DX12 resource binding tier 3 with latest drivers (384.76)


Krteq

Senior member
May 22, 2015
991
671
136
Because Microsoft changed the final resource binding specs too late. There was a lot of talk about what specs would be useful for the future, but the IHVs had to develop an implementation for the release of the D3D12 API. NVIDIA just wasn't able to react to the final changes, and this opened the door for some emulation. Their hardware still isn't fully bindless, but they can emulate TIER_3 support with some overhead on the CPU side. But to do this, they need to write an emulation layer between the driver and the hardware, and this required a lot of code changes to their original implementation.
Thanks for the explanation, Zlatan.

Can you somehow confirm that Paxwell cards have support for resource binding Tier 3 implemented now?
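
For what it's worth, anyone on 384.76 can at least read back what the driver claims with a few lines of D3D12. A minimal sketch (default adapter, Windows SDK headers assumed; this only shows the reported tier, not how it's implemented under the hood):

Code:
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
#include <cstdio>
#pragma comment(lib, "d3d12.lib")

using Microsoft::WRL::ComPtr;

int main()
{
    // Create a device on the default adapter; FL 11_0 is the D3D12 minimum.
    ComPtr<ID3D12Device> device;
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device))))
    {
        printf("No D3D12 device available.\n");
        return 1;
    }

    // Ask the driver which resource binding / resource heap tiers it reports.
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                              &options, sizeof(options))))
    {
        printf("ResourceBindingTier = %d\n", options.ResourceBindingTier); // 1, 2 or 3
        printf("ResourceHeapTier    = %d\n", options.ResourceHeapTier);    // 1 or 2
    }
    return 0;
}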
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
SM6.1 is not a big change compared to SM6.0. It's just some intrinsic instructions to read the barycentric coordinates from the hardware rasterizer. It will require a lot of work on the software side for NV and Intel, but the hardware can support it. AMD already does manual interpolation, so they can add this feature very easily.

Shader model 6.1 also brings in support for SV_ViewID ...

Microsoft should focus on getting trinary ops and cube intrinsics as optionals for SM6.2 ...
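
If anyone wants to see what their runtime/driver pair reports for SM6.x and the related hardware bits, something like this should do it (hypothetical helper, assuming the device from the snippet earlier in the thread and a recent Windows SDK; on older runtimes the shader model query can simply fail instead of clamping):

Code:
#include <windows.h>
#include <d3d12.h>
#include <cstdio>

// Hypothetical helper: print the highest supported shader model plus the
// barycentrics and view-instancing (SV_ViewID) capability bits.
void ReportShaderModelCaps(ID3D12Device* device)
{
    // The app fills in the highest model it knows about; the runtime lowers
    // it to what the driver supports (or fails if it doesn't know SM 6.1).
    D3D12_FEATURE_DATA_SHADER_MODEL sm = { D3D_SHADER_MODEL_6_1 };
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_SHADER_MODEL, &sm, sizeof(sm))))
        printf("Highest shader model: 0x%x\n", (unsigned)sm.HighestShaderModel);

    // SM6.1-era hardware bits live in OPTIONS3.
    D3D12_FEATURE_DATA_D3D12_OPTIONS3 opt3 = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS3, &opt3, sizeof(opt3))))
    {
        printf("BarycentricsSupported: %d\n", opt3.BarycentricsSupported);
        printf("ViewInstancingTier:    %d\n", opt3.ViewInstancingTier);
    }
}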
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
Because Microsoft changed the final resource binding specs too late. There was a lot of talk about what specs would be useful for the future, but the IHVs had to develop an implementation for the release of the D3D12 API. NVIDIA just wasn't able to react to the final changes, and this opened the door for some emulation. Their hardware still isn't fully bindless, but they can emulate TIER_3 support with some overhead on the CPU side. But to do this, they need to write an emulation layer between the driver and the hardware, and this required a lot of code changes to their original implementation.

If Nvidia tried that it would eat up their registers like crazy ...

The only reason fully bindless works as efficiently as it does on GCN is that the hardware assumes you won't be dynamically indexing the resource descriptors in a divergent fashion, so it only has to keep 8 or 16 registers' worth of descriptor in the SGPRs ...

Nvidia, to my knowledge, does not have a scalar unit like GCN's to store the resource descriptors, so they'd have to store (8/16 registers * active warp count) if they were doing dynamic and divergent indexing ... (that can be as much as 1024 32-bit registers, since Pascal can have a maximum of 64 warps resident!)
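
Rough math behind that figure, using the numbers above as assumptions (8-16 32-bit registers per descriptor set, up to 64 resident warps per Pascal SM):

Code:
// Back-of-the-envelope worst case, per SM, if every resident warp had to
// keep its own copy of the descriptor set in vector registers.
constexpr unsigned kRegsPerDescriptorSet = 16; // assumed worst case (8 or 16 per the post)
constexpr unsigned kMaxResidentWarps     = 64; // Pascal SM occupancy limit
constexpr unsigned kWorstCaseRegs        = kRegsPerDescriptorSet * kMaxResidentWarps;

static_assert(kWorstCaseRegs == 1024, "matches the 1024-register figure above");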
 

Krteq

Senior member
May 22, 2015
991
671
136
The only reason fully bindless works as efficiently as it does on GCN is that the hardware assumes you won't be dynamically indexing the resource descriptors in a divergent fashion, so it only has to keep 8 or 16 registers' worth of descriptor in the SGPRs ...

Nvidia, to my knowledge, does not have a scalar unit like GCN's to store the resource descriptors, so they'd have to store (8/16 registers * active warp count) if they were doing dynamic and divergent indexing ... (that can be as much as 1024 32-bit registers, since Pascal can have a maximum of 64 warps resident!)
This is exactly what I was talking about in the context of hardware limitations :)
 

Guru

Senior member
May 5, 2017
830
361
106
Games still have to be programmed to take advantage of the DX12 resource binding model, but apparently it's a big deal for GPU performance in DX12. The first game to use the DX12 binding model might be AC Origins, based on this slide:

[slide: XwPvoi.jpg]


Here's an excellent video about the DX12 resource binding model. It's very technical, but it drives the point home:


Yeah, my question was more about performance rather than just 'feature' support on paper with no actual performance gains.

So far AMD seems to be winning hard with the low-level APIs, so I just wonder if this will do anything to help Nvidia's performance, or whether it's just another case of fake support with no real performance benefit.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
How are drivers not legit when GPU chip designers are the only ones able to access the firmware and the BIOS on Windows?

What I meant was that the coincidence of Fermi now getting DX12 support and Pascal and Maxwell now getting resource binding tier 3 support is just too great to chalk it up to a driver error or mistake.

To me it's apparent that NVidia has been doing some serious driver overhaul recently as the number of DX12 games increases. The question is whether it's possible to upgrade the resource binding tier through emulation, as Zlatan suggests.

Whether the driver is giving out correct info or not is another matter entirely, much like how earlier Nvidia drivers reported resource heap tier 2 support when it was later fixed to show tier 1 ...

We won't know for sure that Maxwell and Pascal support fully bindless unless we do some more testing, by creating a test app that enumerates a D3D12 device with resource binding tier 3 and binds some simple resources to a pipeline without crashing ...

I guess we'll have to see. I hope an NVidia rep responds to this so we can finally put the issue to rest.

If Kepler reports fully bindless as being supported as well, then it's almost certainly a driver mistake ...

PCGH.de is reporting Kepler as having tier 2. But they also state that Fermi is tier 3, which is certainly a mistake based on what the DX12 checker application is reporting.
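
A sketch of the kind of probe being suggested above: on top of the tier query, try to create a root signature that only a genuinely tier 3 device should accept. Unbounded SRV ranges already work on tier 2; unbounded CBV/UAV/sampler ranges are the tier 3 addition, so I'd expect creation to fail on anything lower (assumed behaviour, not a full test app; the device comes from the earlier snippet):

Code:
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
#include <climits>
#pragma comment(lib, "d3d12.lib")

using Microsoft::WRL::ComPtr;

// Try to create a root signature containing an unbounded CBV descriptor range.
HRESULT TryUnboundedCbvRootSignature(ID3D12Device* device,
                                     ComPtr<ID3D12RootSignature>& rootSig)
{
    D3D12_DESCRIPTOR_RANGE range = {};
    range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV;
    range.NumDescriptors = UINT_MAX;              // UINT_MAX == "unbounded"
    range.BaseShaderRegister = 0;
    range.RegisterSpace = 0;
    range.OffsetInDescriptorsFromTableStart = 0;

    D3D12_ROOT_PARAMETER param = {};
    param.ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
    param.DescriptorTable.NumDescriptorRanges = 1;
    param.DescriptorTable.pDescriptorRanges = &range;
    param.ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL;

    D3D12_ROOT_SIGNATURE_DESC desc = {};
    desc.NumParameters = 1;
    desc.pParameters = &param;

    ComPtr<ID3DBlob> blob, error;
    HRESULT hr = D3D12SerializeRootSignature(&desc, D3D_ROOT_SIGNATURE_VERSION_1,
                                             &blob, &error);
    if (FAILED(hr))
        return hr;

    // A real test would go on to build a PSO that indexes a big descriptor
    // array and actually draw; this only checks that the layout is accepted.
    return device->CreateRootSignature(0, blob->GetBufferPointer(), blob->GetBufferSize(),
                                       IID_PPV_ARGS(&rootSig));
}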
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Do you happen to know where the tools that some reviewers use for bandwidth tests and such are on those forums? I've looked in the past and couldn't find them

Bandwidth tests? You mean GPU bandwidth or CPU bandwidth?
 

zlatan

Senior member
Mar 15, 2011
580
291
136
If Nvidia tried that it would eat up their registers like crazy ...

The only reason fully bindless works as efficiently as it does on GCN is that the hardware assumes you won't be dynamically indexing the resource descriptors in a divergent fashion, so it only has to keep 8 or 16 registers' worth of descriptor in the SGPRs ...

Nvidia, to my knowledge, does not have a scalar unit like GCN's to store the resource descriptors, so they'd have to store (8/16 registers * active warp count) if they were doing dynamic and divergent indexing ... (that can be as much as 1024 32-bit registers, since Pascal can have a maximum of 64 warps resident!)
Yes, this solution obviously has its own limitations. For efficient support they need new hardware.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Yes, this solution obviously has its own limitations. For efficient support they need new hardware.

I doubt it's being emulated, as, from what Buzzkiller says, that would likely be disastrous for performance. It's probably either a driver error or completely legit. I asked an NVidia rep on the GeForce forums to comment on it, so we'll see what he says.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
What I meant was that the coincidence of Fermi now getting DX12 support and Pascal and Maxwell now getting resource binding tier 3 support is just too great to chalk it up to a driver error or mistake.

FYI Fermi's support looks pretty awful from actual testing:

I can confirm that D3D12 works in some games.

However it runs much worse than D3D11.

Deus Ex: MD    Average     Minimum     Maximum
D3D11          54.7 FPS    43.3 FPS    68.3 FPS
D3D12          41.8 FPS    34.0 FPS    52.6 FPS
What those results don't show is that D3D12 was also stuttering badly.

The Hitman benchmark would not complete in either D3D11 or D3D12, it kept crashing about halfway through the test.

However the framerate up to that point was halved in D3D12, averaging 80 FPS in D3D11, and 40 FPS in D3D12.



https://www.reddit.com/r/hardware/c...tly_rolls_out_dx12_support_for_fermi/djnf0xz/

This guy got a whole 2-3 fps avg in Timespy with his GTX 470:

https://www.reddit.com/r/hardware/c...tly_rolls_out_dx12_support_for_fermi/djmdnjn/
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
Yeah, my question was more about performance rather than just 'feature' support on paper with no actual performance gains.

So far AMD seems to be winning hard with the low-level APIs, so I just wonder if this will do anything to help Nvidia's performance, or whether it's just another case of fake support with no real performance benefit.

Having support at the driver level allows all these features to be used in games now; better hardware support can then be added with future GPUs. Now they can see how it'll be used and take advantage of the feature in future designs.
 

Guru

Senior member
May 5, 2017
830
361
106
Having support at the driver level allows all these features to be used in games now; better hardware support can then be added with future GPUs. Now they can see how it'll be used and take advantage of the feature in future designs.

Thanks for the answer. I guess that's all cool for Nvidia, but in terms of actual gains on current hardware and in future games it's worthless. I guess those buying Volta and its derivatives will get better DX12 performance.

AMD is miles ahead in terms of low-level API architecture; I've yet to see any DX12 game that Nvidia is winning. For example, the 1060 6GB loses all the time against the RX 580 at stock settings. With the 480 it's closer, but with the 580 it's no competition.

In The Division, in actual gameplay, the 580 can be up to 10 fps faster; in ROTTR it can be up to 7-8 fps faster in gameplay; in Deus Ex up to 12 fps faster. Games like GOW3 are similar, but again, in actual gameplay the 580 can be up to 7-8 fps faster.

And on their 1080/1080 Ti, DX12 is always slower than DX11.
 

Spjut

Senior member
Apr 9, 2011
928
149
106
FYI Fermi's support looks pretty awful from actual testing:

https://www.reddit.com/r/hardware/c...tly_rolls_out_dx12_support_for_fermi/djnf0xz/

This guy got a whole 2-3 fps avg in Timespy with his GTX 470:

https://www.reddit.com/r/hardware/c...tly_rolls_out_dx12_support_for_fermi/djmdnjn/

The driver itself might not be that optimized, but I'd also guess that Fermi is at a disadvantage due to the DX12 games only having gotten optimized paths for Kepler and later
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
Thanks for the answer. I guess that's all cool for Nvidia, but in terms of actual gains on current hardware and in future games it's worthless. I guess those buying Volta and its derivatives will get better DX12 performance.

AMD is miles ahead in terms of low-level API architecture; I've yet to see any DX12 game that Nvidia is winning. For example, the 1060 6GB loses all the time against the RX 580 at stock settings. With the 480 it's closer, but with the 580 it's no competition.

In The Division, in actual gameplay, the 580 can be up to 10 fps faster; in ROTTR it can be up to 7-8 fps faster in gameplay; in Deus Ex up to 12 fps faster. Games like GOW3 are similar, but again, in actual gameplay the 580 can be up to 7-8 fps faster.

And on their 1080/1080 Ti, DX12 is always slower than DX11.

I don't really know how to take this concept of Nvidia losing in DX12 when they have four cards that are faster than anything AMD has in DX12 (Titan Xp, 1080 Ti, 1080 and 1070). While the 1060 and lower don't do as well in comparison to AMD's offerings, they still have several cards that perform better. It is clear that AMD's offerings do better in DX12 relative to DX11, but they aren't winning in DX12 as far as I can see.
 

Guru

Senior member
May 5, 2017
830
361
106
I don't really know how to take this concept of Nvidia losing in DX12 when they have four cards that are faster than anything AMD has in DX12 (Titan Xp, 1080 Ti, 1080 and 1070). While the 1060 and lower don't do as well in comparison to AMD's offerings, they still have several cards that perform better. It is clear that AMD's offerings do better in DX12 relative to DX11, but they aren't winning in DX12 as far as I can see.
Well, their 1070, 1080 and 1080 Ti lose performance under DX12 compared to the DX11 renderer in pretty much all games. Even some of the internal game benchmarks they do okay in seem misleading, as in actual gameplay the fps difference is hugely in favor of DX11 over DX12. AMD cards, meanwhile, offer much better performance under DX12 and Vulkan, even in DX11-optimized games like The Division, where they go head to head with Nvidia in the DX11 API but take the lead easily under DX12.

Sniper Elite 4, which has the most advanced DX12 implementation to date, sees the Nvidia products gain between 1 and 5 fps, while AMD cards gain from 5 to 15 fps. Heck, the RX 580, especially overclocked ones, comes very close to the 1070 in that game. So a $240 card competing with a $350 one in certain games under DX12.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
https://forum.beyond3d.com/threads/direct3d-feature-levels-discussion.56575/page-9#post-1840641

The Feature Checker was updated to a July 2017 version, but I can't download it because the B3D forums don't allow it unless you have an account with a certain number of posts or whatever.

https://ufile.io/kivae

---------------------------
Checksum information
---------------------------
Name: D3D12CheckFeatureSupport-july-2.zip
Size: 31934 bytes (0 MB)

CRC32: B424633D

CRC64: 7A7AAFF16CBA61B7

SHA256: 7D8B6041DEF1C80B88D8DE8872DF16B74BB7957AE3B2CD86880DE3991B3ED256

SHA1: 2A707B583A982293DF4D56E4F3AD92E2BDA478E5

BLAKE2sp: 0D1588B0B9D2562B9FA6CE5014894D3B8F9E19360E603CEB857C2B597044001D

I haven't used that file host before, so there are a few hashes to compare against to make sure they didn't tamper with it.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
Sniper Elite 4, which has the most advanced DX12 implementation to date, sees the Nvidia products gain between 1 and 5 fps, while AMD cards gain from 5 to 15 fps. Heck, the RX 580, especially overclocked ones, comes very close to the 1070 in that game. So a $240 card competing with a $350 one in certain games under DX12.

Yeah, the optimization in that game is amazing; it runs great in 4K... mGPU works very well too, a 480 + 580 can reach a 1080 Ti (http://media.gamersnexus.net/images/media/2017/GPUs/vega/fe/charts/vega-fe-sniper-4k.png)
 

SPBHM

Diamond Member
Sep 12, 2012
5,056
409
126
Fermi supporting DX12 and WDDM 2.2 is pretty amazing. Even if performance is not optimal, the fact that it's compatible is great. It took them like two years longer than they promised, I think, but it was delivered.

Meanwhile, my DX11 Radeon can't even play lots of DX11 games properly due to lack of support with its 2015 drivers.
 

psolord

Golden Member
Sep 16, 2009
1,913
1,192
136
Fermi supporting DX12 and WDDM 2.2 is pretty amazing. Even if performance is not optimal, the fact that it's compatible is great. It took them like two years longer than they promised, I think, but it was delivered.

Meanwhile, my DX11 Radeon can't even play lots of DX11 games properly due to lack of support with its 2015 drivers.

Same here. My 5850 has been running on fumes for years. Mass Effect: Andromeda does not even run. Thankfully, quite a few games still do.

And before anyone bites my head off for mentioning such an old card: would it be such a big financial burden if they kept it on a one-driver-every-six-months update cycle?

Especially for the Evergreen series, which put Nvidia to shame.

It's not only a support issue; it's mostly a politics issue, and about the face the company shows to the world.

I'm also sad that my 570 broke down and that I didn't manage to fix it with two bakings and capacitor changes. :( It would be quite fun to do some vintage testing.