Pascal now supports DX12 resource binding tier 3 with latest drivers (384.76)

ElFenix · Jul 7, 2017

Quit being dicks. You two are talking past each other and even have large swathes of agreement but because you for some reason see each other as the opposition who must be destroyed you're incapable of seeing it.

AT Moderator ElFenix

Carfax83 said:
Seriously, get a clue. I actually own the hardware, and many of these games so I speak from firsthand experience. I've actually gained as much 150% or more with Doom Vulkan over OpenGL in the game's most CPU limited area, and the benchmarks will back up what I am saying.

Anyway, this is my last post addressing you because it's obvious you're just a troll.

Guru said:
Get your tone in order, especially when you are wrong on so many levels.

I own a GTX 1060 6GB and GTX 1080, but I'm not going from my personal numbers which actually confirm what I'm saying, I go by tests, there is literally a new thread right now about Hitman DX11 and DX12 test in actual gameplay, in it you can see AMD's card is on average about 5fps faster, while Nvidia 1060 is about the same average, with significantly lower frames in higher complexity scenes in DX12.

Again, check duderandom88, testinggames, artis, etc... on youtube for gameplay tests and you can see the 1060 6GB losing to the RX 580 8GB in pretty much all DX12 games, but what is more important Nvidia loses performance in most games when going to DX12 over DX11.

OpenGL was a crappy API, it's always performed terrible, usually 30%-50% worse than Microsoft's alternatives like DX11 or DX10, etc... so of course you are going to gain performance going to Vulkan over OpenGL on Nvidia, you will gain performance on an integrated Intel graphics in Vulkan over openGL.

The facts are Nvidia gains 1-4fps in SE4 on AVERAGE, sometimes up to 8-9fps more in DX12, but sometimes that much lower, while AMD gains 15-20fps in DX12 over DX11.

Hitman is a wash in gameplay, about same performance on Nvidia compared to DX11, in all other game I've seen and tested, like Tom Clancy, warhammer, Deus-EX, ROTTR, Battlefield 1, etc... Nvidia cards lose performance going to DX12 over DX11.

Alessio1989 · Jul 7, 2017

ThatBuzzkiller said:
It should be noted that Fermi does not have support for bindless resources and this shows in their OpenGL drivers too as they lack support for the extension. If Apple had decided to write Metal 2 drivers for Fermi it would most likely only support Argument Buffer Tier 1 since the maximum amount of resources that you can bind is 128 textures and 16 samplers which coincidentally matches the limits of D3D12 Resource Binding Tier 1 ....

It's not only about the full lack of indexable resources, also Terascale cannot directly manage CBv in a table like in all other architectures while the possibility of manage SRVs would also be quite ridiculous (8 SRVs if the documentation I have is correct)... Supporting Terascale architecture would have required a totally different API model just for those cards (eg: a full "DX11.0-style" binding just for those GPUs, it wouldn't add any benefit if they supported those GPUs).

Carfax83 · Jul 7, 2017

tamz_msc said:
The WD2 numbers clearly reflect the underutilization of the shader engines in Fiji, this isn't a driver issue, but an inherent limitation of GCN. You need to do more work to extract performance out of the extra 5 CUs per SE on Fiji compared to Hawaii. I bet that the R9 290X/390X would be in the same ballpark, perhaps even slightly ahead of Fiji. The same thing might be happening in Watch Dogs. So this may not be an issue due to threading. The usual underperformance of AMD GCN hardware on Ubisoft titles using AnvilNext is usually attributed to this. Plus as usual the high quantity of tessellation.

Watch Dogs 2 doesn't use the AnvilNext engine. It uses the Disrupt 2.0 engine. Also, GCN suffers from underutilization in many titles, but DX12 and Vulkan has ameliorated that problem big time, mostly due to much lower CPU overhead and more parallel rendering. Take Doom for instance:

RX480 gets a massive 30 FPS and greater boost in this particular area with Vulkan. In a more CPU limited area like the Argent Tower, the gain would be even greater. I know this because on my Titan Xp, I gain well over 150% performance at the Argent Tower because OpenGL only uses a single thread for rendering, compared to Vulkan's parallel dispatch.

In fact, at the Argent tower, my framerate will drop into the 40s using OpenGL because the load on my CPU is so low that it reverts to desktop speed, ie 1200mhz.

This being a CPU-bound test, it is to be expected that the GPU utilization would be low anyway.

Not necessarily. The high framerate would offset the low resolution.

Also Polaris has better normalized geometry performance, tessellation and color compression, among other things - that's why it can outperform Fiji.

I agree, but Fiji's problem is as you say a lack of proper shader utilization, and the main reason for that is because the DX11 driver is inefficient and is unable to properly utilize the CPU to feed the Fury X's massive shader array.

GCN truly shines when you have a game that is 50% geometry and 50% compute, which is why I'm looking forward to Sebbi's upcoming game Claybook, to see what it can do with GCN.

Game looks interesting, but compute is already used heavily in many games these days.

Carfax83 · Jul 7, 2017

TheELF said:
Huge big secret,don't tell anybody

That setting has been around for years. Who'd have thought that it could be responsible for NVidia's stellar performance in DX11 titles. I think that AMD's driver limits the amount of worker threads that a 3D engine can request to increase rendering performance likely because AMD's drivers have much higher overhead and too many threads will eat up the CPU, whereas NVidia's drivers do not have that problem and will use the CPU to the utmost.

Carfax83 · Jul 7, 2017

Guru said:
I own a GTX 1060 6GB and GTX 1080, but I'm not going from my personal numbers which actually confirm what I'm saying, I go by tests, there is literally a new thread right now about Hitman DX11 and DX12 test in actual gameplay, in it you can see AMD's card is on average about 5fps faster, while Nvidia 1060 is about the same average, with significantly lower frames in higher complexity scenes in DX12.

Funny you should mention Hitman, because with the 378.28 drivers, Hitman got a massive boost in DX12 for NVidia. HardOCP and some other websites tested this:

Again, check duderandom88, testinggames, artis, etc... on youtube for gameplay tests and you can see the 1060 6GB losing to the RX 580 8GB in pretty much all DX12 games, but what is more important Nvidia loses performance in most games when going to DX12 over DX11.

That's just one card. The GTX 1060 is more likely to be GPU bound than the more powerful cards in the NVidia lineup, so the performance gain is going to be less with DX12.

The facts are Nvidia gains 1-4fps in SE4 on AVERAGE, sometimes up to 8-9fps more in DX12, but sometimes that much lower, while AMD gains 15-20fps in DX12 over DX11.

Regardless of how much FPS AMD gains under DX12, NVidia is still ahead. As I've been saying, the reason why NVidia doesn't gain as much as AMD in DX12 is because their DX11 driver is so efficient:

Hitman is a wash in gameplay, about same performance on Nvidia compared to DX11, in all other game I've seen and tested, like Tom Clancy, warhammer, Deus-EX, ROTTR, Battlefield 1, etc... Nvidia cards lose performance going to DX12 over DX11.

Hitman gains performance with DX12 using the latest drivers as I've already shown. The Division does as well, with the latest patch 1.6.1. Deus Ex MD and BF1 both have terrible and substandard DX12 implementations and they also lose performance under DX12 for AMD, so it's not just NVidia.

RotTR's DX12 implementation is much better, but you only see it at CPU bound resolutions. At 1440p and up, it will be even with DX11.

Guru · Jul 7, 2017

Carfax83 said:
Funny you should mention Hitman, because with the 378.28 drivers, Hitman got a massive boost in DX12 for NVidia. HardOCP and some other websites tested this:

That's just one card. The GTX 1060 is more likely to be GPU bound than the more powerful cards in the NVidia lineup, so the performance gain is going to be less with DX12.

Regardless of how much FPS AMD gains under DX12, NVidia is still ahead. As I've been saying, the reason why NVidia doesn't gain as much as AMD in DX12 is because their DX11 driver is so efficient:

Hitman gains performance with DX12 using the latest drivers as I've already shown. The Division does as well, with the latest patch 1.6.1. Deus Ex MD and BF1 both have terrible and substandard DX12 implementations and they also lose performance under DX12 for AMD, so it's not just NVidia.

RotTR's DX12 implementation is much better, but you only see it at CPU bound resolutions. At 1440p and up, it will be even with DX11.

Apart from the German site you posted, they all test internal benches when they are available, so hardocp and guru3d test internal benches.

I've done the tests myself, there is no difference between the different versions if the division in terms of Nvidia DX12 performance, in game the game does lose performance on average.

It used to be much worse, now its become better, but it still does lose performance in actual gameplay.

As I mentioned the ONLY two games that Nvidia has better or equal performance to DX11 is SE4 and Hitman which you also posted as proof that it has good DX12 performance, when all other games it does lose performance.

I've yet to see any game where AMD loses performance, this was initially the case in Deus-Ex mankind, but I think it was the third patch or maybe even second that fixed all that.

AMD now gains significantly more performance in every single DX12 game over DX11. BF1 RX 400/500 series easily gain 10fps more and beat the Nvidia equivalent cards, even the RX 570 gives the GTX 1060 6GB a run for its money in BF1 at DX12.

Sniper Elite 4 the RX 570 is equal to the GTX 1060 6GB, this is a $170 card vs a $250 card at official prices of course, not deals or the mining craze now, so a difference of $80 and the RX 570 performs as good as the GTX 1060 6GB in SE4, BF1, Doom, etc...

Face2Face · Jul 7, 2017

SPBHM said:
those are just 3 games games from before the legacy days (2015 and lower)
and even that, on fallout 4 if you enable godays it's broken, but the video is running all disabled/low.
the number of bugs is a lot worse now.

Playing around with some DX12 games, but performance is much worse. Here's a video I did of ROTTR DX12 vs. DX11.

bononos · Jul 8, 2017

Guru said:
Apart from the German site you posted, they all test internal benches when they are available, so hardocp and guru3d test internal benches.

I've done the tests myself, there is no difference between the different versions if the division in terms of Nvidia DX12 performance, in game the game does lose performance on average.
.........
AMD now gains significantly more performance in every single DX12 game over DX11. BF1 RX 400/500 series easily gain 10fps more and beat the Nvidia equivalent cards, even the RX 570 gives the GTX 1060 6GB a run for its money in BF1 at DX12.
.........

What are internal benches? Are they hardocp/guru3d customized handmade suite of game benchmarks or it is using the built in game benchmarks like flybys?

tamz_msc · Jul 8, 2017

Carfax83 said:
Watch Dogs 2 doesn't use the AnvilNext engine. It uses the Disrupt 2.0 engine. Also, GCN suffers from underutilization in many titles, but DX12 and Vulkan has ameliorated that problem big time, mostly due to much lower CPU overhead and more parallel rendering. Take Doom for instance:

Why does every example of your's revolve around Ubisoft titles?

RX480 gets a massive 30 FPS and greater boost in this particular area with Vulkan. In a more CPU limited area like the Argent Tower, the gain would be even greater. I know this because on my Titan Xp, I gain well over 150% performance at the Argent Tower because OpenGL only uses a single thread for rendering, compared to Vulkan's parallel dispatch.

In fact, at the Argent tower, my framerate will drop into the 40s using OpenGL because the load on my CPU is so low that it reverts to desktop speed, ie 1200mhz.

OpenGL optimizations in AMD's driver have been pretty much non-existent. The reason why AMD gains so much in Doom Vulkan is the opposite of why according to you NVIDIA's driver gains so little in DX12 compared to DX11.

Not necessarily. The high framerate would offset the low resolution.

Where's the GPU utilization graphs?

I agree, but Fiji's problem is as you say a lack of proper shader utilization, and the main reason for that is because the DX11 driver is inefficient and is unable to properly utilize the CPU to feed the Fury X's massive shader array.

No, it was because it was a fundamentally unbalanced design. Hawaii was the most balanced high-end GCN chip.

Compare 390/390X and Fury/FuryX:

390X 9% faster than 390, while having 10% more SPs. Fury X only 5% faster than Fury, in spite of having 14% more SPs.

Here's what really low utilization looks like, ignore NVIDIA numbers.

390X is still 9% faster than 390, Fury X is only 2% ahead of Fury.

Different games, completely different results with same drivers.

ThatBuzzkiller · Jul 8, 2017

Alessio1989 said:
It's not only about the full lack of indexable resources, also Terascale cannot directly manage CBv in a table like in all other architectures while the possibility of manage SRVs would also be quite ridiculous (8 SRVs if the documentation I have is correct)... Supporting Terascale architecture would have required a totally different API model just for those cards (eg: a full "DX11.0-style" binding just for those GPUs, it wouldn't add any benefit if they supported those GPUs).

No, D3D12 binding model is backwards compatible with FL 11_0 hardware. Also Terascale doesn't need to manage the CBVs in a descriptor table, a descriptor table is technically just an array of descriptors which means that the CPU has to handle two indirections for resource binding rather one with a regular descriptor ...

D3D12 binding model is more flexible than you give it credit for and I'm not sure where you got that Terascale can only support 8 SRVs per shader stage if I think that's what you're implying when D3D11 spec mandates that all FL 11_0 capable devices must support a maximum of 128 slots for SRVs as defined with the limit 'D3D11_COMMONSHADER_INPUT_RESOURCE_REGISTER_COUNT (128)' ...

D3D11_COMMONSHADER_CONSTANT_BUFFER_HW_SLOT_COUNT (15) (This one corresponds with the 14 CBV limit since one is set aside for an immediate constant buffer for shaders.)

D3D11_COMMONSHADER_SAMPLER_SLOT_COUNT (16) (This one you can easily guess what this corresponds too.)

D3D11_PS_CS_UAV_REGISTER_COUNT (8) (This one is easy enough to figure out that all FL 11_0 hardware at least has a minimum of 8 slots for UAVs in the pixel shader and compute shader. Later 64 UAV slots for all shader stages with FL 11_1 hardware.)

I can understand AMD not wanting to manage two driver stacks but D3D12 Resource Binding Tier 1 limits EXACTLY matches D3D11 FL 11_0 resource limits so there's nothing stopping AMD from supporting Terascale on DX12. AMD could potentially make DX12 drivers for Evergreen microarchitecture, their very first DX11 capable hardware!

TheELF · Jul 8, 2017

Carfax83 said:
I think that AMD's driver limits the amount of worker threads that a 3D engine can request to increase rendering performance likely because AMD's drivers have much higher overhead and too many threads will eat up the CPU, whereas NVidia's drivers do not have that problem and will use the CPU to the utmost.

I think that's a bad term if AMD's driver is single threaded,nvidia in this example uses ~50% of the cpu,if it where single threaded it would be constricted to 25% max which would be less overhead not more.
But that's the whole idea behind Dx12 isn't it use more cores to do more work,since not everybody has cores at 7Ghz

(which my CPU would have to be able to hit to get this 50% workload done in just one thread)

Guru of course nvidia will loose performance in dx12 since no game yet has multicore rendering in dx12 while nvidia does have it in dx11...for a while now.

Oh and also...

In dx11 the driver consumes ~14% of the CPU time leaving the highest division thread at~12%
in dx12 the driver is gone leaving the highest division thread running at ~17%
= profit
Dx12 is made for the little man (consoles) never forget that,don't look at test done on super high core count high Ghz CPUs...

Alessio1989 · Jul 8, 2017

ThatBuzzkiller said:
No, D3D12 binding model is backwards compatible with FL 11_0 hardware. Also Terascale doesn't need to manage the CBVs in a descriptor table, a descriptor table is technically just an array of descriptors which means that the CPU has to handle two indirections for resource binding rather one with a regular descriptor ...

D3D12 binding model is more flexible than you give it credit for and I'm not sure where you got that Terascale can only support 8 SRVs per shader stage if I think that's what you're implying when D3D11 spec mandates that all FL 11_0 capable devices must support a maximum of 128 slots for SRVs as defined with the limit 'D3D11_COMMONSHADER_INPUT_RESOURCE_REGISTER_COUNT (128)' ...

D3D11_COMMONSHADER_CONSTANT_BUFFER_HW_SLOT_COUNT (15) (This one corresponds with the 14 CBV limit since one is set aside for an immediate constant buffer for shaders.)

D3D11_COMMONSHADER_SAMPLER_SLOT_COUNT (16) (This one you can easily guess what this corresponds too.)

D3D11_PS_CS_UAV_REGISTER_COUNT (8) (This one is easy enough to figure out that all FL 11_0 hardware at least has a minimum of 8 slots for UAVs in the pixel shader and compute shader. Later 64 UAV slots for all shader stages with FL 11_1 hardware.)

I can understand AMD not wanting to manage two driver stacks but D3D12 Resource Binding Tier 1 limits EXACTLY matches D3D11 FL 11_0 resource limits so there's nothing stopping AMD from supporting Terascale on DX12. AMD could potentially make DX12 drivers for Evergreen microarchitecture, their very first DX11 capable hardware!

Are you sure that Terascale would be able to manage CBV descriptor table without any additional change of the runtime? (ie more CPU overhead which is what D3D12 want to avoid)
I know there were some runtime changes during the preview period, like the increase Tier 1 descriptor heap size to 2^20 for Haswell....
For supporting Terascale we could have a "Tier 0" (the name is not casual) with 8 SRV/UAV table size, and no real CBV Tables or just see those additional limitation on current "tier1"...
Yes Terascale should not have issues on descriptor heap sizes or or heap space sharing.... But except for the 8 UAVs of Fermi (and those 2-3 things mostly related to Direct2D which Fermi and Kepler do not support), RB Tier1 is more related to D3D 11.1 instead of the the very first D3D 11 .0 model. I also confess to not be aware if Terascale need special care of RTVs or not (or any other type of "view" which where present in the D3D11 model and simply replaced with the Id3d12Resource interface).....

Anyway, I prefer the final solution adopted by the graphics advisor board instead of see and increased resource binding tier set (or a more limited Tier 1), especially considering the time required for the average game to correctly rewrote just the back-end for the Direct3D 12 model...

As for Fermi: is too late, the NSIGHT support has been removed and I am not sure how many developer will spend time and money to support a "deprecated" architecture (by development point of view) .
Of course is nice to see NVIDIA finally putting some love in the DX12 drivers..

SPBHM · Jul 9, 2017

Face2Face said:
Playing around with some DX12 games, but performance is much worse. Here's a video I did of ROTTR DX12 vs. DX11.

yes... it's not worth using DX12 for that (still I wonder if you use a seriously slow CPU with 4+ cores if it still helps?)

but I think the interesting bit is having compatibility with DX12 titles like here the GTX 465 running 2 titles that I think require DX12 (Halo 5 Forge and Forza Apex)
https://youtu.be/RwzWthN-q7E?t=2m15s

ThatBuzzkiller · Jul 9, 2017

Alessio1989 said:
Are you sure that Terascale would be able to manage CBV descriptor table without any additional change of the runtime? (ie more CPU overhead which is what D3D12 want to avoid)
I know there were some runtime changes during the preview period, like the increase Tier 1 descriptor heap size to 2^20 for Haswell....
For supporting Terascale we could have a "Tier 0" (the name is not casual) with 8 SRV/UAV table size, and no real CBV Tables or just see those additional limitation on current "tier1"...
Yes Terascale should not have issues on descriptor heap sizes or or heap space sharing.... But except for the 8 UAVs of Fermi (and those 2-3 things mostly related to Direct2D which Fermi and Kepler do not support), RB Tier1 is more related to D3D 11.1 instead of the the very first D3D 11 .0 model. I also confess to not be aware if Terascale need special care of RTVs or not (or any other type of "view" which where present in the D3D11 model and simply replaced with the Id3d12Resource interface).....

Anyway, I prefer the final solution adopted by the graphics advisor board instead of see and increased resource binding tier set (or a more limited Tier 1), especially considering the time required for the average game to correctly rewrote just the back-end for the Direct3D 12 model...

As for Fermi: is too late, the NSIGHT support has been removed and I am not sure how many developer will spend time and money to support a "deprecated" architecture (by development point of view) .
Of course is nice to see NVIDIA finally putting some love in the DX12 drivers..

Actually Terascale doesn't care whether or not it would be able to manage CBV descriptor tables. The purpose behind descriptors and the rest of the junk is to expose the wide variety of binding models on different hardware ...

Terascale doesn't do anything interesting with descriptor tables but descriptor heaps on the other hand are intended for Resource Binding Tier 1/2 capable devices such as FL 11_0 hardware like Terascale according to the design in the spec based on the fact that 'All heaps are visible to the CPU.' and that 'Descriptor heaps can only be edited immediately by the CPU, there is no option to edit a descriptor heap by the GPU.' so I've no doubt in my mind that the D3D12 resource binding doesn't work any differently from D3D11 resource binding under FL 11_0 hardware ... (The consequence of exposing descriptor heaps is that this limits fully bindless hardware like GCN from being able to create resource descriptors from the scalar units.)

There's no need to change the D3D12 spec to accommodate Terascale since that is already petitioned by Nvidia. Introducing descriptors doesn't change the way resource binding will work on older hardware and all of that mess with descriptors or other nonsense is handled on CPUs for FL 11_0 hardware ...

You don't need to care about render target views, index buffers, depth/stencil views or any of the other things either since these resources are already bound to the command list and not placed in the descriptor heaps ...

DmitryKo · Jul 9, 2017

nvgpu said:
It needs to be compiled as 64bit application.

Bacon1 said:
it's just querying the card features.

Code:

printf("%s : %u %s\n", "Total video memory", AdapterDesc.DedicatedVideoMemory + AdapterDesc.DedicatedSystemMemory + AdapterDesc.SharedSystemMemory, " bytes");

Yes, DXGI_ADAPTER_DESC2 uses a built-in integral type size_t to report available memory, and not some 64-bit integer type from the Windows SDK - so it's limited to 32-bit on x86 builds. I've recompiled the tool for the x64 platform and changed the format string to properly specify argument size in both 32-bit and 64-bit builds.

nvgpu said:
Feature Checker was updated to July 2017 version but can't download it though because B3D forums don't allow it unless you have account with certain amount of posts or whatever.

I did not realize new users are unable to download file attachments in the Beyond3D forum, sorry. It started there as a simple aid for programmers/engineers and testers/reviewers visiting that specific forum thread, so it's just convenient to keep posting updates there, but you can always post a question in the thread or start a private conversation with me!

Face2Face · Jul 10, 2017

SPBHM said:
yes... it's not worth using DX12 for that (still I wonder if you use a seriously slow CPU with 4+ cores if it still helps?)

but I think the interesting bit is having compatibility with DX12 titles like here the GTX 465 running 2 titles that I think require DX12 (Halo 5 Forge and Forza Apex)
https://youtu.be/RwzWthN-q7E?t=2m15s

I still haven't been able to get Forza 6 to work. It still says video card not supported. I did run the new Deus EX, and performance is worse, and frame pacing is a mess in DX12. I'm also seeing some graphical glitches and GPU usage below 99%

SPBHM · Jul 11, 2017

Face2Face said:
I still haven't been able to get Forza 6 to work. It still says video card not supported. I did run the new Deus EX, and performance is worse, and frame pacing is a mess in DX12. I'm also seeing some graphical glitches and GPU usage below 99%

yes the DX12 looks bad with the lower GPU usage and framerate, it's difficult to spot glitches due to video compression artifacts.
I wonder if future DX12 titles or patches will bother looking at Fermi, and how much Nvidia intends to support/improve the DX12 drivers.

interesting about Forza not working, you are not the first I see reporting this, but that other guy with the 465 got it working somehow...

bononos · Jul 11, 2017

SPBHM said:
yes the DX12 looks bad with the lower GPU usage and framerate, it's difficult to spot glitches due to video compression artifacts.
I wonder if future DX12 titles or patches will bother looking at Fermi, and how much Nvidia intends to support/improve the DX12 drivers.
...

It would be doubtful since Nvidia didn't issue an official announcement. I wonder why Nvidia bothered to enable dx12 on Fermi an old EOL product in the first place. Maybe they thought that it would be a net positive even if the performance was below dx11.

TheELF · Jul 11, 2017

bononos said:
It would be doubtful since Nvidia didn't issue an official announcement. I wonder why Nvidia bothered to enable dx12 on Fermi an old EOL product in the first place. Maybe they thought that it would be a net positive even if the performance was below dx11.

Performance is below dx11 only because you are running it on a i7 where the dx11 driver thread doesn't interfere with running the game/bench try it again on 2 or 4 cores and you will see the benefit.

Face2Face · Jul 11, 2017

TheELF said:
Performance is below dx11 only because you are running it on a i7 where the dx11 driver thread doesn't interfere with running the game/bench try it again on 2 or 4 cores and you will see the benefit.

I thought about disabling some cores and hyper threading, but It just so happens that I have a Celeron G1610 to test this out. I also have my GTX 470 installed, and have a second one to see how SLI does if I can get it working.

Spjut · Jul 11, 2017

Face2Face said:
I thought about disabling some cores and hyper threading, but It just so happens that I have a Celeron G1610 to test this out. I also have my GTX 470 installed, and have a second one to see how SLI does if I can get it working.

Could you try Dolphin as well?

TheELF · Jul 12, 2017

nvoglxxxxx=nvidia open GL
loose the driver thread = gain performance on the rest of the threads

Samwell · Jul 12, 2017

Face2Face said:
I still haven't been able to get Forza 6 to work. It still says video card not supported. I did run the new Deus EX, and performance is worse, and frame pacing is a mess in DX12. I'm also seeing some graphical glitches and GPU usage below 99%

Forza 6 can't run on Fermi. It needs resource binding tier 2. At the moment it's the DX12 game with the highest feature requirements. https://support.xbox.com/en-GB/games/troubleshooting/error-code-ap203

Guru · Jul 12, 2017

TheELF said:
Performance is below dx11 only because you are running it on a i7 where the dx11 driver thread doesn't interfere with running the game/bench try it again on 2 or 4 cores and you will see the benefit.

No, its below because it was not designed to support DX12. Sure you can patch in some things through software, but its obvious that the performance is going to be dismal, since the architecture is not optimized for DX12 API, those cards are actually DX10 cards optimized.

The support is basically just to enable people who haven't upgraded for 5+ years to be able to start DX12 and maybe be able to sort of play through them at all low setting with 20fps.

Spjut · Jul 12, 2017

TheELF said:
nvoglxxxxx=nvidia open GL
loose the driver thread = gain performance on the rest of the threads

So a decent improvement for Dolphin at least. It's a pity their people decided to drop DX12 support. I'd guess it's possible Fermi could have gotten a few additional improvements if someone spent the time writing a special path for Fermi.

Pascal now supports DX12 resource binding tier 3 with latest drivers (384.76)

Elite Member

Member

Diamond Member

Diamond Member

Diamond Member

Senior member

Diamond Member

Diamond Member

Diamond Member

Golden Member

Diamond Member

Member

Diamond Member

Golden Member

Junior Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Senior member

Diamond Member

Senior member

Senior member

Senior member