Mahigan
Senior member
- Aug 22, 2015
There's also the game code, which occupies slightly more CPU time under DX12 than under DX11. The DX Runtime and DX Driver are more evenly distributed across the available cores, which reduces the load on the primary CPU thread.
That's when comparing AMD GCN's DX12 vs DX11.
For Maxwell/Kepler, the DX Driver was already spread amongst the available cores (hidden driver threads). So the increase in CPU time for the game code increases the overall frame time by a tiny bit, depending on how well the engine is threaded.
This shows up as an increase in frame time, which translates into a loss of FPS.
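To make the frame-time-to-FPS relationship concrete, here is a quick sketch with illustrative numbers (the millisecond figures are assumptions, not measurements):

```python
# Hypothetical numbers for illustration: a small per-frame increase in
# CPU time shows up directly as lost FPS when the CPU is the bottleneck.
def fps(frame_time_ms):
    return 1000.0 / frame_time_ms

dx11_frame_ms = 16.7         # ~60 FPS baseline (assumed)
extra_game_code_ms = 0.8     # assumed extra game-code CPU time under DX12
dx12_frame_ms = dx11_frame_ms + extra_game_code_ms

print(round(fps(dx11_frame_ms), 1))  # ~59.9 FPS
print(round(fps(dx12_frame_ms), 1))  # ~57.1 FPS
```

Less than a millisecond of extra frame time already costs a couple of FPS at the 60 FPS mark.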
If you add Asynchronous Compute + Graphics into the mix, you also add fences, synchronization points, between the compute and graphics contexts. Each fence creates a tiny pause, GPU execution idle time, which adds frame time. On AMD GCN, the ability to execute the graphics and compute contexts in parallel makes up for these tiny pauses. On NVIDIA Maxwell/Kepler, the pauses add up to a loss in FPS. That's why you lose FPS with Asynchronous Compute + Graphics enabled on NVIDIA hardware.
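The trade-off above can be sketched as a toy frame-time model. All the millisecond values are made up for illustration; the point is only the structure: serialized execution pays for the fence gap on top of both workloads, while parallel execution hides the smaller workload behind the larger one.

```python
# Toy frame-time model (illustrative numbers, not measurements).
# A fence between the graphics and compute contexts adds a small
# GPU idle gap either way.
graphics_ms, compute_ms, fence_gap_ms = 12.0, 4.0, 0.3

# Serialized at the fence (Maxwell/Kepler-style behavior):
# graphics runs, the GPU idles at the fence, then compute runs.
serialized = graphics_ms + fence_gap_ms + compute_ms      # 16.3 ms

# Parallel (GCN-style): compute overlaps graphics, and the overlap
# more than makes up for the fence gap.
parallel = max(graphics_ms, compute_ms) + fence_gap_ms    # 12.3 ms

print(serialized, parallel)
```

Same fence cost in both cases; only the parallel path recovers it.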
To get an idea of what this all looks like on AMD GCN hardware...
CPU Side: [image]
GPU Side: [image]
For NVIDIA hardware, the DX Driver is already split amongst the cores under DX11. On top of this, the Static Scheduler becomes a factor, as it takes CPU time from the primary thread (something AMD doesn't have to contend with, due to the hardware scheduling in their architecture).
If your CPU was already fast enough to feed your GPU under DX11, then an increase in parallelism won't help, because your GPU is already maxed out (it's the bottleneck). The added latency from ensuring that the order of operations submitted by the various CPU cores is optimized for the GPU (done in software by the Static Scheduler on NVIDIA hardware) would then show up. So you get a tiny drop in performance going from DX11 to DX12.
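A minimal model of that GPU-bound case, with assumed numbers: frame time is roughly bounded by the slower of the CPU and GPU, so cutting CPU time buys nothing while any added scheduling overhead shows up directly.

```python
# Sketch: frame time ~ the slower of CPU and GPU, plus any
# submission/scheduling overhead. Numbers are assumptions.
def frame_ms(cpu_ms, gpu_ms, overhead_ms=0.0):
    return max(cpu_ms, gpu_ms) + overhead_ms

gpu_ms = 16.0  # GPU-bound scenario (assumed)

dx11 = frame_ms(cpu_ms=10.0, gpu_ms=gpu_ms)                   # 16.0 ms
dx12 = frame_ms(cpu_ms=6.0, gpu_ms=gpu_ms, overhead_ms=0.4)   # 16.4 ms

print(dx11, dx12)
```

DX12 cut the CPU time from 10 ms to 6 ms, but since the GPU was the bottleneck, only the 0.4 ms of assumed scheduling overhead is visible: a tiny net loss.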