It's about optimization and you know it. There is not a single reason DX12 should be slower than DX11 unless it's due to a lack of optimization. And it's no surprise: DX12 is new, DX11 is optimized like mad. It's no different from when we got DX10; DX9 was faster and better until developers learned to use it. However, this time it requires more than learning, it requires extra time and money as well. Success? Very questionable. DX12 is one of those cases where you wish there was a GPU monopoly and not 3 IHVs.
Then you can try to excuse it with whatever you wish. Async has been the only way to increase performance, but at the expense of power consumption, by as much as +100W. Not to mention one epic disaster and flop of a game after the other due to development failure.
Nothing to do with optimizations.
CPU overhead, mate. NVIDIA have higher CPU overhead. It has nothing to do with optimizations. NVIDIA's current architectures suffer higher CPU overhead than AMD's under the DX12 API, and higher than they did under DX11. AMD suffered this under the DX11 API.
Basically, NVIDIA, when coupled with a fast CPU, are able to achieve 100% of their GPU's performance under DX11. NVIDIA's driver has hidden driver threads when operating under DX11. These driver threads can be used to schedule and translate work for the GPU. Basically they're hidden submission threads. This allows NVIDIA GPUs to achieve their full potential with a strong CPU under DX11 because the GPU suffers less idle time. These threads offset the extra CPU cycles consumed by NVIDIA's static scheduler.
In DX12, there are no hidden driver threads. Each CPU core (or thread, for SMT cases) can record, translate and submit work to the GPU. That's a lot of parallel command lists. For NVIDIA, their static scheduler has to schedule the work submitted by each thread. So NVIDIA's static scheduler is taking up more CPU time under DX12 than it does under DX11.
So on a very fast CPU, that extra time translates into less performance relative to DX11.
On slower CPUs, the extra parallelism of DX12 buys performance increases because of the alleviation of a CPU bottleneck relative to DX11.
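To make the "each core records its own command list" idea concrete, here is a rough sketch in Python. It is not the Direct3D 12 API: `record_command_list` and `build_frame` are hypothetical names made up for illustration, and the "commands" are just strings. It only shows the shape of the pattern, where every worker records into its own list with no shared state and the lists are then handed over in a fixed order.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_list(worker_id, draws):
    # Hypothetical stand-in for recording: each worker fills its own list,
    # so no locking is needed while recording.
    return [f"worker{worker_id}:draw{d}" for d in draws]


def build_frame(num_workers, draws_per_worker):
    # One recording job per worker, all running at once.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [
            pool.submit(record_command_list, w, range(draws_per_worker))
            for w in range(num_workers)
        ]
        # Collect in submission order so the final queue order is deterministic.
        return [f.result() for f in futures]


command_lists = build_frame(num_workers=4, draws_per_worker=3)

# "Submission": flatten the per-worker lists into one ordered queue.
queue = [cmd for command_list in command_lists for cmd in command_list]
```

Python threads won't actually run this in parallel because of the interpreter lock, so treat it as a picture of the structure only, not of the performance.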
The end result is a net loss of performance, for NVIDIA hardware, going from DX11 to DX12 when running a strong CPU. This is across the board. Any game. Provided that the CPU was strong enough to feed NVIDIA's GPUs with work under DX11.
AMD don't have this issue because:
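If it helps, the argument can be put into a toy frame-time model. Everything here is an assumption made up for illustration: the function names, the formula and every number. None of it is measured data. It assumes DX11 frame time is whichever is longer of GPU time and single-threaded CPU recording time, and that DX12 splits recording across cores but adds a fixed per-frame scheduling cost that is not hidden behind GPU work.

```python
def dx11_frame_ms(gpu_ms, record_ms, cpu_speed):
    # Assumption: hidden driver threads keep the GPU fed, so the frame is
    # limited by whichever is slower, the GPU or the CPU recording work.
    return max(gpu_ms, record_ms / cpu_speed)


def dx12_frame_ms(gpu_ms, record_ms, sched_ms, cores, cpu_speed):
    # Assumption: recording is split evenly across cores, but the extra
    # scheduling cost is paid on top of the frame instead of being hidden.
    parallel_record_ms = record_ms / (cores * cpu_speed)
    return max(gpu_ms, parallel_record_ms) + sched_ms / cpu_speed


# Hypothetical workload: 10 ms of GPU work, 18 ms of recording work and
# 2 ms of scheduling cost at cpu_speed 1.0, with 4 cores available.
GPU_MS, RECORD_MS, SCHED_MS, CORES = 10.0, 18.0, 2.0, 4

fast_dx11 = dx11_frame_ms(GPU_MS, RECORD_MS, cpu_speed=2.0)
fast_dx12 = dx12_frame_ms(GPU_MS, RECORD_MS, SCHED_MS, CORES, cpu_speed=2.0)
slow_dx11 = dx11_frame_ms(GPU_MS, RECORD_MS, cpu_speed=0.5)
slow_dx12 = dx12_frame_ms(GPU_MS, RECORD_MS, SCHED_MS, CORES, cpu_speed=0.5)

print(f"fast CPU: DX11 {fast_dx11} ms, DX12 {fast_dx12} ms")
print(f"slow CPU: DX11 {slow_dx11} ms, DX12 {slow_dx12} ms")
```

With these made-up numbers the fast CPU loses a little going to DX12 and the slow CPU gains a lot, which is the pattern described above. The model only restates the argument, it doesn't prove it.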
1. Their DX11 driver wasn't as refined as NVIDIA's in terms of multi threading support.
2. AMD's hardware scheduling handles the scheduling tasks on the GPU, thus not taking away CPU cycles.
We've only recently seen AMD make an effort with its DX11 drivers in terms of multi-threading support. That's why AMD has great performance in recent DX11 titles.