Instead of asynchronous shaders, Pascal uses a technique called pre-emption. Effectively, this enables the GPU to prioritise one set of more complex tasks over another (for example, favouring compute tasks like physics over graphics). The trouble is that long-running compute jobs can end up monopolising the GPU. This was a particular issue for Maxwell, where the GPU could only pre-empt tasks at the end of each command. That meant extra time spent waiting for the command to end, which increased latency.
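To put a rough number on that waiting, here is a toy model rather than anything resembling Nvidia's actual scheduler. It assumes pre-emption points fall at fixed intervals, and that a high-priority task arriving mid-interval has to sit idle until the next one. The function name and all of the timings are hypothetical, chosen purely for illustration.

```python
def wait_until_preemption(request_us: int, interval_us: int) -> int:
    """Microseconds a high-priority task waits for the next pre-emption point.

    Toy assumption: pre-emption points occur every `interval_us`
    microseconds, starting at t = 0.
    """
    elapsed_in_interval = request_us % interval_us
    if elapsed_in_interval == 0:
        return 0  # the request landed exactly on a pre-emption point
    return interval_us - elapsed_in_interval


# Hypothetical figures, not measurements from any real GPU.
COMMAND_LENGTH_US = 8000  # one long-running command (coarse pre-emption)
FINE_GRAIN_US = 100       # a much finer pre-emption granularity
REQUEST_AT_US = 1530      # when the high-priority task shows up

coarse_wait = wait_until_preemption(REQUEST_AT_US, COMMAND_LENGTH_US)
fine_wait = wait_until_preemption(REQUEST_AT_US, FINE_GRAIN_US)

print(f"command-boundary pre-emption: {coarse_wait} us wait")
print(f"finer-grained pre-emption:    {fine_wait} us wait")
```

With these made-up figures the task waits 6,470 microseconds when it can only cut in at the end of the command, against 70 microseconds at the finer granularity. The absolute numbers mean nothing; the point is that the worst-case wait scales with the distance between pre-emption points.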
Pascal implements pixel-level pre-emption, allowing the GPU to pause smaller tasks at any point and save their state to memory while bigger tasks complete. It's an interesting solution, but it still doesn't match the performance of hardware-based asynchronous shaders. Fortunately for Nvidia, even with the increasing number of DX12 games being released, few of them take full advantage of asynchronous shaders. Fewer still have shown any real improvement in performance over DX11.
That will change over time (spoiler: it does a little here too), but supporting the low-level hardware features of DX12 requires more work on the developer side. Right now, most simply aren't bothering. That's not to mention that, despite its lack of async shaders, Nvidia has one very big advantage over the competition: clock speed.