That's still 10 percent at most in the best-case scenario. Nothing like the 30 percent gains in Doom Vulkan with Shader Intrinsics and Async Compute on GCN.
Doom Vulkan remains the best implementation of these low-level APIs to date.
As far as I know, NVAPI does allow shader intrinsic functions, and they can be used in any OpenGL, DX11 or DX12 game. According to NVIDIA, it's also in the process of being ported over to Vulkan. Doom Vulkan gives such a large performance gain on AMD hardware because id Software was able to use the shader intrinsics to leverage many of the optimizations found in the console versions; on top of that, AMD's OpenGL performance is really subpar compared to NVIDIA's, so the OpenGL baseline is low to begin with.
Also, the biggest gains from low-level APIs typically come from the reduction in CPU overhead and dramatically improved parallel rendering. In one particularly CPU-intensive area in Doom, I saw a massive 150% increase in performance from using Vulkan over OpenGL.
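To put the percentages in this thread on the same footing, here is a rough sketch that converts a frame-rate gain into the equivalent cut in frame time. The function name is my own, and the only inputs are the 10%, 30% and 150% figures quoted above; nothing here is measured data.

```python
def frame_time_reduction(fps_gain_percent):
    """Fraction of frame time saved for a given frame-rate gain.

    A gain of g percent multiplies the frame rate by (1 + g/100),
    so the frame time shrinks to 1 / (1 + g/100) of what it was.
    """
    speedup = 1.0 + fps_gain_percent / 100.0
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    # Figures taken from the posts above, used purely for comparison.
    for gain in (10, 30, 150):
        saved = frame_time_reduction(gain) * 100.0
        print(f"{gain}% more fps = {saved:.1f}% less time per frame")
```

So a 150% increase means 2.5 times the frame rate, which works out to each frame taking 60% less time.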
Moreover, I think that Async compute only alleviates the CPU overhead issue on GCN cards in Gears of War 4.
Asynchronous compute either interleaves compute and graphics commands or executes them in parallel. If anything, CPU overhead should increase when you use asynchronous compute, since the CPU is sending instructions to the GPU at a faster rate.
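The reasoning above can be sketched with a toy model. This is an idealized illustration, not how any driver or GPU actually schedules work: the function names, the `overlap` parameter and all of the numbers are hypothetical. It only shows that if the same number of commands is submitted per frame and frames finish sooner, the CPU has to submit at a higher rate.

```python
def frame_time_ms(graphics_ms, compute_ms, overlap):
    """GPU time for one frame in a toy model.

    overlap is the fraction (0.0 to 1.0) of the compute work that is
    hidden behind graphics work; 0.0 means fully serial execution.
    """
    hidden = min(graphics_ms, compute_ms) * overlap
    return graphics_ms + compute_ms - hidden


def submissions_per_second(commands_per_frame, frame_ms):
    """Rate at which the CPU must issue commands to keep the GPU fed."""
    return commands_per_frame * 1000.0 / frame_ms


if __name__ == "__main__":
    # Hypothetical workload: 12 ms graphics, 4 ms compute, 2000 commands.
    serial = frame_time_ms(12.0, 4.0, overlap=0.0)
    overlapped = frame_time_ms(12.0, 4.0, overlap=1.0)
    print("serial frame time (ms):", serial)
    print("overlapped frame time (ms):", overlapped)
    print("serial submit rate:", submissions_per_second(2000, serial))
    print("overlapped submit rate:", submissions_per_second(2000, overlapped))
```

In this model the per-frame CPU work does not shrink at all; the GPU just gets through each frame faster, so the CPU has less time to prepare the next one.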