I should have worded it differently: "mythical benefits," since there is no major performance difference, e.g. the 770 and 280x are neck and neck. If it were as wonderful as some pretend, there would be a difference. To me it's merely a marketing bullet point until the cards actually start leading.
The GTX 770 and the 280x aren't what I'd call neck and neck. It's more appropriate to say they trade blows, with NVidia leading in some games and AMD in others.
One thing I do believe, though, is that NVidia has an advantage the more CPU-bound a game is (and the more threaded the engine is), and I think this is because of how well its drivers make use of multicore processors. As you ratchet up the resolution and AA, though, things start to swing back in AMD's favor.
Take BF4, for instance. Until Mantle arrived, NVidia had completely erased AMD's early performance lead in the DX11.1 path (thanks to driver updates and patches) and was actually outperforming AMD, particularly in multiplayer.
The Frostbite 3 engine can use as many as 8 threads, just like CryEngine 3. Those are the only two engines I'm aware of that can use that many threads, and NVidia has the edge in games built on both. Look at this graph:
NVidia has superior scaling at lower, less GPU-bound resolutions than AMD. From the GTX 770, to the GTX 780, to the GTX 780 Ti, there is a clear performance increase.
With AMD, there's barely any increase at all going from a 290 to a 290x. That's a CPU limitation.
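To make that reasoning concrete, here's a rough Python sketch of how I'd check a lineup for a CPU limit: work out the FPS gain at each step up in GPU tier, and treat near-zero gains as a sign the CPU is the bottleneck. The function names, the 5% threshold, and the FPS figures are all my own placeholders for illustration, not numbers from any benchmark.

```python
def tier_gains(fps_by_tier):
    """Fractional FPS gain between each pair of adjacent GPU tiers."""
    return [(hi - lo) / lo for lo, hi in zip(fps_by_tier, fps_by_tier[1:])]


def looks_cpu_bound(fps_by_tier, threshold=0.05):
    """True if every step up in GPU tier gains less than `threshold`.

    The 5% default is an arbitrary cutoff chosen for this sketch.
    """
    gains = tier_gains(fps_by_tier)
    return bool(gains) and all(g < threshold for g in gains)


# Placeholder average-FPS figures, ordered slowest card to fastest.
# These are made up to show the two patterns, not measured results.
scaling_lineup = [60.0, 72.0, 84.0]  # each faster card adds a clear gain
flat_lineup = [80.0, 82.0]           # faster card barely helps

print(tier_gains(scaling_lineup))
print(looks_cpu_bound(scaling_lineup))
print(looks_cpu_bound(flat_lineup))
```

A flat lineup only suggests a CPU limit. You'd want to confirm it by rerunning at a higher resolution and seeing whether the gap between the cards opens up.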
Mantle is helping AMD level the playing field in this regard, but it's a shame they had to resort to such a drastic tactic as creating a whole new API, rather than just using more aggressive multicore optimizations in their drivers like NVidia has done.
Windows 8/8.1 helps NVidia even more, as the OS is fine-tuned for CPU performance...