My point was that as long as AMD still lags significantly behind NVidia in performance per watt, they won't be able to catch up and reach parity. Take the recently released Titan Xp, for example. Its advertised TFLOPS rating is 12, but in reality it's closer to, or slightly above, 13 due to boost. And this is in a 250 W TDP envelope.
That's fine, but it's really just throwaway words. What you need to look at is what causes high energy usage.
1. Data movement
2. Pushing the voltage/clocks/circuit curve
3. Executing things you don't need to
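To put a rough number on point 2: the usual first-order approximation for CMOS dynamic power is P ~ C * V^2 * f, so power grows linearly with clock but quadratically with voltage. A minimal sketch under that approximation; the 10% figures are made-up inputs for illustration, not measurements of any real GPU, and leakage is ignored:

```python
def relative_dynamic_power(freq_scale: float, volt_scale: float) -> float:
    """Relative dynamic power under the first-order model P ~ C * V^2 * f.

    Assumes switched capacitance and activity stay constant and
    ignores leakage, so this is only a ballpark estimate.
    """
    return freq_scale * volt_scale ** 2


# Hypothetical example: a 10% higher clock that needs 10% more voltage.
power = relative_dynamic_power(1.10, 1.10)
perf_per_watt = 1.10 / power  # performance assumed to scale with clock
print(f"power x{power:.3f}, perf/W x{perf_per_watt:.3f}")
```

Under these assumptions, 10% more clock costs about 33% more power, so perf/W drops to roughly 0.83x. That is why running further up the voltage/clock curve hurts efficiency so quickly.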
To hit 12.5 TFLOPS at 300 W, AMD will likely require an AiO cooler à la Fury X, which just emphasizes what I'm saying.
Except for the part where it's stated that they are passively cooled DC parts, which means they rely on the shared cooling of the chassis fans.
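For reference, the paper perf-per-watt figures being argued about work out as below. This is only a sketch using the peak TFLOPS and TDP numbers quoted in this thread; the function name and labels are illustrative, and TDP is a rating rather than measured power draw:

```python
def gflops_per_watt(tflops: float, watts: float) -> float:
    """Peak FP32 throughput per watt of rated board power, in GFLOPS/W."""
    return tflops * 1000.0 / watts


# Figures as quoted in this thread (paper peak numbers, not measured results).
cards = {
    "Titan Xp (advertised)": (12.0, 250.0),
    "Titan Xp (with boost)": (13.0, 250.0),
    "AMD part (as quoted)": (12.5, 300.0),
}

for name, (tflops, watts) in cards.items():
    print(f"{name}: {gflops_per_watt(tflops, watts):.1f} GFLOPS/W")
```

That gives 48.0 and 52.0 GFLOPS/W for the Titan Xp against about 41.7 GFLOPS/W for the quoted AMD figure. Peak numbers like these say nothing about how much of that throughput is actually used.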
I'll give credit to AMD though, they have certainly come a long way since Fiji when it comes to performance per watt, and with Vega, I hope they take it even further. But they will need much more to be on somewhat equal footing with NVidia.
So
1. If they actually increased GCN wavefront occupancy (higher IPC), that's more performance for no power cost (see the GCN developer docs to understand how GCN executes data).
2. If they actually have refactored the pipelines to increase clocks, that will be more performance for no power cost, or the same performance for less power.
3. If they actually have the ROP cache in L2, that is more performance and less power at the same clock.
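As a rough sketch of what point 1 is about, here is how register usage limits the number of wavefronts a GCN SIMD can keep resident. Reading "occupancy" as resident wavefronts per SIMD is an assumption here, and the constants (10 waves per SIMD, 256 VGPRs per lane, allocation in blocks of 4) are recalled from the public GCN documentation rather than taken from this thread. SGPR and LDS limits are ignored:

```python
MAX_WAVES_PER_SIMD = 10   # assumed GCN cap on resident wavefronts per SIMD
VGPRS_PER_LANE = 256      # assumed vector registers per lane on each SIMD
VGPR_GRANULARITY = 4      # assumed allocation granularity (registers)


def waves_per_simd(vgprs_per_thread: int) -> int:
    """Wavefronts that fit on one SIMD, limited by VGPR usage only."""
    if not 1 <= vgprs_per_thread <= VGPRS_PER_LANE:
        raise ValueError("vgprs_per_thread must be between 1 and 256")
    # Round the request up to the allocation granularity.
    allocated = -(-vgprs_per_thread // VGPR_GRANULARITY) * VGPR_GRANULARITY
    return min(MAX_WAVES_PER_SIMD, VGPRS_PER_LANE // allocated)


for vgprs in (24, 32, 64, 128):
    print(f"{vgprs} VGPRs -> {waves_per_simd(vgprs)} waves per SIMD")
```

With these assumptions a shader using 24 VGPRs can keep 10 waves in flight, while one using 128 only gets 2, which leaves far less work available to hide memory latency. Anything that keeps more waves busy on the same hardware is throughput gained without raising clocks or voltage.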
You then have the other stuff, where we have no idea what/where it will be used: the TBR, primitive shaders, and the cache controller (if it actually helps prefetching into GPU L2 etc., that could save a lot of power).
Look at what changed between Kepler and Maxwell and what didn't change on AMD's side: NV refactored their "CUDA cores" for higher clocks and refactored their ROPs for much higher efficiency. That's the point where they pulled away from AMD. NV always had a geometry perf advantage, which AMD finally fixed in Polaris.
I think a lot of the issue with Polaris was that people expected to see the things described in some of the new GCN-based patents in it, and none of them were there. But given the much more explicit slideware for Vega, it looks like they will be in Vega. The question is which ideas, and to what degree (in Zen the ideas ended up in a more advanced form than what was described in the patents).
The proof will be in the pudding and we will have to wait and see. But it's just like the last 2 years in the Zen threads, when every troll was running with Sandy Bridge IPC and a max 3.0 GHz clock: when you looked at where the architectural issues were, at the published information (compiler notes, slideware, etc.) and at the available patents, there was no logic to that position. I see much of the same stuff going on here as well.
We will have to wait and see what GV102, GV104, GV106, GV107, GV108, etc. bring, which could change this calculation, but I think at this point it's obvious that the Gx100s are now a significant departure in terms of ALU design from the rest of the GeForce stack.