The 2304SP RX 580 has 80% more shaders than the GTX 1060. It has more bandwidth. It has fewer ROPs. It has lower frequency. It performs the same as the GTX 1060, which tends to have more OC headroom.
RX 480 has 2304 shaders at a maximum boost clock of 1266 MHz. At 2 FLOPs per shader per clock (one fused multiply-add), that means 5.83 peak TFlops.
GTX 1060 has 1280 shaders at a maximum boost clock of 1708 MHz. By the same math, that comes out to 4.37 peak TFlops.
Assuming both cards are reasonably well balanced in terms of ROPs, bandwidth, etc., this means that Polaris needs about 33% more TFlops to get the same performance as Pascal. Or, looked at the other way, a Pascal card with about 75% of the TFlops of Polaris can provide equal performance.
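For anyone who wants to check the arithmetic, here is a quick sketch in Python. It assumes the usual 2 FLOPs per shader per clock (one fused multiply-add) and uses the shader counts and boost clocks quoted above; the function name is just something I made up for illustration.

```python
def peak_tflops(shaders, boost_mhz, flops_per_clock=2):
    """Theoretical peak FP32 throughput in TFlops.

    Assumes every shader retires one fused multiply-add (2 FLOPs) per clock.
    """
    return shaders * flops_per_clock * boost_mhz * 1e6 / 1e12


rx480 = peak_tflops(2304, 1266)    # Polaris
gtx1060 = peak_tflops(1280, 1708)  # Pascal

print(f"RX 480:   {rx480:.2f} TFlops")    # 5.83
print(f"GTX 1060: {gtx1060:.2f} TFlops")  # 4.37

# How many more TFlops Polaris needs for the same performance,
# taking "RX 480 performs like GTX 1060" as given.
print(f"Polaris / Pascal: {rx480 / gtx1060:.2f}")  # 1.33
print(f"Pascal / Polaris: {gtx1060 / rx480:.2f}")  # 0.75
```

These are paper numbers only; they say nothing about sustained clocks or how well either card is fed by its ROPs and memory.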
We have reason to believe that the top Vega SKU will do 12.5 TFlops or more. That was the figure given for the Radeon Instinct MI25 accelerator, and it would be unusual for a professional card to have higher clocks than the consumer equivalent - usually it's the other way around. So if we assume Vega is just a scaled-up Polaris, no architectural improvements at all, that means it would be roughly equivalent to a Pascal card with ~9.4 TFlops. Yes, that's only about 6% more than GTX 1080 (8.87 TFlops) - but keep in mind that assumes no architectural improvements. And that's not the case - there are some big improvements with Vega, most notably tiled rendering, which was what vaulted Maxwell past GCN in perf/TFlop in the first place.
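The scaling estimate can be written out the same way. This is a rough sketch that only reuses the figures from this post (12.5 TFlops for Vega, 5.83 and 4.37 for RX 480 and GTX 1060, 8.87 for GTX 1080); the variable names are mine, and the "Vega is a scaled-up Polaris" assumption is baked in.

```python
VEGA_TFLOPS = 12.5        # Radeon Instinct MI25 figure
RX480_TFLOPS = 5.83       # Polaris, from the numbers above
GTX1060_TFLOPS = 4.37     # Pascal, from the numbers above
GTX1080_TFLOPS = 8.87

# Fraction of Polaris TFlops a Pascal card needs for equal performance.
pascal_per_polaris = GTX1060_TFLOPS / RX480_TFLOPS

# Pascal-equivalent throughput of a Vega with Polaris-level perf/TFlop.
pascal_equiv = VEGA_TFLOPS * pascal_per_polaris
lead = pascal_equiv / GTX1080_TFLOPS - 1

print(f"Pascal-equivalent: {pascal_equiv:.2f} TFlops")
print(f"Lead over GTX 1080: {lead:.1%}")
```

Any perf/TFlop gain in Vega multiplies straight into `pascal_equiv`, which is why the no-improvement case is a floor rather than a prediction.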
Using the same shader ratio, Vega at 4608 shaders would equal a GTX 1080, and have less OC headroom.
The problem with that argument is that Pascal almost certainly won't have nearly as much of a clock speed edge over Vega as it has over Polaris. AMD specifically stated that Vega was optimized for higher clock speeds. As mentioned above, the Radeon Instinct MI25 is said to do 12.5 TFlops, which, assuming 4096 shaders at 2 FLOPs per shader per clock, equates to a clock speed of about 1526 MHz.
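Working backwards from TFlops to clock speed is the same formula inverted. A small sketch, again assuming 2 FLOPs per shader per clock and the 4096-shader guess for MI25; the helper name is hypothetical, and I'm using the GTX 1060 boost clock quoted earlier as a stand-in for "Pascal clocks".

```python
def implied_clock_mhz(tflops, shaders, flops_per_clock=2):
    """Clock speed (MHz) needed to reach a given peak TFlops figure."""
    return tflops * 1e12 / (shaders * flops_per_clock) / 1e6


mi25_mhz = implied_clock_mhz(12.5, 4096)  # assumes 4096 shaders
rx480_mhz = 1266
gtx1060_mhz = 1708

print(f"Implied MI25 clock: {mi25_mhz:.0f} MHz")

# Pascal's clock advantage over Polaris vs. over the implied Vega clock.
print(f"GTX 1060 vs RX 480: {gtx1060_mhz / rx480_mhz - 1:.0%} higher")
print(f"GTX 1060 vs MI25:   {gtx1060_mhz / mi25_mhz - 1:.0%} higher")
```

If the shader count turns out to be different from 4096, the implied clock changes in inverse proportion, so treat the result as conditional on that guess.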