Ideally Steamroller / Excavator designs would have six units which would be clocked higher.
That would be far from ideal. At 33% higher frequency they would have the same throughput as with 8 units, but at the expense of 33% more GPU TDP than in the latter case, that is, 25% lower perf/watt.
On the other hand, going to 16 CUs would theoretically allow doubling the perf/watt (of the GPU) at equal throughput relative to an 8 CU solution.
This wouldn't have made sense in Kaveri, which does not have separate supplies for the GPU and CPU. Not that it's implemented in Bristol Ridge, but in its case there would be some advantages.
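
For anyone who wants to check the 6 CU and 16 CU figures, here is a rough sketch of the arithmetic. It assumes GPU power scales as (number of CUs) x (clock squared), which is a simplification of mine, not something measured on these chips; it just happens to be the model under which both numbers above come out. The function names and the `exponent` parameter are made up for illustration.

```python
def relative_gpu_power(units, freq, exponent=2.0):
    """Relative GPU power, ASSUMING power ~ units * freq**exponent.

    exponent=2.0 is an assumed simplification, not a measured value.
    """
    return units * freq ** exponent


def compare(units, baseline_units=8, exponent=2.0):
    """Compare a design with `units` CUs against the baseline at equal throughput.

    Returns (clock, power, perf_per_watt), all relative to the baseline.
    """
    # Throughput ~ units * clock, so equal throughput fixes the clock ratio.
    freq = baseline_units / units
    power = (relative_gpu_power(units, freq, exponent)
             / relative_gpu_power(baseline_units, 1.0, exponent))
    # Throughput is equal by construction, so perf/watt is just 1 / power.
    return freq, power, 1.0 / power


for cus in (6, 16):
    freq, power, ppw = compare(cus)
    print(f"{cus} CUs: clock x{freq:.2f}, power x{power:.2f}, perf/watt x{ppw:.2f}")
# 6 CUs: clock x1.33, power x1.33, perf/watt x0.75
# 16 CUs: clock x0.50, power x0.50, perf/watt x2.00
```

With a steeper exponent (for example 3, if voltage had to scale linearly with clock) the 6 CU case looks worse and the 16 CU case looks better, so the conclusion does not hinge on the exact value.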