The reality is most designs can't power gate a core unless the device is in a suspend or sleep state.
That hasn't been true for a mobile SoC since Tegra 2 (and that was probably a big reason why it didn't get design wins in phones). If it worked like that there'd be no reason for them to include per-core power gating in the first place, as opposed to just one big gate covering every core. Power gating actually goes finer-grained than per-core too; for example, SIMD units can be power gated.
For a while Android used the hotplug governor to power gate individual cores that have been idle for a while (and bring them back up if there's sufficient demand for a while). There have also been apps that manually gate individual cores. More recently power gating functionality has been moving into the cpuidle manager.
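To make the hotplug idea concrete, here is a toy sketch of that kind of policy: gate a core once it has been idle for a few ticks, and bring one back when the remaining cores are busy. It is not the actual Android governor. The names `Core` and `hotplug_step` and the thresholds `idle_limit` and `wake_load` are all made up for illustration, and it wakes a core immediately rather than waiting for demand to persist.

```python
from dataclasses import dataclass


@dataclass
class Core:
    online: bool = True
    idle_ticks: int = 0  # consecutive ticks this core has spent idle


def hotplug_step(cores, loads, idle_limit=3, wake_load=0.8):
    """Advance the toy policy by one tick.

    cores: list of Core. loads: per-core utilisation in [0, 1];
    entries for offline cores are ignored. Thresholds are hypothetical.
    """
    online = [i for i, c in enumerate(cores) if c.online]

    # Track how long each online core has been idle.
    for i in online:
        if loads[i] == 0:
            cores[i].idle_ticks += 1
        else:
            cores[i].idle_ticks = 0

    avg_load = sum(loads[i] for i in online) / len(online)

    # Sufficient demand: bring one gated core back and stop here.
    if avg_load >= wake_load:
        for c in cores:
            if not c.online:
                c.online = True
                c.idle_ticks = 0
                break
        return

    # Otherwise gate cores that have idled long enough.
    # Core 0 is never gated, so there is always somewhere to run.
    for i in online:
        if i != 0 and cores[i].idle_ticks >= idle_limit:
            cores[i].online = False
            cores[i].idle_ticks = 0


# Four cores, only core 0 doing any work for three ticks.
cores = [Core() for _ in range(4)]
for _ in range(3):
    hotplug_step(cores, [0.5, 0.0, 0.0, 0.0])
gated_state = [c.online for c in cores]

# Demand spikes on the one remaining core, so one core comes back.
hotplug_step(cores, [0.9, 0.0, 0.0, 0.0])
woken_state = [c.online for c in cores]
```

On Linux, taking a core offline from user space is typically done by writing 0 to /sys/devices/system/cpu/cpuN/online; the sketch only models the decision and does not touch that interface.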
Let's just say I'm pretty certain. Granted it's been a bit, but I've seen the numbers.
You'll understand if that's not very meaningful to me...
Can you point to an example where it doesn't or doesn't increase the power?
There are several examples of mobile SoCs where a new version came out with more of the same type of core, made on the same process, and clock speeds stayed the same or increased. Some examples:
Snapdragon S3: single core Scorpion 1.5GHz -> S3: dual core Scorpion 1.7GHz
Snapdragon S4: dual core Krait 200 1.5GHz -> S4: quad core Krait 200 1.7GHz
Snapdragon S4: dual core Krait 300 1.7GHz -> S4: quad core Krait 300 1.9GHz
Snapdragon 618: 2x 1.8GHz Cortex-A72 + 4x 1.2GHz Cortex-A53 -> Snapdragon 620: 4x 1.8GHz Cortex-A72 + 4x 1.2GHz Cortex-A53
Tegra 2: dual core Cortex-A9 1.2GHz -> Tegra 3: quad core Cortex-A9 1.6GHz ("up to 1.7GHz in single core mode")
Exynos 4: dual core Cortex-A9 1.5GHz -> quad core Cortex-A9 1.6GHz
Exynos 5: dual core Cortex-A15 1.7GHz -> 4x Cortex-A15 2.0GHz + 4x Cortex-A7 1.3GHz
i.MX6 dual: dual core Cortex-A9 1.2GHz -> i.MX6 quad: quad core Cortex-A9 1.2GHz
(in this case the SoCs are the same other than the core count)
MT6571: dual core Cortex-A7 1.3GHz -> MT6589: quad core Cortex-A7 1.5GHz
MT6588: quad core Cortex-A7 1.7GHz -> MT6592: octa core Cortex-A7 2GHz
MT6732: quad core Cortex-A53 1.5GHz -> MT6752: octa core Cortex-A53 1.5GHz
Z2480: dual core Saltwell 2GHz -> Z2580: quad core Saltwell 2GHz
Z3480: dual core Silvermont 2.13GHz -> Z3580: quad core Silvermont 2.33GHz
I don't have any power consumption numbers comparing an SoC with a higher core count and some cores disabled against an otherwise similar or identical design with fewer cores, because I haven't seen anyone test for this. Not many people do good power consumption testing to begin with. But suffice it to say that in the mobile SoC world, core count has not been a limiter of clock speed in any case I know of.
Maybe things are different in the server world. Maybe the extra cores not being fused off really do cost thermal headroom, which could be due to different design priorities. Or maybe there are other reasons that make them not as aggressive as they could be with throttling. The server world is very different: when you get a 12-core processor there, you're not expected to need only one or two cores active very often. But on mobile devices that is the case much of the time, and it's critical that power consumption is minimized.