The done-before argument is not really valid in this case, because you design SoCs for a certain purpose. If the purpose is to put it in a phone, you would never even remotely consider 1.3V Vcc - in fact your power supply unit would not even support these voltages and the required currents.
But as you can read in the article linked by Naukkis, if you give the Cortex A72 enough voltage you will pass 4GHz. You don't have to make a chip though, you just need to sign off the timing (with EDA tools like PrimeTime). In what product would you put a 4GHz Cortex A72 anyway? Remember that power goes up by a factor of roughly 4, since dynamic power scales with V^2 * f: (1.2/0.75)^2 * (4/2.5) ≈ 4.1
[Image: Cortex A72 frequency-voltage curve]
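To make the power arithmetic above easy to check, here is a minimal sketch. It assumes the usual first-order model where dynamic power is proportional to V^2 * f with constant switched capacitance, and it ignores leakage. The function name `dynamic_power_ratio` is just made up for this illustration.

```python
def dynamic_power_ratio(v_old, v_new, f_old, f_new):
    """Ratio of dynamic power between two operating points.

    Assumes P ~ C * V^2 * f with the same switched capacitance C
    at both points; leakage power is ignored.
    """
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Operating points taken from the post: 0.75 V @ 2.5 GHz vs 1.2 V @ 4 GHz
ratio = dynamic_power_ratio(v_old=0.75, v_new=1.2, f_old=2.5, f_new=4.0)
print(f"power ratio: {ratio:.2f}x")
```

Under these assumptions a 1.6x clock increase costs about 4.1x the dynamic power, because the voltage term enters squared.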
That's a perfect post regarding power consumption. Thanks for that.
Some people have difficulty understanding the difference between the P3 and P4 pipelines and how it affects frequencies (f/V curve). As a mechanical engineer, I always have to use the electric->hydraulic analogy. It helps me understand and imagine the electrical processes inside (voltage is pressure, current is mass flow, etc.).
Those people should imagine this:
- - every pipeline stage: a complex of water channels
- - transistors switching every clock: a tsunami wave running through those channels
- - electron/transistor speed is constant: the tsunami wave speed is constant too
- - critical path: the longest possible combination of channels to get to the end of a stage (the tsunami wave that takes the longest to complete). Basically this limits your frequency, because you cannot start the next clock/wave before the last tsunami has finished. If you do, for example when overclocking, you get faulty results from the logic and your PC crashes. That's why an overclocked CPU is stable at the desktop (not exercising the critical path) but not stable when you run a heavy FPU AVX load (exercising the critical path). I think this is the reason for Intel's AVX512 downclocking too.
So from above we get:
- - P4, a high-frequency architecture, has shorter/less complex channels, so it can run more tsunamis in a given time (higher frequency)
- - P3, a low-frequency architecture, has longer and more complex channels, so it can run fewer tsunamis in a given time
- - all pipeline stages should be as equal as possible in terms of critical path. This is needed for good power consumption and frequency scaling.
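The points above can be put into numbers with a rough sketch. It assumes the clock period is set purely by the slowest stage and ignores latch/flip-flop overhead and clock skew. All stage delays are invented for illustration and are not real P3/P4 figures.

```python
def max_frequency_ghz(stage_delays_ns):
    """Highest clock in GHz, limited by the slowest stage (critical path).

    Simplification: register setup/clock-to-q overhead and skew are ignored.
    """
    return 1.0 / max(stage_delays_ns)


# Hypothetical pipelines with made-up per-stage delays in nanoseconds
shallow = [0.5] * 6                 # few, long stages ("P3-like")
deep = [0.25] * 12                  # many, short stages ("P4-like")
unbalanced = [0.25] * 11 + [0.5]    # deep, but one stage is slow

for name, stages in [("shallow", shallow), ("deep", deep), ("unbalanced", unbalanced)]:
    print(f"{name}: {max_frequency_ghz(stages):.1f} GHz")
```

In this toy model the deep pipeline reaches twice the clock of the shallow one, while a single slow stage drags the unbalanced pipeline back down to the shallow pipeline's frequency. That is why the stages should be as equal as possible.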
Besides this, to reach higher frequencies you have to adapt the memory subsystem too. Typically, to keep feeding a high-frequency core you need bigger buffers in the uncore, because RAM latency is constant in absolute time.
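A quick sketch of why constant RAM latency hurts more at higher clocks: the same latency in nanoseconds costs more core cycles as frequency goes up. The 80 ns figure is only an assumed round number, not a measured value.

```python
def latency_in_cycles(latency_ns, freq_ghz):
    """Core cycles spent waiting on one memory access of fixed latency."""
    return latency_ns * freq_ghz


RAM_LATENCY_NS = 80.0  # assumed value, for illustration only

for freq_ghz in (2.5, 4.0):
    cycles = latency_in_cycles(RAM_LATENCY_NS, freq_ghz)
    print(f"{freq_ghz} GHz: {cycles:.0f} cycles per RAM access")
```

With these assumed numbers, going from 2.5 GHz to 4 GHz raises the wait from 200 to 320 cycles, so the core needs more in-flight work (bigger buffers) to hide the same latency.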
You can see that ARM cores' performance doesn't scale as well as Skylake's, due to their design. Apple's desktop ARM CPU will need some modifications for high frequency for sure. However, those changes are not major IMHO. Apple's 6xALU design is their key advantage and a huge milestone. The high-frequency modification is easy IMHO.
ARM is not taking over the x86 desktop because Cortex CPUs are very weak and useless. This will change when powerful ARM CPUs are delivered. The Cortex A77 (4xALU) core is coming this year... this might compete with Intel Atom pretty well -> take over the whole NAS market and low-cost laptops.