- Jul 28, 2019
The ILP wall applies only to in-order CPU microarchitectures. An OoO CPU does massive reordering and parallel speculative execution, loading, etc. It is limited by the size of its OoO window and by how effective its predictors are. So for an OoO CPU there is no IPC/ILP wall. However, I am not saying it is easy to get more IPC.
Oh yes, infinity is the limit.
I invite you to search for ILP wall.
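As a rough illustration of the window-size argument above, here is a toy scheduling model in Python. It is a sketch under heavy assumptions, not a model of any real core: `simulate_ipc`, the load/add instruction mix and the latencies are all hypothetical, and branch prediction, caches and register renaming are ignored.

```python
def simulate_ipc(deps, latency, window, width):
    """Toy model: IPC of a core that can issue any ready instruction
    among the `window` oldest unfinished ones, up to `width` per cycle.

    deps[i]    -- indices of earlier instructions that i depends on
    latency[i] -- cycles until the result of i is available (>= 1)
    """
    n = len(deps)
    done_at = [None] * n   # cycle at which each result is ready (None = not issued)
    head = 0               # oldest instruction that has not finished yet
    cycle = 0
    while True:
        # Retire finished instructions in program order.
        while head < n and done_at[head] is not None and done_at[head] <= cycle:
            head += 1
        if head == n:
            break
        issued = 0
        # Look only inside the window, oldest first.
        for i in range(head, min(head + window, n)):
            if issued == width:
                break
            ready = all(done_at[d] is not None and done_at[d] <= cycle
                        for d in deps[i])
            if done_at[i] is None and ready:
                done_at[i] = cycle + latency[i]
                issued += 1
        cycle += 1
    return n / cycle


# Hypothetical workload: 50 pairs of (independent 10-cycle load, dependent 1-cycle add).
pairs = 50
deps, latency = [], []
for p in range(pairs):
    deps.append([])          # load: no dependencies
    latency.append(10)
    deps.append([2 * p])     # add: depends on the load just before it
    latency.append(1)

ipc_narrow = simulate_ipc(deps, latency, window=1, width=4)
ipc_wide = simulate_ipc(deps, latency, window=64, width=4)
print(f"window=1:  IPC = {ipc_narrow:.3f}")
print(f"window=64: IPC = {ipc_wide:.3f}")
```

In this toy model a window of 1 serializes everything, while a larger window lets independent loads overlap, so IPC rises until the issue width or the dependency chains become the limit.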
Also, adding transistors means more power, which is not always acceptable.
More transistors don't mean more power. Apple is a good example: despite having almost double the transistors, an A13 Lightning core can be twice as efficient as Zen2. Apple must be able to power-gate a lot of parts inside the core when they are not in use; there is no other way. It is similar to how a Cortex core minimizes speculative execution to save energy when it detects a high misprediction rate. And imagine, Apple is at least 4 years ahead of everybody else in development...
Regarding A13:
Let’s take a look at the A13 for a moment:
2x 2.6 GHz big cores @ 5-6 watts of power. Multiply that by 4 to get an 8-core, 8-thread part: 20-24 watts. Wait! There isn’t any hyperthreading, no DDR4, no PCIe. Actually, come to think of it, that smartphone SoC is missing every single major feature that modern machines have. Before you know it, Apple has blown past a 45 watt TDP.
ARM CPUs, including those from Apple, look very attractive until you get into the nitty-gritty. Scaling up any ARM CPU just means you’ll end up with perf/watt similar to an x86 CPU. AMD and Intel aren’t sandbagging; they have to deal with the laws of physics just like Apple does.
You are right that 2x A13 Lightning draws 5-6 watts and 8 cores would consume 20-24 watts. But Andrei measured system consumption including dual-channel LPDDR4, so that 8-core A13 at 20-24 watts would already include 8-channel LPDDR4. And PCIe link power consumption is not that high; in reality it's just a fraction of what the CPU cores draw.
And the best thing at the end: those 8x A13 Lightning cores at 2.6 GHz deliver performance equal to 8x Zen2 at 4.7 GHz... Show me a Ryzen CPU that can run all eight of its cores at 4.7 GHz simultaneously at 24W (even if you found such a rare super chip, it would consume 150W). The Zen2 APU Renoir definitely shows better efficiency, but again, AMD selects the best low-leakage dies for laptops and the rest go to desktop later this year. Every Apple A13 can reach such performance, which means there is some decent performance margin.
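The scaling arithmetic in this exchange can be written out as a short Python sketch. It assumes power scales linearly with core count, which is a simplification (it ignores interconnect, cache and I/O overhead), and it takes the 5-6 W and 150 W figures straight from the posts above without verifying them. The helper name `scale_power` is made up for illustration.

```python
def scale_power(measured_watts, measured_cores, target_cores):
    """Linearly scale a (low, high) power range to a different core count.

    Assumption: power grows in proportion to the number of cores.
    """
    factor = target_cores / measured_cores
    low, high = measured_watts
    return (low * factor, high * factor)


# 2x A13 Lightning at 5-6 W (figure quoted in the thread), scaled to 8 cores.
a13_8core = scale_power((5.0, 6.0), measured_cores=2, target_cores=8)
print(a13_8core)  # (20.0, 24.0)

# The thread's estimate for 8x Zen2 at 4.7 GHz all-core.
zen2_watts = 150.0
ratio = zen2_watts / a13_8core[1]
print(f"Zen2 estimate vs. scaled A13 (upper bound): {ratio:.2f}x")  # 6.25x
```

The result is only as good as the linear-scaling assumption and the quoted inputs; it reproduces the 20-24 W figure both sides agree on, not a measurement.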
Regarding similarity between ARM and X86 when scaling up:
Did you see Andrei's test of Graviton2? I doubt it.
32c/64t Zen1@2.9GHz vs 64c/64t A76@2.5GHz
- higher performance per thread despite lower frequency
- higher MT throughput
- lower power consumption by half (180W vs. 90W).
- even 14nm vs. 7nm cannot explain such a difference in power consumption
- cheap to manufacture due to the A76/N1 core being only 1.4mm2 compared to Zen2's 3.4mm2
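For reference, the ratios implied by the list above can be computed with a few lines of Python. All inputs are the figures quoted in this post and are not independently verified; the per-thread power number is a derived illustration, not something measured in the review.

```python
# Figures as quoted in the post (assumed, not re-measured).
zen1 = {"cores": 32, "threads": 64, "ghz": 2.9, "watts": 180.0}
graviton2 = {"cores": 64, "threads": 64, "ghz": 2.5, "watts": 90.0}

# Package power ratio: how many times more power the Zen1 part draws.
power_ratio = zen1["watts"] / graviton2["watts"]

# Power per hardware thread, since both parts expose 64 threads.
zen1_w_per_thread = zen1["watts"] / zen1["threads"]
g2_w_per_thread = graviton2["watts"] / graviton2["threads"]

# Core area ratio from the quoted 3.4 mm2 (Zen2) and 1.4 mm2 (A76/N1).
area_ratio = 3.4 / 1.4

print(f"power ratio:        {power_ratio:.2f}x")
print(f"watts per thread:   {zen1_w_per_thread:.2f} vs {g2_w_per_thread:.2f}")
print(f"core area ratio:    {area_ratio:.2f}x")
```

Note that the area figure compares against Zen2 while the power figure compares against Zen1, so the two ratios should not be combined into a single number.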