Here's a quick ramble from me regarding Alder Lake.
Diminishing returns apply to anything that increases size or power. If making a cache 2x bigger increases performance by 'y', another doubling does not get you to '2y'. Same for increases in structures like the ROB, BTB, register file, fetch/execute/retirement/load/store width and so on. This applies even more so to increasing the power budget: dynamic power scales with the square of voltage, so raising the voltage to clock higher ramps power dramatically.
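To put rough numbers on the voltage point, here is a minimal sketch assuming the usual first-order model for dynamic power, P ~ C * V^2 * f. The function name, the voltages and the clocks are all made up for illustration; they are not measurements of any real chip, and leakage is ignored.

```python
def dynamic_power(capacitance, voltage, freq_ghz):
    """First-order dynamic power model: P ~ C * V^2 * f (arbitrary units)."""
    return capacitance * voltage ** 2 * freq_ghz

# Hypothetical operating points: a 25% higher clock that needs 25% more voltage.
base = dynamic_power(capacitance=1.0, voltage=1.0, freq_ghz=4.0)
boosted = dynamic_power(capacitance=1.0, voltage=1.25, freq_ghz=5.0)

clock_ratio = 5.0 / 4.0          # 1.25x the clock
power_ratio = boosted / base     # about 1.95x the power

print(f"clock: {clock_ratio:.2f}x, power: {power_ratio:.2f}x")
```

Under those assumed numbers, 25% more clock costs nearly double the power, which is the kind of ramp I mean.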
So I'm not really sure why you think that getting "only 33%" more IPC for 4x the area is a bad thing. Given that IPC automatically goes up as clock rate goes down (and more so when a design targets that lower clock rate), an IPC gap of that size means the small core will likely land at something like half the performance of the big one. Just as an example pulled from my ass: a small core IPC of 1.5 at 3 GHz vs a big core IPC of 2 at 5 GHz.
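Working through that made-up example, treating performance as simply IPC times clock (a simplification, and the function name is just for illustration):

```python
def relative_performance(ipc, clock_ghz):
    """Instructions per nanosecond, i.e. IPC * clock in GHz."""
    return ipc * clock_ghz

# The example numbers from above, not real measurements.
small = relative_performance(ipc=1.5, clock_ghz=3.0)
big = relative_performance(ipc=2.0, clock_ghz=5.0)

ipc_gain_big = 2.0 / 1.5 - 1.0   # big core has ~33% more IPC
perf_ratio = small / big         # small core delivers 45% of big-core performance

print(f"big core IPC advantage: {ipc_gain_big:.0%}")
print(f"small/big performance:  {perf_ratio:.0%}")
```

So a 33% IPC advantage plus the clock difference puts the small core a bit under half of the big core in this toy case.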
That's a smaller gap between big and little cores than Apple's (and smaller still compared to ARM's licensed cores, though their small core design is ancient), but who says Apple's 3:1 performance split between big and little cores is the right one? The power split between them is more important, but until we see benchmarks we won't know how Intel compares on that front. And again, who says Apple's power split (10:1, I think, from memory?) between big and little cores is the right one? The area split is pretty much irrelevant: you devote enough area to the little core to meet your power and/or performance split goals.
Given how much Intel seems to have been reacting rather than planning over the past half decade with all the process delays and AMD becoming competitive, it is quite possible (IMHO likely) that these cores did not begin the design process with the intention they would occupy the same chip together, and would thus look unbalanced when compared to something like Apple's big and little cores. If so, that will resolve itself when the first designs of cores that were designed from day one to be on the same chip reach production.