You can compare cores either with L2$ or without. Mixing the two is monkey logic.
Including L2 cache:
- Zen2: core Mtr = 3.6 mm2 * 52 Mtr/mm2 = 187 Mtr
- A13: core Mtr = 4.5 mm2 * 86 Mtr/mm2 = 387 Mtr (2.1x more transistors)
Excluding L2 cache:
- Zen2: core Mtr = 2.7 mm2 * 52 Mtr/mm2 = 140 Mtr
- A13: core Mtr = 2.6 mm2 * 86 Mtr/mm2 = 223 Mtr (1.6x more transistors)
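For anyone who wants to re-run the numbers above, here is a minimal Python sketch of the same area-times-density estimate. The areas and densities are the figures quoted in the post; applying each chip's average density uniformly to its core is an assumption of the estimate, not a measured fact, and the names `core_mtr` and `a13_to_zen2_ratio` are just illustrative.

```python
# Figures quoted in the post: core area in mm2 and whole-chip density in Mtr/mm2.
# Assumption: the chip-average density applies uniformly to the core (and its L2$).
DENSITY_MTR_PER_MM2 = {"Zen2": 52, "A13": 86}
CORE_AREA_MM2 = {
    "with_l2": {"Zen2": 3.6, "A13": 4.5},
    "without_l2": {"Zen2": 2.7, "A13": 2.6},
}


def core_mtr(chip, case):
    """Estimated transistor count in millions: area times density."""
    return CORE_AREA_MM2[case][chip] * DENSITY_MTR_PER_MM2[chip]


def a13_to_zen2_ratio(case):
    """How many times more transistors the A13 core has in the given case."""
    return core_mtr("A13", case) / core_mtr("Zen2", case)


for case in ("with_l2", "without_l2"):
    print(case,
          f"Zen2={core_mtr('Zen2', case):.0f} Mtr",
          f"A13={core_mtr('A13', case):.0f} Mtr",
          f"ratio={a13_to_zen2_ratio(case):.1f}x")
```

Swapping in different area or density figures only needs a change to the two dictionaries at the top.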
Looks like some people have a problem accepting Apple's monstrous transistor count, the same way they had a problem accepting its IPC.
You are right, we must normalize for L2$ or lack thereof, because it's a true apples-to-oranges comparison to compare a Zen2 core + 512KB dedicated L2$ to an A13 core + 8MB shared L2$. L2$ takes up a lot of die space and is higher density, skewing the results.
Excluding L2$:
Zen2 core: 2.7-2.87 mm2 (smaller than before, because we are excluding L2$)
A13 core: 2.6-2.61 mm2 (same as before)
Including 8MB L2$ on each:
Since you're using SPECint2006 (single-threaded) as your benchmark of choice, the Lightning core would be able to use all 8MB of the shared L2$. In reality it only uses 6MB at a really brisk rate, for reasons unknown, likely because 2MB may still be earmarked as L2E for the Thunder cores. That's really Apple's decision, but the 8MB L2$ is nonetheless shared between the two Lightning cores, and technically a single-threaded application could use all 8MB.
L2$ area for Zen2 is 0.8 mm2 or so for 512KB. Times 16 to get 8MB for the core, that gives us 0.8 mm2 x 16 = 12.8 mm2 of additional die area.
Zen2 + 8MB Zen2-density L2$ = 16.1 mm2
A13 + 8MB A13-density L2$ = 4.5 mm2
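The cache scaling step above (512KB to 8MB at Zen2 density) can be sketched as follows. This is only an illustration using the post's figures; it assumes L2$ area grows linearly with capacity, which ignores tags, control logic and layout overhead, and the helper name `scaled_l2_area_mm2` is made up for this example.

```python
ZEN2_L2_KB = 512          # dedicated L2$ per Zen2 core, from the post
ZEN2_L2_AREA_MM2 = 0.8    # "0.8mm2 or so", from the post
A13_L2_KB = 8 * 1024      # shared L2$ on the A13, from the post


def scaled_l2_area_mm2(target_kb, base_kb=ZEN2_L2_KB,
                       base_area_mm2=ZEN2_L2_AREA_MM2):
    """Assumption: cache area scales linearly with capacity."""
    return base_area_mm2 * (target_kb / base_kb)


factor = A13_L2_KB // ZEN2_L2_KB          # capacity multiplier
extra = scaled_l2_area_mm2(A13_L2_KB)     # area of 8MB at Zen2 density
print(f"{factor}x capacity -> {extra:.1f} mm2 of Zen2-density L2$")
```

This only reproduces the 16x multiplier and the 12.8 mm2 figure; which core area you add it to is a separate choice.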
If we instead take the same 8MB L2$ from the A13 and apply it to Zen2 as well, to try to make it more even:
Zen2 + 8MB Apple L2$ = 6.5 mm2
A13 + 8MB Apple L2$ = 6.4 mm2
Even if you do broadly apply the transistor density of each chip to its core, in no way would you end up with "double" the transistors on an A13 core unless you're including its ridiculously large L2$. I agree, yes, L2$ has a lot of transistors.
Explanation is simple: Apple's enormous IPC performance has to come from somewhere.
It probably comes from an L2$ that's 16 times the size of that on Zen2.
If you look at the cache-size sensitivity of SPECint2006, you'll notice that miss rates vary massively between a 512KB cache and an 8MB cache. Using that single test as your lodestar is misleading you, because it can be manipulated so easily. Or, in Apple's case, they are designing around single-thread performance, so a massive L2$ and 2 Lightning cores make much more sense.