And remember, the E cores only need to be orthogonally complete; they do NOT have to be performant on AVX512 code. Double-pumped 256-bit data paths like AMD's mobile cores? Yes please. Quad-pumped 128-bit paths? Why not. Extra transistors for higher clocks? Don't think so.
There's only a 12% difference from enabling the 512-bit datapath. Most of the AVX512 gains come from the instruction set itself.
www.phoronix.com
Off: 1x
256 mode: 1.3x
512 mode: 1.45x (12% over 256 mode)
There are better ways to use transistors than spending power and die area on 512-bit vector units, which are a big increase in both. If they improved the uarch further instead, it would bring gains everywhere, including on AVX512 workloads. A hypothetical future core that stays in 256 mode but gains an extra 5% from further uarch improvements would cut the gap to 512 mode to a mere 6%, while being faster everywhere else, lower power, and smaller in core size.
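For anyone who wants to check the arithmetic, here is a quick sketch. It assumes the 1.3x and 1.45x figures above are simple averages relative to AVX512 off, and that the hypothetical 5% uarch gain multiplies the 256 mode result only. The names are made up for illustration.

```python
# Relative speedups quoted above (baseline = AVX512 off = 1.0x).
SPEEDUP_256 = 1.30
SPEEDUP_512 = 1.45

# Hypothetical extra gain from uarch improvements, applied to the 256-bit core only.
UARCH_GAIN = 1.05


def gap_percent(faster: float, slower: float) -> float:
    """Return how far `faster` leads `slower`, in percent."""
    return (faster / slower - 1.0) * 100.0


# Today: 512-bit datapath versus 256-bit datapath.
gap_today = gap_percent(SPEEDUP_512, SPEEDUP_256)

# Hypothetical: same 512 mode result versus a 256 mode core that is 5% faster.
gap_future = gap_percent(SPEEDUP_512, SPEEDUP_256 * UARCH_GAIN)

print(f"512 over 256 today:        {gap_today:.1f}%")
print(f"512 over improved 256:     {gap_future:.1f}%")
```

That works out to about 11.5% today (rounded to 12% above) and about 6.2% in the hypothetical case.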
For AMD, 256 mode Zen 6 will likely be equal to 512 mode Zen 5. Yes, in corner-case scenarios 512 mode will be better, but you are bandwidth limited in most cases, and 256 mode ends up better in power and area.