You won't be fitting 48 E cores in the space of an 8+16 die. The E core quad complex is larger than a P core. The likely best configuration would be a performance die with 2 P cores given extra dark silicon for higher clocks, 8 regular P cores, and an enlarged L3, built with HP libraries, plus a second die with 40 E cores in ten quads, built with HD libraries. 50 cores, maximum ST, tons of MT.
The Gracemont quad was only about 15% larger than a P core, and the uncore portion takes up quite a bit of area. The Skymont quad is still not enormously larger than a P core either.
Now that's changing again with the Nova Lake generation. However, 32 cores should be easily doable.
But it won't fly with consumers, even HEDT. Being just 10% slower on ST is a deal breaker, and I assume that, with E cores having a ~5% perf/clock advantage, the clock difference means the gap will be about 20%, a significant number.
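For anyone who wants to sanity-check that estimate, here is a rough sketch. It assumes ST performance is simply perf/clock times clock speed, and that a "20% gap" means the E core lands at 0.80 of the P core's ST. The function name and the inputs are illustrative, not measured figures. Since the ~5% perf/clock figure can be read either way, both readings are shown.

```python
def implied_clock_ratio(st_ratio: float, perf_per_clock_ratio: float) -> float:
    """Clock ratio (E/P) needed to produce a given ST ratio (E/P),
    assuming ST performance = perf/clock * clock."""
    if perf_per_clock_ratio <= 0:
        raise ValueError("perf_per_clock_ratio must be positive")
    return st_ratio / perf_per_clock_ratio


# Assumption: a ~20% ST gap means the E core reaches 0.80x of P core ST.
ST_RATIO = 0.80

# Reading 1: the E core has a ~5% perf/clock advantage over the P core.
print(round(implied_clock_ratio(ST_RATIO, 1.05), 3))

# Reading 2: the E core has a ~5% perf/clock deficit versus the P core.
print(round(implied_clock_ratio(ST_RATIO, 0.95), 3))
```

Under this simple model, the first reading needs the E cores to clock roughly 24% below the P cores to end up 20% behind on ST; the second needs roughly 16%.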
Consumer gaming chips are the most demanding bunch to make. You need the highest ST performance, very good MT performance, and fairly low cost (meaning small die size), and you have to do all that in a socketed system with widely varying memory and motherboard configurations, sometimes with compatibility going back 6-7 generations.
That would be far better for those who really need to crunch numbers. (8P+16E) + (48E) = 72 total cores (plus any LP-E cores). That would be far more powerful. But yes, for convenience and mass production, Intel will reuse the 8P+16E tile.
It's not just convenience. Having different core counts on different tiles means different communication, cache, and IO latency. If you are impacted by different tiles on current Ryzen chips with identical links to the IO die and identical compute modules, what will happen when one has 8P+16E in one tile and 48E in another?
On the Ryzen chips, when a thread goes to the other chiplet it sees different cache latencies, and gamers have optimized around that. With a different core module, it means an entirely different uarch as well. With such different setups everything is affected, down to even storage performance.
When the first Athlon came out, reviewers noted it had an advantage in IO processing, so storage performed better.