Would AMD, when they started designing Zen2 several years ago, have been willing to take such a radical departure? Remember, every $ was scarce. I find it difficult to see the basic unit not being a complete stand-alone CPU.
Increasing the number of basic units for EPYC would require additional IFOP connections on the chips and more reservation spots/switching targets in the internal IF uncore of the chips themselves. If you're going through all of that, it's likely going to be no more complex to add two more CCXs to the existing floorplan. While that will definitely up the transistor count, at 7nm it shouldn't push the die area beyond that of the existing 14/12nm die.
Interestingly, though, if they wanted to, they could keep roughly the same basic layout of the individual chips at 7nm, but move the DRAM and I/O controllers off the chip and onto a dedicated I/O chip, leaving the rest to be essentially CCX and IF chips on the same EMIB/MCM package. So, have 5 chips on one package: four with IF links to the 5th, and the 5th handling all the I/O between the package and the rest of the system. This way, they can change out DRAM controllers, PCI controllers, etc. without having to redo the whole chip, or update the package for different applications in isolation from the cores. Having an EMIB/MCM package can allow them to run the IF links between the chips at speeds similar to what they run internally in the chips today. Consumer chips could be a mix of 2 to 4 7nm chiplets and an I/O chiplet, and maybe contain an iGPU chiplet as well on dual CPU chiplet packages. At 7nm, while maintaining the existing AM4 socket, they'll have plenty of package area to play with for things like that. It would even be possible to integrate an HBM stack in there as well. On a desktop product, cooling a package with two CPU chiplets, an iGPU chiplet, an HBM stack and an I/O chiplet would not be unreasonable. With low enough voltage and frequency targets, it could even work on mobile. Intel is already there with KL-G. Its pricing reflects its uniqueness in the market and is not entirely a product of production cost.
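To put rough numbers on the link argument, here is a minimal sketch comparing how many die-to-die links each chip needs when every die talks directly to every other die versus the hub layout with a central I/O chip. The function names are made up for illustration, and treating the direct-connect case as a full mesh is my simplifying assumption, not something taken from AMD documentation.

```python
def full_mesh_links(n_dies):
    """Return (links per die, total links) when every die links to every other die."""
    per_die = n_dies - 1
    total = n_dies * (n_dies - 1) // 2
    return per_die, total


def hub_links(n_chiplets):
    """Return (links per compute chiplet, links on the I/O hub) for a star layout."""
    # Assumption: one link from each compute chiplet to the central I/O chip.
    return 1, n_chiplets


if __name__ == "__main__":
    for n in (4, 8):
        mesh_per_die, mesh_total = full_mesh_links(n)
        hub_per_chiplet, hub_total = hub_links(n)
        print(f"{n} compute dies: mesh needs {mesh_per_die} links per die "
              f"({mesh_total} total); hub needs {hub_per_chiplet} per chiplet "
              f"({hub_total} on the I/O chip)")
```

The point is only that per-die link count grows with the number of dies in a mesh, while in the hub layout it stays at one and the growth is pushed onto the I/O chip.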
In their papers on interposer-connected composite CPUs, one of the points stressed was the ability to migrate early to an advanced node even if yields were comparatively poor, and to use innovative topologies to connect the dies into a high core count CPU. That was the focus of the research: how to overcome the problems of early node fabrication. Other benefits were better binning options, etc.
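A quick back-of-the-envelope sketch of why small dies help on an immature node, using the textbook Poisson yield model (yield = exp(-area x defect density)). The die areas and the defect density below are illustrative placeholders I picked, not figures from AMD, the papers, or any foundry, and the helper names are hypothetical.

```python
import math


def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of defect-free dies under a simple Poisson defect model."""
    area_cm2 = area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)


def silicon_per_good_product(die_area_mm2, dies_per_product, defects_per_cm2):
    """Average die area (mm^2) fabricated per working product.

    Assumes bad dies are discarded one at a time before packaging,
    so a product only needs `dies_per_product` good dies.
    """
    y = poisson_yield(die_area_mm2, defects_per_cm2)
    return dies_per_product * die_area_mm2 / y


if __name__ == "__main__":
    d0 = 0.5  # assumed defects per cm^2 on an early node (placeholder)
    mono = silicon_per_good_product(320, 1, d0)     # one large die
    chiplets = silicon_per_good_product(80, 4, d0)  # four small dies, same total area
    print(f"monolithic: {mono:.0f} mm^2 of silicon per good CPU")
    print(f"chiplets:   {chiplets:.0f} mm^2 of silicon per good CPU")
```

This ignores packaging cost, interposer yield and partial-good binning, so treat it as showing the direction of the effect only.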
AdoredTV did a recent video on this, but the topic was discussed here a long time ago using the same PDFs mentioned in his video.
They will have the greatest advantage now, as they migrate to 7nm. Seeing as this move was planned years ago, that is what pushes me toward the expectations I have.
Can an organic package accommodate the required connections for an 8-chiplet CPU? Nope. Seems as if a silicon interposer (SI) is needed for all the chiplets.