Sounds a lot like chipset disaggregation with north- and southbridge.
The big challenge is balancing the following points:
- For the lowest common denominator in the mobile space, nothing will beat a monolithic die.
- I/O interfaces take up a lot of die area and barely shrink on newer nodes, which is why keeping them on older nodes is a good way to reduce costs.
- Something similar is happening with SRAM right now, with 5/4 nm apparently being the scaling wall.
- For logic, using the latest node remains the best way to save area and improve power efficiency.
- The more data is transferred, the higher the power cost of moving it between disaggregated chips.
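That last point is easy to sanity-check with back-of-envelope numbers. A minimal sketch, with purely illustrative energy-per-bit figures I've assumed here (real values vary widely by process and packaging technology):

```python
# Back-of-envelope: power needed to move data on-die vs. between chiplets.
# The pJ/bit values are assumptions for illustration, not measured numbers.
ENERGY_PJ_PER_BIT = {
    "on-die wires": 0.1,            # assumed: short on-die interconnect
    "die-to-die (organic substrate)": 1.0,  # assumed: chiplet link, ~10x worse
}

def link_power_watts(bandwidth_gbps: float, pj_per_bit: float) -> float:
    """Power (W) to sustain a bandwidth (Gbit/s) at a given energy per bit."""
    return bandwidth_gbps * 1e9 * pj_per_bit * 1e-12

bandwidth = 512  # Gbit/s, an assumed GPU-class traffic figure
for link, pj in ENERGY_PJ_PER_BIT.items():
    print(f"{link}: {link_power_watts(bandwidth, pj):.2f} W at {bandwidth} Gbit/s")
```

Even with these rough numbers, a 10x energy-per-bit penalty turns tens of milliwatts into a meaningful chunk of a mobile power budget once bandwidth climbs, which is exactly why the highest-traffic paths resist disaggregation.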
Regarding the GPU: the more chiplet-based mobile chips become, the more likely we are to eventually see a return of the Kaby Lake-G structure: a dGPU die linked to the CPU over a few lean PCIe lanes, with the GPU's heavy bandwidth needs served by dedicated HBM. That would let the remaining chiplets be optimized for latency and power efficiency without the product as a whole giving up high-performance graphics.
In any case, I'm really looking forward to seeing how exactly AMD and Intel are going to approach all these conundrums.