There have been rumors of such a device for a long time. AMD has shown some Infinity Architecture slides with 8 GPUs almost fully connected with 6 Infinity Fabric links. That may not be an actual implementation, but the Genoa IO die might have 6 links on each side. That would allow placing an HBM-based GPU on one side and a CPU die on the other, giving the GPU access to possibly 12-channel DDR5 through 6 links, in addition to the HBM cache.

Hmm, carrying this idea forward, this could be the core of a highly converged product, very similar to an APU but in an EPYC-like MCM: an IO die, a pair of Zen CCDs for flow control, a pair of CDNA GPU chips, and four stacks of HBM2E, or maybe even HBM3, to feed those GPUs. I don't think 8-12 channels of DDR5 ECC RDIMMs will have nearly enough bandwidth to feed the GPU dies directly, but if we imagine the HBM stacks in the same role as the Infinity Cache, at 8-16GB per GPU chip, with considerably more memory available through the IO die, maybe we have something interesting. Instead of remaining at just 2P systems with a ton of PCIe lanes to expansion cards, you could have quad-socket systems with sufficient IO for system connectivity.
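For a rough sense of the gap between DDR5 and HBM here, a back-of-the-envelope sketch of peak bandwidth follows. The speed grades are my assumptions, not anything confirmed for Genoa or any CDNA part: DDR5-4800 with 64 data bits per channel, and HBM2E at 3.2 Gb/s per pin on a 1024-bit interface per stack. Real sustained bandwidth would be lower than these peak figures.

```python
# Peak-bandwidth estimate only; every speed grade below is an assumption.

def peak_gb_per_s(mega_transfers_per_s, bus_bits):
    """Peak bandwidth in GB/s for a bus of `bus_bits` at the given MT/s."""
    return mega_transfers_per_s * 1e6 * bus_bits / 8 / 1e9

# Assumed DDR5-4800: 4800 MT/s, 64 data bits per channel (ECC bits excluded).
ddr5_channel = peak_gb_per_s(4800, 64)

# Assumed HBM2E: 3.2 Gb/s per pin, 1024-bit interface per stack.
hbm2e_stack = peak_gb_per_s(3200, 1024)

ddr5_8ch = 8 * ddr5_channel
ddr5_12ch = 12 * ddr5_channel
hbm_4stacks = 4 * hbm2e_stack

print(f"DDR5, 8 channels:  {ddr5_8ch:.1f} GB/s")
print(f"DDR5, 12 channels: {ddr5_12ch:.1f} GB/s")
print(f"HBM2E, 4 stacks:   {hbm_4stacks:.1f} GB/s")
print(f"HBM / 12ch DDR5:   {hbm_4stacks / ddr5_12ch:.2f}x")
```

Under those assumptions, even 12 channels of DDR5 land at a bit over a quarter of what four HBM2E stacks could deliver, which is why treating the HBM as a large cache in front of DDR5 seems more plausible than feeding the GPUs from DDR5 directly.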
It would be great if we got such a device. It would be very compact, very high performance, and efficient. With stacking technology, AMD could stack CPU cores on top of the IO die and place a GPU on either side. It makes a lot of sense, but this has been rumored for a long time, so it may still be bogus. It seems plausible for the non-stacked Genoa though.