Looking purely at the math of the current physical packages: on desktop Ryzen, a switch to N5+ compute dies and a smaller node for the I/O die (it doesn't have to be N7) would give AMD enough room to put three compute dies in the AM4/AM5 package. If they stick to 8 cores per CCD on N5+, that's still 24 cores. DDR5 should keep that well fed with data.
On TR/EPYC, a switch to a smaller-process I/O die and N5+ compute dies should give them enough space for 12-14 compute dies: instead of pairs in the corners, it would be three per corner, and possibly one at either end of the I/O die. In a 2P system, that is 24-28 compute dies, or 192-224 cores. With modest clocks, that should be doable within a reasonable power envelope. If the N5+ compute dies are small enough, it could be as many as three groups of three on each side of the I/O die and maybe one at either end, for a grand total of 20 compute dies. That is 160 cores, a crazy high, near-outlandish number, and likely near impossible to wire to the I/O die successfully even with tech from two years down the road. Just a number to think about: with 12 compute dies alone, and with no increase in L3 cache for N5+, that's still 384MB of L3 cache in each processor. If they elected to put a pair of HBM2E stacks in the package, one at either end of the I/O die, that could add 8GB of L4 cache as well.
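For anyone who wants to check the arithmetic, here is a rough Python sketch of the layouts discussed above. It assumes 8 cores per compute die (as stated) and 32MB of L3 per compute die, which is the figure implied by 384MB across 12 dies. The function name `package_totals` is made up for illustration, and none of this says anything about whether the dies would physically fit or could be routed.

```python
CORES_PER_CCD = 8    # from the post: 8 cores per compute die
L3_PER_CCD_MB = 32   # assumption: implied by 384MB of L3 across 12 dies


def package_totals(ccds_per_socket, sockets=1):
    """Return (total compute dies, total cores, L3 in MB per socket)."""
    total_dies = ccds_per_socket * sockets
    total_cores = total_dies * CORES_PER_CCD
    l3_per_socket_mb = ccds_per_socket * L3_PER_CCD_MB
    return total_dies, total_cores, l3_per_socket_mb


if __name__ == "__main__":
    layouts = [
        ("Desktop Ryzen, 3 dies", 3, 1),
        ("TR/EPYC, 12 dies, 1P", 12, 1),
        ("TR/EPYC, 14 dies, 1P", 14, 1),
        ("TR/EPYC, 12 dies, 2P", 12, 2),
        ("TR/EPYC, 14 dies, 2P", 14, 2),
        ("TR/EPYC, 20 dies, 1P", 20, 1),
    ]
    for label, ccds, sockets in layouts:
        dies, cores, l3 = package_totals(ccds, sockets)
        print(f"{label}: {dies} dies, {cores} cores, {l3}MB L3 per socket")
```

Changing the two constants at the top is enough to redo the numbers for a different core count or cache size per die.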
Those numbers would have sounded unreal through the lens of 2016. Now it's mathematically possible to fit it all in the package, and not out of the realm of possibility.