AMD's announcement of "V-Cache" points to another way Apple could theoretically handle GPU memory on the M1 successor. Apple has always been at the forefront of TSMC's vertical packaging technologies, so it's easy to imagine they might use it for something similar.
AMD was able to fit 32 MB of SRAM in 36 mm^2 in 7nm. If Apple made a similar SRAM die in 5nm at the 225 mm^2 I've estimated for the M1 successor, that works out to roughly 256 MB (assuming a roughly 1.3x SRAM density improvement going from 7nm to 5nm). If it reached "only" the same 2 TBps of bandwidth AMD did, that's 2 TBps per chiplet, which would be an amazing 8 TBps in a 4 chiplet Mac Pro! Of course, for the purposes of graphics, using SRAM would be wasteful, since the reduced latency isn't necessary. If they instead used eDRAM (i.e. DRAM which can be made in TSMC's standard logic processes), which is around 4x as dense, they'd get 1 GB per die, or 4 GB in the high end Mac Pro. I don't know how eDRAM bandwidth compares to SRAM, but 8 TBps is kind of ridiculous and probably couldn't be fully exploited anyway.
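To sanity check that arithmetic, here's the back-of-envelope math spelled out. The 1.3x SRAM density gain from 7nm to 5nm is my own assumed round number, as is treating the whole die as memory:

```python
# Back-of-envelope scaling of AMD's V-Cache figures to a hypothetical Apple die.
amd_sram_mb = 32           # MB AMD fits in its V-Cache die
amd_area_mm2 = 36          # that die's area in 7nm
apple_area_mm2 = 225       # my estimated M1-successor chiplet size
n7_to_n5_density = 1.3     # assumed SRAM density improvement, 7nm -> 5nm

sram_mb = amd_sram_mb * (apple_area_mm2 / amd_area_mm2) * n7_to_n5_density
print(f"SRAM per die: ~{sram_mb:.0f} MB")          # ~260 MB, call it 256 MB

edram_gb = sram_mb * 4 / 1024                      # eDRAM is ~4x as dense as SRAM
print(f"eDRAM per die: ~{edram_gb:.0f} GB")        # ~1 GB

chiplets, tbps_per_chiplet = 4, 2                  # 4 chiplet Mac Pro, AMD's 2 TBps each
print(f"Aggregate bandwidth: {chiplets * tbps_per_chiplet} TBps")   # 8 TBps
```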
One flaw in this plan is that 4 GB is not sufficient - Nvidia and AMD's highest end workstation GPUs have 32 or 48 GB of memory. However, Apple could stack multiple eDRAM dies per chiplet to reach whatever capacity is desired. I've figured previously, based on TSMC's wafer pricing, that Apple would be paying a little under $100 for a 225 mm^2 chip in 5nm; add a bit for defects that can't be handled by redundancy and we'll call it exactly $100. Each eDRAM die of matching size would cost about the same - actually slightly less, since the massive redundancy in a memory array should allow essentially perfect yield, but we'll still call it $100 since the math is easy. The question is, do they really need 48 GB like Nvidia's highest end workstation GPU? At essentially $100 per gigabyte, that would be $4800 just in eDRAM dies!
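To make the stacking math concrete, here's a rough sketch using the round numbers above (roughly 1 GB and $100 per eDRAM die, spread over 4 chiplets):

```python
import math

gb_per_edram_die = 1       # from the density estimate above
cost_per_edram_die = 100   # USD, roughly what a 225 mm^2 die in 5nm costs
chiplets = 4

target_gb = 48             # matching Nvidia's highest end workstation GPU
dies_per_chiplet = math.ceil(target_gb / chiplets / gb_per_edram_die)   # 12 dies per stack
total_cost = dies_per_chiplet * chiplets * cost_per_edram_die
print(f"{dies_per_chiplet} eDRAM dies per chiplet, ~${total_cost} in eDRAM alone")  # ~$4800
```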
That's a LOT more expensive than GDDR6, which as far as I can tell runs closer to $10 per gigabyte. At lower capacities the difference doesn't matter much, but the gap gets really large once you reach capacities comparable to AMD and Nvidia's top end workstation GPUs. Here's where it gets interesting, though: by stacking vertically, Apple can get much higher bandwidth than they could with GDDR6, which would communicate over standard DRAM channels. That alone might not justify the added cost, but rolling your own DRAM also enables capabilities that even AMD and Nvidia can't exploit - they could embed computational capability in the eDRAM array. Processing in memory has been talked about for many years as a way to greatly accelerate certain tasks, but it hasn't really been practical because commodity DRAM so thoroughly dominates the market. Perhaps in the limited case of a tightly coupled GPU and VRAM its day might finally come.
If Apple felt the cost was worth it for whatever advantage it provides, they could pick a few eDRAM capacity points from the lowest end MacBook Pro to the highest end Mac Pro and vary the number of eDRAM dies stacked per chiplet/SoC. So figure $100 per chiplet plus $100 per GB of eDRAM, and add $100 per chiplet/SoC/eDRAM stack for testing and packaging to get a rough estimate of Apple's costs. Despite the far higher cost of eDRAM versus GDDR6, it should still stay comfortably under what Apple pays for the low/medium/high end Intel CPU + AMD GPU in the current Mac Pro and MacBook Pro, so the cost works.
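Put together, the rough cost model looks something like this; the capacity points below are made-up examples, and every dollar figure is just the round number from above:

```python
def rough_cost(chiplets: int, edram_gb: int) -> int:
    """$100 per chiplet + $100 per GB of eDRAM + $100 per stack for testing/packaging."""
    return chiplets * 100 + edram_gb * 100 + chiplets * 100

# Hypothetical capacity points, low end MacBook Pro to high end Mac Pro.
for name, chiplets, edram_gb in [("MacBook Pro", 1, 8),
                                 ("Mac Pro (mid)", 2, 24),
                                 ("Mac Pro (high)", 4, 48)]:
    print(f"{name}: ~${rough_cost(chiplets, edram_gb)}")
```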
So I can't find any way to rule this out. It fits as far as cost, Apple's technical ability and willingness to use TSMC's bleeding edge packaging, and Apple's willingness to roll their own solution rather than depend on third party silicon when they can produce something better themselves. Going a more traditional direction with commodity DRAM of some sort is more likely, but if you're willing to do your own CPU and GPU, why accept a commodity solution for DRAM if you have the ability to do much better?