The cache is the first step toward chiplets. A chiplet-based GPU simply needs an on-chiplet cache, because otherwise every memory access goes through the IO die, which means higher latency and higher power use. It's the same thing AMD did with Zen 2 and its large L3. I was initially skeptical about why a GPU would need a very large cache, since the parallelism of GPUs hides memory latency, and the die area could have been used for more cores instead.
On top of that, cache scales very well with process node (in contrast to a wider memory bus), and it at least partially, if not fully, solves the iGPU bandwidth problem. Of course there are trade-offs, and looking just at N21 I can understand that it might not seem worth it, but in the bigger picture there is probably no way around it.
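
To put a rough shape on the bandwidth argument, here is a back-of-the-envelope sketch. It assumes a very simple model in which a fraction `hit_rate` of memory traffic is served by the on-die cache and the rest goes to DRAM. The function name and all numbers are made up for illustration and are not measurements of N21 or any other part.

```python
def sustainable_bandwidth(dram_bw: float, cache_bw: float, hit_rate: float) -> float:
    """Total request bandwidth (same unit as the inputs) the cores can sustain.

    Simplified model: hits are limited by cache bandwidth, misses by DRAM
    bandwidth, and whichever saturates first caps the total.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    # Total traffic T must satisfy T * hit_rate <= cache_bw and T * miss_rate <= dram_bw.
    cache_limit = cache_bw / hit_rate if hit_rate > 0 else float("inf")
    dram_limit = dram_bw / miss_rate if miss_rate > 0 else float("inf")
    return min(cache_limit, dram_limit)


# Hypothetical numbers: 100 GB/s of DRAM behind a 1000 GB/s cache.
for h in (0.0, 0.25, 0.5, 0.75):
    print(f"hit rate {h:.2f}: {sustainable_bandwidth(100.0, 1000.0, h):.0f} GB/s")
```

In this model a 50% hit rate doubles the bandwidth the cores can draw from the same DRAM, which is why a large cache matters most where the memory bus is narrow, as on an iGPU.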