SLC cache = System Level Cache
M1 using an SLC cache? It's gonna wear out eventually. What then? Does the device go into a low performance mode? Or does it stop working altogether?
Memory bandwidth issues may get solved with four channels of DDR5, though that's likely a year or two away.
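For a rough sense of what four channels of DDR5 would mean, here is a back-of-the-envelope sketch of theoretical peak bandwidth. The 64-bit channel width and the DDR5-4800 speed grade are my assumptions, not something stated above, and the function name is just illustrative. Real sustained bandwidth will come in lower than this ceiling.

```python
def peak_bandwidth_gbs(channels, bus_width_bits, mega_transfers_per_s):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8  # bytes moved per transfer, per channel
    bytes_per_second = channels * bytes_per_transfer * mega_transfers_per_s * 1e6
    return bytes_per_second / 1e9


# Assumed configuration: 4 channels, 64 bits each, DDR5-4800 (4800 MT/s).
quad = peak_bandwidth_gbs(4, 64, 4800)
# Same assumed memory in a typical dual-channel desktop setup, for comparison.
dual = peak_bandwidth_gbs(2, 64, 4800)

print(f"quad-channel: {quad:.1f} GB/s, dual-channel: {dual:.1f} GB/s")
```

Under those assumptions the four-channel setup has twice the ceiling of a dual-channel one, since peak bandwidth scales linearly with channel count.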
If Parallels can deliver 70-80% of the performance of native x86 code execution, that could conceivably meet or exceed the highest level of performance delivered by contemporary high-end x86 CPUs.
Apple seems to be content with targeting the content creation market. The x86 gaming market is safe for now. Intel and AMD can breathe easy. The only other company with the remotest chance of challenging Apple in performance could be nVidia with their ARM IP but that could take maybe 5 years or longer.
Something like an L3 cache, but optimized for:
(a) saving power by letting the GPU frequently avoid accessing DRAM
(b) letting all the different accelerators rapidly exchange data with each other and with the CPUs.