All the same, right..?
Not to be rude, but I really do think you are the one missing the point. HSA is something AMD has been investing in and moving towards for seven years now. This "fabric" is not just something recently thought up; it has been in the making for a long time, with a lot of forward-thinking engineering behind it. And a few patents too, I think..
We are starting to see the first iterations of AMD's "infinity fabric" being laid out across its entire portfolio, CPU & GPU, etc. (This is gen 1).
Good reflections w.r.t. relating Infinity Fabric to their longer-term HSA plans. This is indeed the connection and the path towards it.
Infinity Fabric is a far more elegant and wide-ranging solution for multi- and many-core designs. It is an elegant pipe from one place to another. Once a transmission falls out of the big pipe (Infinity Fabric), you can decide how to pipe it locally, more effectively and even proprietarily. As such, Infinity Fabric can be used across AMD's CPU/GPU line, and it also opens things up to third-party vendors. Imagine a future in which AMD exposes Infinity Fabric externally and you have memory/accelerators directly connected to the CPU as if each were another CPU socket, as in 2-socket Naples...
Not to mention, people are ignoring that this design allows for (local) piping. Core-to-core latency within a CCX is ~40ns, whereas core-to-core latency on Intel is ~100ns. Also, since AMD runs Infinity Fabric at DRAM speed, cross-fabric latency drops from 140ns to 110/120ns with higher-speed RAM. So on an 8-core/16-thread Ryzen, you incur ~40ns latency within each 4-core/8-thread CCX, and only hit 110/120ns when going across the fabric interconnect. Not to mention they're likely using less power to do so. A fully meshed design that doesn't use an aggregation pipe is hella power-intensive.
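To make the DRAM-speed relationship concrete, here's a rough back-of-the-envelope sketch (my own assumption, not measured data): model cross-CCX latency as a fixed core-side cost plus some number of fabric cycles, with the fabric clock at half the DDR transfer rate (i.e. MEMCLK). The 65ns base and 80-cycle values are just numbers fit to reproduce the 140ns and 110/120ns figures quoted above.

```python
# Back-of-the-envelope model (assumption, not a benchmark): cross-CCX
# latency = fixed core-side cost + N fabric cycles, where the fabric
# clock is half the DDR transfer rate. BASE_NS and FABRIC_CYCLES are
# chosen to fit the ~140ns / ~110-120ns figures discussed above.

BASE_NS = 65.0       # assumed fixed (non-fabric) portion of the trip
FABRIC_CYCLES = 80   # assumed fabric cycles for a cross-CCX hop

def cross_ccx_latency_ns(ddr_mt_s: float) -> float:
    """Estimate cross-CCX latency for a given DDR transfer rate (MT/s)."""
    fabric_clock_mhz = ddr_mt_s / 2          # fabric clock = MEMCLK
    ns_per_cycle = 1000.0 / fabric_clock_mhz
    return BASE_NS + FABRIC_CYCLES * ns_per_cycle

for speed in (2133, 2666, 3200):
    print(f"DDR4-{speed}: ~{cross_ccx_latency_ns(speed):.0f} ns")
```

With these fitted values, DDR4-2133 comes out around 140ns and DDR4-3200 around 115ns, matching the drop described above.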
I see Intel's "mesh"... as a new universal interconnect: acting as a bus for cross-talk and back-talk, I/O & IOPS, to overcome some known limitations in its current designs, etc.
I am very much looking forward to seeing how this reduces system latencies, and how it offers greater internal bandwidth, etc.
I too look forward to more details coming out on both. I find both approaches very interesting/intriguing. However, I do note, like you, that AMD is playing a longer-term game here and is doing so quite exceptionally, even on present-day performance vs. the mesh.
I have been doing some digging, and it seems few people outside HPC really dig into micro-architectures and link them to real-world performance. It will be interesting to read a detailed writeup on how these unique micro-architectural designs map all the way up to the real-time performance one sees. What people often ignore, or fail to detail in enthusiast benchmarks, is the complexity at runtime w.r.t. all the latencies that factor into the picture and far out-scale 100ns. Thread data-lock latency, thread starvation, context switching, cache misses, kernel calls, user-space to kernel-space switching, etc. are all real-world phenomena when you're running an OS and an application within it... 200 processes with 1000 threads isn't unheard of for an 'idle' system. Those threads and processes don't just magically and instantly begin running on the CPU; there's latency involved when they do (tremendous latency).
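You can see this for yourself with a quick sketch (my own, not a rigorous benchmark): time the round trip of waking another thread and waiting for its reply. That round trip includes scheduler wake-up and context-switch costs and typically lands in the microsecond range, orders of magnitude above a ~100ns interconnect hop. (Python's GIL inflates the numbers further, but the order-of-magnitude point stands.)

```python
# Rough illustration (not a rigorous benchmark): measure the round-trip
# cost of waking another thread. This includes scheduler and
# context-switch overhead, which dwarfs ~100ns interconnect latencies.

import threading
import time

N = 1000
ev_go = threading.Event()    # main -> worker signal
ev_done = threading.Event()  # worker -> main signal

def worker():
    for _ in range(N):
        ev_go.wait()
        ev_go.clear()
        ev_done.set()

t = threading.Thread(target=worker)
t.start()

deltas = []
for _ in range(N):
    start = time.perf_counter_ns()
    ev_go.set()      # wake the worker...
    ev_done.wait()   # ...and wait for it to respond
    ev_done.clear()
    deltas.append(time.perf_counter_ns() - start)
t.join()

deltas.sort()
print(f"median thread wake round-trip: ~{deltas[N // 2]} ns")
```

On a typical desktop this prints a figure in the thousands to tens of thousands of nanoseconds, which is exactly the point: the interconnect is far from the only latency in the stack.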