Lower latency might come from tighter physical integration, but that saves maybe 1-3 nanoseconds at most. Any latency improvement Olympic Ridge sees would more likely come down to the analog and SRAM people, along with the process node, i.e. a cycle or two shaved off the L1/L2/L3 caches if you're lucky.
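Back-of-envelope, since ns and cycles get conflated a lot, here's a quick sketch of what 1-3 ns actually buys you in core cycles (the clock speeds are illustrative assumptions, not Olympic Ridge specs):

```python
# ns-to-cycles conversion: 1 ns at 1 GHz = 1 cycle.
# Frequencies here are illustrative assumptions, not confirmed specs.
for ghz in (4.0, 5.0, 6.0):
    for ns_saved in (1.0, 3.0):
        cycles = ns_saved * ghz
        print(f"{ns_saved:.0f} ns at {ghz:.1f} GHz = {cycles:.0f} cycles")
```

So a physical-integration win is real, but it's small change next to a cross-die memory round trip measured in tens of ns.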
Historically, die-to-die links in the mobile-to-desktop space have been about lowering power and keeping cost down, trading away whatever latency is deemed acceptable. That's different from flagship EPYC or Xeon platforms and GPGPU, where you have money and power to burn and whoever's buying gives less of a shit about the picojoules-per-bit your fabric delivers.
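A minimal sketch of why pJ/bit is the number that matters here (both the pJ/bit figures and the bandwidth are made-up illustrative values, not any real fabric's spec):

```python
# Link power = energy per bit * bit rate.
# All values below are illustrative assumptions, not measurements.
def link_watts(pj_per_bit: float, gbytes_per_sec: float) -> float:
    bits_per_sec = gbytes_per_sec * 8e9  # GB/s -> bits/s
    return pj_per_bit * 1e-12 * bits_per_sec

for pj in (0.5, 2.0):  # hypothetical cheap on-package link vs. beefier SerDes
    print(f"{pj} pJ/bit at 64 GB/s = {link_watts(pj, 64.0):.2f} W")
```

A watt or so of delta is rounding error on a 400 W server socket and a disaster in a 15 W laptop, which is the whole tradeoff in one number.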
Look at the fabrics Intel developed for Atom SoCs and sorta resurrected from the grave for Meteor Lake, Lunar Lake, Arrow Lake, etc., or AMD's Client and Data Center presentations every time GMI gets updated. It's about being cheap, high (enough™) bandwidth, and power-efficient. The answer to the latency question has historically been "Lol, that's up to the uArch and SRAM teams to figure out".
There are no free lunches in semiconductor design anymore. Data movement is routinely the most expensive part of your design in both energy and cost, because bye bye DRAM bandwidth scaling vs. core count, bye bye SRAM and DRAM scaling, and bye bye PHY scaling. If you need throughput it's either moar clocks, moar width, or both, and neither helps keep cost or efficiency down. Just look at Dragon Range and Fire Range idle power for proof of that. That's not to say a lower-latency GMI successor for Zen 6 CAN'T happen, but it's probably not uber high on the list vs. pJ/bit.
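To put the moar clocks / moar width thing in plain numbers, a sketch with made-up lane counts and per-lane rates (none of these are real GMI figures):

```python
# Link bandwidth = lanes * per-lane rate.
# Numbers are illustrative assumptions; the point is the scaling.
def bandwidth_gbs(lanes: int, gbps_per_lane: float) -> float:
    return lanes * gbps_per_lane / 8  # Gb/s per lane -> GB/s total

base   = bandwidth_gbs(32, 16.0)  # baseline: 64 GB/s
wider  = bandwidth_gbs(64, 16.0)  # moar width: 2x lanes = 2x beachfront and package cost
faster = bandwidth_gbs(32, 32.0)  # moar clocks: 2x rate, and pJ/bit tends to climb with signaling rate
print(base, wider, faster)        # 64.0 128.0 128.0
```

Either path doubles bandwidth, but one eats die edge and packaging cost while the other eats pJ/bit, which is exactly why a latency-first redesign has to fight for priority.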