Good catch, I forgot it was in slices, duh. Snooping another CCX's L3 through the IOC will be necessary for Rome's cache coherency protocol, so I'd expect the same in Matisse.

Technically it's not 16MB per CCX but 4MB per core. Every core has write access to its own 4MB L3$, and every core has read access to all the L3$ on the whole chip. Obviously, reading the 12MB of L3$ belonging to the other three cores in the same CCX is faster than reading the L3$ of cores on other CCXs.
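To make the slice arithmetic concrete, here's a tiny sketch. It assumes 4 cores per CCX and one 4MB slice per core, as described above; the function and key names are made up for illustration.

```python
# Assumed layout from the post: 4 cores per CCX, one 4 MB L3 slice per core.
CORES_PER_CCX = 4
L3_SLICE_MB = 4

def l3_view(cores_per_ccx=CORES_PER_CCX, slice_mb=L3_SLICE_MB):
    """Return the L3 capacity as seen from a single core, in MB."""
    own = slice_mb                          # the core's own slice
    ccx_total = cores_per_ccx * slice_mb    # all slices in the CCX
    neighbours = ccx_total - own            # slices of the other cores in the CCX
    return {"own_mb": own, "same_ccx_other_mb": neighbours, "ccx_total_mb": ccx_total}

print(l3_view())
# {'own_mb': 4, 'same_ccx_other_mb': 12, 'ccx_total_mb': 16}
```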
I believe that even for the two CCXs on the same chiplet, access between them takes a round trip through the IOC. But that will have to be tested once the chips are publicly available.
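For whoever ends up testing this, here's a rough sketch of how to pick which core pairs to compare. It assumes 4 cores per CCX, 2 CCXs per chiplet, and that core IDs are numbered linearly by CCX, which the OS may not actually do (SMT siblings and firmware numbering can differ). All names are hypothetical, and it only picks the pairs; it doesn't measure anything.

```python
# Assumptions (not verified on real hardware): 4 cores per CCX,
# 2 CCXs per chiplet, core IDs numbered linearly CCX by CCX.
CORES_PER_CCX = 4
CCX_PER_CHIPLET = 2

def locate(core):
    """Map a core ID to (chiplet index, global CCX index)."""
    ccx = core // CORES_PER_CCX
    chiplet = ccx // CCX_PER_CHIPLET
    return chiplet, ccx

def pair_kind(a, b):
    """Classify how two cores relate in the assumed topology."""
    if a == b:
        return "same core"
    chiplet_a, ccx_a = locate(a)
    chiplet_b, ccx_b = locate(b)
    if ccx_a == ccx_b:
        return "same CCX"
    if chiplet_a == chiplet_b:
        return "same chiplet, different CCX"
    return "different chiplet"

def pairs_to_test(n_cores):
    """Pick one representative (core 0, core b) pair per kind."""
    picked = {}
    for b in range(1, n_cores):
        picked.setdefault(pair_kind(0, b), (0, b))
    return picked

print(pairs_to_test(16))
# {'same CCX': (0, 1), 'same chiplet, different CCX': (0, 4), 'different chiplet': (0, 8)}
```

If the round-trip guess is right, a core-to-core latency test on the "same chiplet, different CCX" pair should come out about the same as on the "different chiplet" pair, and both clearly slower than "same CCX".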