@int64 , @tamz-msc
How do you people come to this conclusion? You have a SOC that has near 100ms longer memory latency compared to skylake, over 200ms to memory is atrocious (ddr4 2400!) . 2 core Cannon lake doesn't have anything special on the cache front ( exact same config as skylake) yet clock for clock has the same performance as skylake. So unless your assertion is the performance impact of near 100ms of extra access latency is 0 then cannonlake core increased IPC and offset the loss of memory system performance.
Note the “ms” is milliseconds, which you should not have as a latency figure unless you are talking about spinning rust random access latency. DRAM latencies are in nanoseconds. In the nanosecond range, even driving a signal through a wire on the chip can be significant. I believe the Pentium4 actually had two pipeline stages just to drive signals long distance across the chip. This is also why I was surprised that the 4 core cluster went away so soon. I guess they can get low enough latency to due a monolithic cache with 8 cores at 7nm+. I am not sure what the enabler is for that. Perhaps wire length is reduced significantly due to the actual area of the cache being much smaller for it’s size vs. 14 nm.