I was thinking about that and wondered whether AVX-512 could be one of the reasons for the increased L2. Also, if the interconnect changed and L3 latency went up, then having more data closer in L2 would be beneficial. If we work under the assumption that the world's best server processor maker is competent, we should be trying to understand why Intel made the choices that they did.
Well, that explains the change in associativity of Skylake. 512KB = 4-way, 1MB = 8-way. I think some people over on RWT were speculating about this a while back.
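To spell out why the size and associativity would move together, here is a rough sketch using the figures quoted above. It assumes a 64-byte cache line, and the helper name `cache_sets` is made up for illustration. The number of sets is just size / (ways × line size), so doubling both the capacity and the number of ways leaves the set count, and therefore the index bits, unchanged.

```python
def cache_sets(size_bytes: int, ways: int, line_bytes: int = 64) -> int:
    """Return the number of sets in a set-associative cache.

    line_bytes=64 is an assumption (the usual x86 cache line size).
    """
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("size must be a multiple of ways * line_bytes")
    return sets


KIB = 1024

# Figures as quoted in the post above, not verified against Intel documentation.
configs = {
    "512KB, 4-way": (512 * KIB, 4),
    "1MB, 8-way": (1024 * KIB, 8),
}

for name, (size, ways) in configs.items():
    print(f"{name}: {cache_sets(size, ways)} sets")
```

Both configurations come out to the same number of sets, which is consistent with the idea that the extra capacity was added as extra ways rather than by changing the indexing.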
Nice upgrade, looking forward to seeing how much difference it makes. Should help a lot with workloads that use those AVX-512 vectors.
Nehalem's L2 cache does get a bit faster, but the speed doesn't make up for the lack of size. I suspect that Intel will address the L2 size issue with the 32nm shrink, but until then most applications will have to deal with a significantly reduced L2 cache size per core. The performance impact is mitigated by two things: 1) the fast L3 cache, and 2) the very fast on-die memory controller. Fortunately for Nehalem, most applications can't fit entirely within cache, so even the large 6MB and 12MB L2 caches of its predecessors can't completely contain everything, giving Nehalem's L3 cache and memory controller time to level the playing field.
The end result, as you'll soon see, is that in some cases Nehalem's architecture manages to take two steps forward and two steps back, resulting in zero net improvement over Penryn. The perfect example is 3D gaming, as you can see below:
EDIT: Random thought, has there been any information about the Skylake Xeon D? I wonder if that will use SKL or SKX cores?
I don't think there will be one. Probably wait for Cannonlake (Mainstream?) instead. It may be the first product released on 10 nm.