veri745
I was curious about that at one point, though I kind of lost interest. One thing everyone knows is that Intel's L3 is 'inclusive' and AMD's L3 is 'exclusive'. 'Inclusive' means the L3 holds a duplicate copy of whatever is in each core's L2. So when a core needs data that was just used by a different core, it can get it from L3 without going out to system memory.
The downside is that the duplicated data takes up extra die space and likely costs more power, but since Intel's L3 is 8MB and AMD's L3 is 6MB, and Intel has implemented sophisticated power planes (including Turbo) for Nehalem, I think it's clearly a win for Intel in this case.
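To put rough numbers on the inclusive vs. exclusive trade-off, here is a quick back-of-the-envelope sketch. The L2 sizes (256KB per core on a quad-core i7, 512KB per core on a quad-core Phenom II) are my assumption and not from the post above, and the model is idealized: it treats AMD's L3 as perfectly exclusive and ignores L1 entirely.

```python
# Rough unique-capacity comparison. The L2 sizes and core counts are
# assumed figures for a quad-core i7 (Nehalem) and a quad-core Phenom II.
KB = 1024
MB = 1024 * KB

def unique_capacity(l2_per_core, cores, l3, inclusive):
    """Bytes of distinct data the L2 + L3 levels can hold at once (idealized)."""
    l2_total = l2_per_core * cores
    # Inclusive: every L2 line is duplicated in L3, so L2 adds nothing unique.
    # Exclusive: L2 and L3 hold different lines, so the capacities add up.
    return l3 if inclusive else l3 + l2_total

i7 = unique_capacity(256 * KB, 4, 8 * MB, inclusive=True)
phenom = unique_capacity(512 * KB, 4, 6 * MB, inclusive=False)

print(f"i7 unique L2+L3 capacity:        {i7 // MB} MB")
print(f"Phenom II unique L2+L3 capacity: {phenom // MB} MB")
```

Under those assumptions the two end up holding about the same amount of distinct data, so the 8MB vs. 6MB headline number says less about capacity than it first appears to. It says nothing about latency or power.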
Something more complicated, to me, is the associativity. Phenom II's L3 is 48-way set associative, and i7's L3 is 16-way. I read a little about it, but that was long ago, so I don't remember how they differ in design philosophy and in performance. I may look it up when I get a chance.
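For what the "N-way" figure means in practice: a cache is split into sets, each set has N slots (ways), and an address can only live in the one set it maps to. A small sketch of the arithmetic, assuming 64-byte cache lines and plain modulo indexing (real L3 designs may hash the index differently, so treat `set_index` as illustrative only):

```python
MB = 1024 * 1024
LINE_BYTES = 64  # assumed cache line size

def num_sets(cache_bytes, ways, line_bytes=LINE_BYTES):
    """Number of sets = total lines / ways per set."""
    return cache_bytes // (ways * line_bytes)

def set_index(addr, sets, line_bytes=LINE_BYTES):
    """Which set an address maps to, using simple modulo indexing."""
    return (addr // line_bytes) % sets

phenom_sets = num_sets(6 * MB, ways=48)  # Phenom II L3: 6MB, 48-way
i7_sets = num_sets(8 * MB, ways=16)      # i7 L3: 8MB, 16-way

print("Phenom II L3 sets:", phenom_sets)
print("i7 L3 sets:", i7_sets)
```

So the Phenom II design has fewer sets with many slots each, and the i7 design has more sets with fewer slots each.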
Increased associativity will decrease cache misses (conflict misses specifically), since it gives each memory address more places (ways) within its cache set where it can reside.
On the flip side, it will probably increase latency, since more tags have to be compared on every lookup, although I'm not sure by how much.
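As with everything else, which one is better depends on your load. Greater associativity will be better with more random memory access, where many addresses end up competing for the same sets, and less will be better with more serial memory access, where there are few conflicts and the lower latency counts for more.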
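The miss-rate side of this can be poked at with a toy simulator. This is only a sketch under my own assumptions: a tiny 64-line cache, LRU replacement, modulo set indexing, and two made-up access patterns. It counts misses only and does not model the latency cost of higher associativity at all.

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Toy set-associative cache with LRU replacement (miss counting only)."""

    def __init__(self, num_lines, ways, line_bytes=64):
        assert num_lines % ways == 0
        self.ways = ways
        self.line_bytes = line_bytes
        self.num_sets = num_lines // ways
        # One OrderedDict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.misses = 0

    def access(self, addr):
        line = addr // self.line_bytes
        ways_in_set = self.sets[line % self.num_sets]
        if line in ways_in_set:
            ways_in_set.move_to_end(line)      # hit: mark most recently used
            return
        self.misses += 1
        if len(ways_in_set) >= self.ways:
            ways_in_set.popitem(last=False)    # evict least recently used
        ways_in_set[line] = True

def count_misses(ways, addresses, num_lines=64):
    cache = SetAssociativeCache(num_lines, ways)
    for addr in addresses:
        cache.access(addr)
    return cache.misses

# Pattern A: four lines spaced one full cache size apart, touched repeatedly.
# With modulo indexing they all land in the same set.
stride = 64 * 64  # num_lines * line_bytes
conflict = [i * stride for i in range(4)] * 100

# Pattern B: one sequential pass over 256 consecutive lines.
sequential = [i * 64 for i in range(256)]

for ways in (1, 4):
    print(f"{ways}-way: conflict misses = {count_misses(ways, conflict)}, "
          f"sequential misses = {count_misses(ways, sequential)}")
```

In this toy setup the conflicting pattern misses on every access when direct-mapped and only on the first touch of each line when 4-way, while the sequential pass misses the same number of times either way. That lines up with the idea that extra associativity pays off when addresses fight over the same sets and buys nothing for a straight streaming pass.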