Originally posted by: zephyrprime
All the info in x-bit labs is pretty speculative at this point. All AMD has really told us is that the K8L has twice the FPU resources, an instruction prefetch doubled in size, and improved IPC. There's just no telling what all this will mean in specific terms. K8L will certainly be faster than the K8 but by how much is unknown.
However, x-bit gives us some pretty in-depth speculation on the performance provided by some elements of the K8L. I'll try to condense what they have to say:
1. With a 32B instruction fetch (instead of 16B like in the K8/P4/C2D), being starved for instructions will be less likely. This is useful since SSE instructions are big and the SSE issue rate will be a lot higher with the K8L since it has real 128bit SSE execution instead of the half-assed 64bit/64bit SSE execution we now suffer with in the K8 (and P4). Decoding SSE into real 128bit instructions on the K8L is easier than decoding an SSE instruction into 2x64bit instructions as is done in the K8.
Also, 64bit execution should be sped up by the 32B prefetch because 64bit instructions are bigger than their 32bit counterparts.
(zephyrprime opinion: However, we don't know if the K8L has more instruction decoders so the picture of K8L instruction decoding power is very incomplete. I think x-bit is saying that it doesn't, but is that just a guess or is it a fact?)
2. Branch prediction is improved but specifics are unknown.
3. There will be some sort of read reordering but specifics are not known.
4. The K8L will retain separate instruction pools for int and fp code.
5. The K8L will have 2x128bit connections from SSE to L1. (z.p.: vs 2x64bit in the K8 and 1x128bit in the Conroe).
6. The L1 & L2 caches sound unchanged. The L3 is new (obviously). The crossbar is enhanced. Doesn't sound like the K8L will have fancy prefetching like the Conroe.
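To put rough numbers on points 1 and 5, here is a quick back-of-the-envelope sketch in Python. The fetch widths and load path widths come from the list above; the average instruction sizes are values I made up purely for illustration (they are not from x-bit or AMD), and the function names are just mine.

```python
# Back-of-the-envelope numbers only. The average instruction lengths below
# are illustrative assumptions, not measurements from x-bit labs or AMD.

def instructions_per_fetch(fetch_bytes, avg_instr_bytes):
    """Average number of instructions delivered by one fetch block."""
    return fetch_bytes / avg_instr_bytes

def l1_load_bytes_per_cycle(ports, port_bits):
    """Peak bytes per cycle from L1, given the number and width of load paths."""
    return ports * port_bits // 8

# Hypothetical average instruction sizes in bytes (assumed for illustration).
AVG_INSTR_BYTES = {"32bit int": 3.0, "64bit int": 4.0, "SSE": 5.0}

for kind, size in AVG_INSTR_BYTES.items():
    narrow = instructions_per_fetch(16, size)  # 16B fetch (K8)
    wide = instructions_per_fetch(32, size)    # 32B fetch (K8L)
    print(f"{kind}: 16B fetch ~{narrow:.1f} instr, 32B fetch ~{wide:.1f} instr")

# L1 -> SSE load paths as listed in point 5: (number of paths, bits per path).
LOAD_PATHS = {"K8": (2, 64), "Conroe": (1, 128), "K8L": (2, 128)}

for cpu, (ports, bits) in LOAD_PATHS.items():
    print(f"{cpu}: {l1_load_bytes_per_cycle(ports, bits)} bytes/cycle peak")
```

The point is only that the bigger the average instruction, the fewer of them fit in a 16B fetch, so the 32B fetch matters most for SSE and 64bit code. On the load side, the K8 and the Conroe come out at the same peak bytes per cycle and the K8L at double that. These are peak figures and say nothing about what the decoders or the rest of the core can actually sustain.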
I could have made some mistakes. It's tough to read what those x-bit guys wrote.
I think it's highly informative to look at pictures of the K8L die and the K8 die.
http://www.techwarelabs.com/reviews/processors/amd4000_fx55/die_marked_E.jpg
http://www.xbitlabs.com/images/cpu/amd-k8l/image001s.png
From the pictures, you can see that the layout of the individual cores is basically the same. The most obvious difference is the smaller L2. Also, the additional SSE unit can easily be seen and looks extremely similar to the current FP unit. The L1 caches look basically the same. The instruction decoder looks really different. The load/store unit looks like it has significant changes. The int unit looks little changed. The bus unit looks like it has some changes.
Since the size of the int unit is approximately the same, I would guess that there are no new int execution resources. The decoder doesn't look any bigger either so I guess there will be no additional instruction decoders.
In SSE&FP code, I think the K8L will beat the Conroe. In int code, I'm guessing Conroe will win unless there is some fancy prefetching in the bus unit which doesn't seem to be the case at this point in time. It looks like the K8L will have the same number of pipeline stages as the K8 so I would speculate that the Conroe will clock higher than the K8L. In 64bit code, the K8L should have a big edge with its 32B fetcher.