I might be a little off here, but I was under the impression that AMD's engineers were working with slightly better branch prediction.
Not that much attention is paid to such things, and you can't really attribute a CHUNK of extra "speed" to good branch prediction, but on a design like...