Ace's Compiler Analysis article takes a look at the next generation of compilers that will be in use shortly, MSVC7 and Intel 5.0.1.
The analysis is limited to FP performance only, in flops (as opposed to real-world apps); however, the results are very interesting.
MSVC7 tends not to help the P4 or P3 much at all, while giving a dramatic boost to the Athlon versus older versions of the MSVC compiler (the most popular compiler in use today).
OTOH, the Intel 5.0.1 compiler, when able to vectorize the flops code (unlike the previous 5.0, which could not), delivered a 204% performance increase -- definitely startling. This brought it from significantly slower than the Athlon to moderately faster.
...
What does all this mean in the real world? Most apps are blends of integer/FP/branching/memory-constrained code, which means performance is gated by the weakest link. On the Athlon, that's the memory and cache system, while for the P4, it is branching/integer performance. So the superiority or deficit of one CPU over the other is a far more complex question than a single FP benchmark can answer.
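To put a rough number on the "weakest link" point, here is a minimal back-of-the-envelope sketch in Python using the standard Amdahl's-law formula. The 30% FP share is purely a made-up figure for illustration (the article gives no such breakdown), and the function name `overall_speedup` is my own; the 3.04x factor is just one reading of "204% increase".

```python
def overall_speedup(fp_fraction, fp_speedup):
    """Amdahl's-law estimate of whole-app speedup.

    fp_fraction: share of run time spent in FP code (hypothetical input).
    fp_speedup:  how many times faster the FP part becomes.
    """
    if not 0.0 <= fp_fraction <= 1.0:
        raise ValueError("fp_fraction must be between 0 and 1")
    if fp_speedup <= 0.0:
        raise ValueError("fp_speedup must be positive")
    # The non-FP part is unchanged; only the FP part shrinks.
    new_time = (1.0 - fp_fraction) + fp_fraction / fp_speedup
    return 1.0 / new_time


# A 204% increase is read here as 3.04x the original FP throughput.
# Assumption: an app that spends 30% of its time in vectorizable FP code.
print(round(overall_speedup(0.30, 3.04), 3))
```

Under those assumed numbers the whole app gets only about 1.25x faster, nowhere near the 3x seen in flops, which is why the FP-only result says little about mixed workloads.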
Still, an interesting article.