One has to wonder why AMD included SSE2 support if their implementation performs as badly as this article claims. I suppose that in the future, lack of SSE2 might be a liability, but it seems a very slow implementation is just as much of a liability.
Given that, among the C/C++ compilers available today, the Intel compiler appears to generate the fastest code for Opteron, it must be tempting for Intel to tweak the compiler to lean on SSE2 even more wherever possible.
It's also interesting that even with SSE2 turned off, the AthlonXP 2600+ handily beats the Opteron 242 using TMPGenc.
