Hi -- Rand suggested I ask for a bench on either an XP or MP system, to get a sense of the capability of the Palomino core. I've reprised my previous post below, and added a second link to the zip file, in case one ISP goes down.
=====================
Howdy -- Is there anyone out there who might run this
SSE-enabled benchmark on an Athlon XP system? This
benchmark was discussed on the Ace's hardware tech forum
back in the spring; there is now a slight update on the SSE
optimizations. I also have a statically-bound Linux x86
version, if anyone is interested. If there is a worry about
viruses, I will gladly e-mail from my gov address, which
should provide some traceability.
Download from either:
http://www.wizard.com/~hwstock/bench/3d0/3d0.zip
or
http://home.earthlink.net/~stockman3/bench/3d0/3d0.zip
Unzip to obtain the 3 files. Open a console (ms-dos
box) into the directory with the 3 files. Execute
the batch file now.bat from the command line of
the console. Return the lb_data.txt file to me (and
return the bmp files if you so choose, so I can verify
that the test ran correctly).
The full test may take a few minutes. If you don't want to
tie up the machine for that long, edit the batch file
now.bat, changing the switch -s1000 to -s200
(i.e., run 200 steps instead of 1000).
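If you would rather not edit the batch file by hand, the switch change can be scripted. This is only a rough sketch in Python, assuming the switch appears in now.bat exactly as -s1000; the function names and the sample command line (bench.exe) are made up for illustration and are not the actual contents of the batch file.

```python
import re
from pathlib import Path


def shorten_run(batch_text: str, old_steps: int = 1000, new_steps: int = 200) -> str:
    """Return batch_text with the -s<old_steps> switch changed to -s<new_steps>."""
    # Match the switch only as a whole token, so -s10000 is left untouched.
    pattern = rf"(?<!\S)-s{old_steps}(?!\d)"
    return re.sub(pattern, f"-s{new_steps}", batch_text)


def patch_batch_file(path: str = "now.bat") -> None:
    """Rewrite the batch file in place with the shorter run length."""
    p = Path(path)
    p.write_text(shorten_run(p.read_text()))


if __name__ == "__main__":
    # Hypothetical command line; the real now.bat may look different.
    print(shorten_run("bench.exe -s1000"))
```

Calling patch_batch_file() from the directory with the 3 files would apply the change to now.bat itself; keep a copy of the original if you want to rerun the full 1000 steps later.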
The fastest previous Athlon results were: 2.135 MUPs for a 1.2 GHz Tbird with 2-2-2 SDRAM.
The fastest previous P4 results were: 5.723 MUPs for a 1.4 GHz P4 with 1 GB RDRAM.
Higher MUPs (millions of updates per second) are better.
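For anyone who wants to sanity-check the numbers, here is a small Python sketch of how a MUPs figure could be worked out. It assumes an "update" means one lattice site advanced by one time step; the benchmark's own accounting may differ, and the function names and the sites/steps/seconds parameters are mine, not taken from the code.

```python
def mups(sites: int, steps: int, seconds: float) -> float:
    """Millions of site updates per second (assumed definition)."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return sites * steps / seconds / 1e6


def speedup(new_mups: float, old_mups: float) -> float:
    """Ratio of two MUPs figures; above 1 means the first is faster."""
    return new_mups / old_mups


if __name__ == "__main__":
    # The two figures quoted above: 1.4 GHz P4 vs. 1.2 GHz Tbird.
    print(f"P4 / Athlon ratio: {speedup(5.723, 2.135):.2f}")
```

On the two results quoted above, the P4 comes out roughly 2.7 times faster than the Tbird.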
However, the Athlon exe used hand-coded 3DNow under MS VC++, whereas the P4 used SSE intrinsics under Intel C++ 5. The Intel compiler generally produces superior code. In principle, the SSE-enabled executable should run just fine on an Athlon XP (it is compiled for PIII instructions, and even runs on my Celeron CuMine), and the memory interface on the Athlon XP (with DDR) is supposed to be a significant improvement.
I won't try to mislead you -- I'm not expecting the Athlon XP to be a knock-out performer for this code, because I don't think DDR is yet on a par with RDRAM, and the code is memory-intensive. However, I have two Athlon systems (and one P4, a Celeron, and two PIIIs), and I would love to be pleasantly surprised.
If you want to know more about this code, visit these web sites for background info:
http://www.sandia.gov/eesector/gs/gc/hws/saltfing.htm
http://www.sandia.gov/eesector/gs/gc/hws/3d.htm
If more info is needed, I can e-mail (or post links to) two pdfs from peer-reviewed scientific journals, one of which gives details of the code architecture and optimization strategy.
Thanks much -- I'm under pressure to provide the most up-to-date assessment of the Athlon XP performance, but there are few XPs "in the wild" as yet.
=====================