- Sep 11, 2007
I have been tasked with evaluating hardware platforms at work for an extremely demanding, FP-intensive algorithm. Its nature is confidential, but the gist is that a large Lagrangian matrix has to be solved for an optimal solution using double-precision floating-point calculations. A solution typically takes almost 100 million iterations and runs for about 4 minutes on a 1.15 GHz Alpha 21264. We have evaluated Itanium 2s at 1.6 GHz and have seen only a modest improvement (10-20%).

The problem I am having is that all the FP benchmarks I can find are multithreaded. Because my application needs the output of the first iteration before the second can run, I am stuck using only one core. Ideally, I would just port the application to Power6 and x86-64 and run a benchmark. However, that port is extremely expensive, and I need a high level of confidence before taking on that expense.

My guess is that Barcelona's FPU would offer the greatest performance for the money invested. However, a port to prove that cannot be justified unless I have a direct comparison of single-threaded performance between Alpha 21264, Itanium 2, Xeon, Opteron and Power5/6. Below are a few more details about the specific problem. Please respond if you have any suggestions or need more details.
FORTRAN algorithm
Double float precision
Very serial
Large matrix manipulation
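To make the dependency pattern concrete without revealing the real algorithm, here is a minimal sketch in C. C is used only because it builds easily on all of the platforms listed; the production code is FORTRAN. The names `step` and `serial_iterate` and the update rule x = A*x + b are hypothetical stand-ins that I made up for illustration, not the confidential computation. The point is only that each iteration consumes the previous iteration's output, so the outer loop cannot be spread across cores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One dependent step: y = A*x + b.
 * A is an n x n matrix stored row-major; all values are double precision.
 * This update rule is a hypothetical stand-in for the real computation. */
static void step(const double *a, const double *b,
                 const double *x, double *y, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        double sum = b[i];
        for (size_t j = 0; j < n; j++)
            sum += a[i * n + j] * x[j];
        y[i] = sum;
    }
}

/* Runs `iters` steps in sequence, leaving the final vector in x.
 * Step k+1 reads the output of step k, so the outer loop is strictly serial.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int serial_iterate(const double *a, const double *b,
                   double *x, size_t n, long iters)
{
    assert(n > 0 && iters >= 0);

    double *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    for (long k = 0; k < iters; k++) {
        step(a, b, x, tmp, n);          /* needs the previous x */
        memcpy(x, tmp, n * sizeof *x);  /* result feeds the next step */
    }

    free(tmp);
    return 0;
}
```

Timing a call to `serial_iterate` on one core of each candidate machine, with comparable compiler options, might give a rough first comparison. A stand-in this simple will not reproduce the real code's memory and cache behavior, so I would treat any numbers from it as indicative at best.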
