Originally posted by: Accord99
Originally posted by: Matthias99
There is no packing and unpacking in FP calculations.
There is when they're moved back and forth from memory or cache, or if they have to be stored in 32-bit registers. The FPU and its handful of registers may be 64-bit, but the rest of the system is not.
The memory bus on modern platforms is 64-bit or 128-bit, and the P4 has a 256-bit bus to the L2 cache and L2 cache lines that are 128 bytes in size. x87 uses 80-bit registers, and the P4 has a huge number of 128-bit rename registers for FP/SSE2. The main distinction between 32-bit and 64-bit CPUs is that a 64-bit CPU has integer registers that are 64 bits wide; it has nothing to do with FP at all.
Okay, okay, so I guess caching floats doesn't hurt performance as much as I thought it did. 🙂
I don't really want to get into a pissing contest over this. As best as I can find, the P4 has 8 32-bit GPRs, 8 80-bit x87 registers, 8 64-bit MMX registers, and 8 128-bit SSE1/2 registers. I'm not sure whether or not 8 is in the "huge" range 😛. I also thought that the SSE registers couldn't be used directly for non-SSE floating-point operations (something like you couldn't have it be both source and destination for an FP op, but maybe that's only while you're using MMX or SSE1/2). If I'm wrong here, or you have better info, please point me towards a technical document or something, as I'm somehow not having much luck finding anything useful with Google right now.
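For what it's worth, the claim that "64-bit" describes integer/pointer width rather than floating-point width can be checked from software. Here is a minimal sketch in Python, assuming a mainstream IEEE 754 platform; the names (`FLOAT_BYTES`, `round_trip_double`, and so on) are made up for illustration, and it says nothing about how the hardware moves values between registers and cache.

```python
import struct
import sys

# Storage size in bytes of single and double precision values.
FLOAT_BYTES = struct.calcsize("f")
DOUBLE_BYTES = struct.calcsize("d")

# Native pointer size: 4 bytes on a 32-bit build, 8 on a 64-bit build.
POINTER_BYTES = struct.calcsize("P")

# Python's float is a C double; mant_dig is its significand width in bits.
MANTISSA_BITS = sys.float_info.mant_dig


def round_trip_double(x):
    """Store x as a double in memory and load it back."""
    return struct.unpack("d", struct.pack("d", x))[0]


def round_trip_single(x):
    """Store x as a single and load it back (rounds to single precision)."""
    return struct.unpack("f", struct.pack("f", x))[0]


if __name__ == "__main__":
    print("float bytes:   ", FLOAT_BYTES)
    print("double bytes:  ", DOUBLE_BYTES)
    print("pointer bytes: ", POINTER_BYTES)
    print("double survives a store/load:", round_trip_double(0.1) == 0.1)
    print("single survives a store/load:", round_trip_single(0.1) == 0.1)
```

The double size should come out the same on a 32-bit and a 64-bit build; only the pointer size changes. The single-precision round trip fails for 0.1 simply because the value is rounded to fewer bits, not because of any packing step.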
Opteron extended registers:
http://www.tomshardware.com/cpu/20030422/opteron-06.html#a_big_deal_opterons_64bit_registers
Will it translate to better FPU performance? Double the GPRs, with each able to hold a 64-bit float if necessary -- along with twice as many SSE1/2 registers. I would think that performance with double precision floats in 64-bit mode would be significantly better, although by how much I couldn't tell you without benchmarks and/or more info on the rest of the floating point architecture.
Maybe, but the current 64-bit client of SETI is slower than the 32-bit.
http://www.amdzone.com/articleview.cfm?articleid=1315&page=2
This is a problem with the 64-bit SETI client, not 64-bit computing! Read the article... it sounds like they didn't really do a full "port" of the application, just a recompilation for the new processor.
Or that SETI, like many applications, has no need for 64-bit computing.
Considering that SETI is pretty much nothing but FFTs and other forms of analog signal processing, I would think it would have a *lot* to gain from a 64-bit architecture if it significantly improves FPU performance. But unless you have the source code handy, it's really just speculation at this point.
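To make the "nothing but FFTs" point a bit more concrete, here is a naive DFT sketch in Python. It is not SETI's code (I don't have the source), and a real client would use an optimized FFT rather than this O(n^2) loop; `dft`, `N`, and `TONE_BIN` are just names picked for the example. The point is only that the inner loop is floating-point multiply-adds on every sample.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a list of real samples."""
    n = len(samples)
    spectrum = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(samples):
            # One complex multiply-add per sample: all floating-point work.
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        spectrum.append(acc)
    return spectrum


# Hypothetical test signal: a pure cosine sitting exactly on bin TONE_BIN.
N = 16
TONE_BIN = 3
signal = [math.cos(2 * math.pi * TONE_BIN * t / N) for t in range(N)]

spectrum = dft(signal)

# Strongest bin in the lower half of the spectrum.
peak_bin = max(range(N // 2), key=lambda k: abs(spectrum[k]))

if __name__ == "__main__":
    print("peak bin:", peak_bin)
    print("peak magnitude:", abs(spectrum[peak_bin]))
```

Nothing in that loop needs 64-bit integers, so any gain from a 64-bit build would have to come from the extra registers or a better-tuned compiler, which is exactly the part that needs benchmarks.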
Will 64-bit desktop computing pan out? I have no idea -- and neither does anyone else, really. If it significantly improves performance in real-world applications and the specialized fields that can use it (and we'll have to wait for 64-bit Windows next year to tell) at a minimal cost, then it will. If it does nothing but cost more, then it probably won't. 😛 We'll have to see what happens when Intel gets Prescott rolling -- a (hopefully) mature, super-fast 32-bit processor versus an unconventional, untested 64-bit newcomer. It ought to be fun to watch. 🙂