Originally posted by: Starglider
'SoothingRelease', your claims are not even remotely plausible. Here is a rough analysis of the theoretical capabilities of the two processors.
Both processors achieve the bulk of their floating point performance through vector operations on 128-bit registers. Normally this means processing four 32-bit floats at once; double the vector GFLOPS numbers for the (rare) case of 16-bit floats, halve them for the (somewhat more common) case of double-precision floats. The AltiVec instruction set includes 'fused multiply-add', which is probably implemented by the vector units of all the processing cores in question. This doubles the theoretical peak GFLOPS, for the unrealistic case that every instruction is a multiply-add. In theory each processor will do at least one instruction per cycle if the decoders are kept fed, but in practice the PowerPC architecture has multi-cycle latencies that delay serial execution of instructions that depend on each other. These are 6 cycles for FPU operations and 8 cycles for vector FP operations on the G5, and are probably similar on the two console CPUs (maybe less for the PS3's SPUs as they have a simpler design). Finally, the CPUs in both consoles run at a clock speed of 3.2 GHz.
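To make the rule of thumb above concrete, here is a minimal Python sketch of the peak calculation. The function names and parameters are my own for illustration, and the dependent-chain helper assumes a single thread with nothing else available to issue while it waits on each result.

```python
def peak_gflops(clock_ghz, instr_per_cycle, register_bits=128,
                element_bits=32, fused_multiply_add=False):
    """Theoretical peak GFLOPS for vector instructions issued at a given rate."""
    lanes = register_bits // element_bits          # 4 lanes for 32-bit floats
    flops_per_lane = 2 if fused_multiply_add else 1  # multiply-add counts as 2
    return clock_ghz * instr_per_cycle * lanes * flops_per_lane


def dependent_chain_ipc(latency_cycles):
    """IPC when every instruction must wait for the previous result
    (assumption: no other independent work fills the gaps)."""
    return 1.0 / latency_cycles


# One vector instruction per cycle at 3.2 GHz on 32-bit floats.
baseline = peak_gflops(3.2, 1)
# Same issue rate, but every instruction is a fused multiply-add.
with_fma = peak_gflops(3.2, 1, fused_multiply_add=True)
# A fully serial chain of vector ops with the G5's 8-cycle latency.
serial_ipc = dependent_chain_ipc(8)
```

Under these assumptions the baseline is 12.8 GFLOPS per instruction issued per cycle, which is the building block for the per-console figures.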
The Xbox 360 has 'one VMX unit per core', which if it resembles the G5 means one fully functional vector execution unit and one multiply/add unit, plus two floating point units. The G5 has a theoretical issue width of eight, so the Xbox 360 is unlikely to be decode limited. The theoretical streaming compute power (for single-precision values) is 2 x 3.2 x 4 = 25.6 GFLOPS for the vector units and 2 x 3.2 = 6.4 GFLOPS for the floating point units in each core, or 32 GFLOPS per core. Across the three cores that gives a theoretical upper bound of 96 GFLOPS for the Xbox 360 (172.8 if the vector units run only fused multiply-adds or 16-bit data), less than 10% of the 'well over 1 Tflop' that 'SoothingRelease' claimed. In practice interleaving vector and scalar FPU code is hard, and IPC will be dragged down by branches and serial dependencies. The Xbox 360's single-threaded IPC will probably be comparable to an Xserve G5 running scientific applications; perhaps a little over 1. Assuming that the SMT is quite a bit more efficient than Hyper-Threading in the P4, a sustained IPC of 2 per core might not be unreasonable, giving a total system performance of about 75 GFLOPS for six threads of vector-heavy code.
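For anyone who wants to check the Xbox 360 arithmetic, a rough Python sketch follows. The core count of three is my assumption here (it is what makes the per-core figures add up to 96), and the function names are purely illustrative.

```python
CLOCK_GHZ = 3.2
FLOATS_PER_VECTOR = 4   # 128-bit register / 32-bit floats
XBOX_CORES = 3          # assumption: implied by 96 GFLOPS / 32 GFLOPS per core


def xbox360_peak_gflops(vector_scale=1):
    """Peak per the analysis: two vector issue slots plus two scalar FPUs per core.
    vector_scale=2 models all-FMA or 16-bit data on the vector units only."""
    vector = 2 * CLOCK_GHZ * FLOATS_PER_VECTOR * vector_scale
    scalar = 2 * CLOCK_GHZ
    return XBOX_CORES * (vector + scalar)


def xbox360_sustained_gflops(ipc_per_core=2.0):
    """Sustained estimate: guessed IPC per core, all of it vector work."""
    return XBOX_CORES * ipc_per_core * CLOCK_GHZ * FLOATS_PER_VECTOR


print(round(xbox360_peak_gflops(), 1))                # 96.0
print(round(xbox360_peak_gflops(vector_scale=2), 1))  # 172.8
print(round(xbox360_sustained_gflops(), 1))           # 76.8
```

The sustained figure comes out at 76.8, which is where the 'about 75' estimate sits; changing `ipc_per_core` shows how sensitive that number is to the IPC guess.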
The PlayStation 3's cores are issue limited; they have a maximum sustained IPC of 2. For pure vector operations this again gives 25.6 GFLOPS, this time for each whole core. Across eight cores (the PPU plus seven SPUs) this gives a theoretical maximum system performance of 204.8 GFLOPS (409.6 with fused multiply-adds or 16-bit data). However, this cannot be achieved even in principle, because other program logic, including load-stores, must compete for decode slots with vector operations. Combined with the inability to reorder instructions, I would guess a single-threaded IPC a little lower than the Xbox 360's, about 0.8. On the plus side this will allow SMT in the PPU to work a little better, achieving a joint IPC of perhaps 1.5. This gives a total system performance of about 90 GFLOPS for nine threads of vector-heavy code.
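The PS3 numbers can be reproduced the same way. This is only a sketch: the split into one PPU and seven SPUs is my assumption (it matches the 204.8 total and the nine threads), and applying 0.8 IPC to each SPU and 1.5 joint IPC to the PPU is my reading of how the 'about 90' figure arises.

```python
CLOCK_GHZ = 3.2
FLOATS_PER_VECTOR = 4   # 128-bit register / 32-bit floats
SPU_COUNT = 7           # assumption: seven SPUs, one thread each
PPU_COUNT = 1           # assumption: one PPU running two SMT threads


def ps3_peak_gflops(max_ipc=2, vector_scale=1):
    """Peak if every issue slot on every core is a vector operation.
    vector_scale=2 models all-FMA or 16-bit data."""
    cores = SPU_COUNT + PPU_COUNT
    return cores * max_ipc * CLOCK_GHZ * FLOATS_PER_VECTOR * vector_scale


def ps3_sustained_gflops(spu_ipc=0.8, ppu_joint_ipc=1.5):
    """Sustained estimate from guessed IPC values, all of it vector work."""
    gflops_per_unit_ipc = CLOCK_GHZ * FLOATS_PER_VECTOR
    total_ipc = SPU_COUNT * spu_ipc + PPU_COUNT * ppu_joint_ipc
    return total_ipc * gflops_per_unit_ipc


peak = ps3_peak_gflops()                    # theoretical maximum
peak_fma = ps3_peak_gflops(vector_scale=2)  # all fused multiply-adds
sustained = ps3_sustained_gflops()          # roughly 90 under these guesses
```

As with the Xbox 360 estimate, the sustained value moves a lot with the IPC guesses, so it should be read as a ballpark rather than a prediction.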
The result of this rough analysis: the PS3 has a little over twice the /theoretical/ compute power of the Xbox 360, but probably only about 0-50% more on realistic game code. However, the PS3 also has a much faster memory interface, a very efficient intercore interconnect and probably a faster GPU (the exact details aren't available yet). As such, 'twice as powerful as the Xbox 360' is probably fair as a rough description.
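As a quick sanity check on that comparison, the sketch below plugs in the rounded totals from the analysis (204.8 vs 96 peak, about 90 vs about 75 sustained). The helper name is mine and the inputs are estimates, not measurements.

```python
def advantage_percent(ps3_gflops, xbox_gflops):
    """How much more compute the PS3 figure represents, as a percentage."""
    return (ps3_gflops / xbox_gflops - 1.0) * 100.0


# Theoretical peaks from the analysis above.
theoretical_advantage = advantage_percent(204.8, 96.0)
# Rounded sustained estimates for vector-heavy code.
sustained_advantage = advantage_percent(90.0, 75.0)
```

The theoretical advantage comes out a bit above 100% (a little over twice), while the sustained estimate is around 20%, inside the 0-50% range quoted.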
SoothingRelease, the 1 TeraFLOP Microsoft claims for the Xbox 360 is an 'overall system performance' that most definitely includes the GPU. This is made clear on the official spec page on www.xbox.com. I haven't seen the power of the CPU alone stated anywhere. 200 gigaflops for the Cell sounds entirely reasonable as either (a) a theoretical peak value for normal operations or (b) a real benchmark using 16-bit values or fused multiply-adds. Sony claims 2 TeraFLOPs for total system performance, which implies a GPU twice as powerful as Microsoft's, but the relationship between theoretical peak FLOPS and actual performance is even murkier for GPUs than it is for CPUs. Regardless, 'the PS3 is twice as powerful as the Xbox 360' still sounds right as an overall characterisation.
If anyone can see errors in the above analysis or has more specific information, please do correct me.