Here's what happens: the SETI program is about 800K in size when running. The 3.x client was written so that it can swap pieces of that (~200K each) into the small 256K cache of the Cumines and T-birds. However, that swapping means the other 600K sits in the slower RAM. NOW.... if you have 2 CPUs, each with a process doing that, both will be hitting (and fighting over) that slower RAM. Add the normal task and context switching on top of that, and you'll lose some time compared to running SETI on a single processor.
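To put rough numbers on that, here's a back-of-the-envelope sketch in Python. It only uses the sizes quoted above (800K program, ~200K pieces, 256K vs 1MB cache); the function name and the "one piece in cache, the rest in RAM" model are my own simplification, not how the client actually manages memory.

```python
# Sizes in KB, taken from the post above; the model itself is a simplification.
WORKING_SET_KB = 800   # approximate size of the running SETI client
CHUNK_KB = 200         # approximate piece the 3.x client keeps in cache


def spill_to_ram_kb(cache_kb, working_set_kb=WORKING_SET_KB, chunk_kb=CHUNK_KB):
    """Return roughly how many KB of one SETI process end up in main RAM."""
    if cache_kb >= working_set_kb:
        return 0  # the whole program fits in cache
    # Otherwise only one piece (or less, on a tiny cache) stays resident.
    resident = min(chunk_kb, cache_kb)
    return working_set_kb - resident


for name, cache_kb in [("256K cache", 256), ("1MB cache", 1024)]:
    per_process = spill_to_ram_kb(cache_kb)
    print(f"{name}: ~{per_process}K per process in RAM, "
          f"~{2 * per_process}K with two CPUs")
```

With a 256K cache that's about 600K per process, so two CPUs are pulling roughly 1200K through the same RAM; with 1MB of cache the figure drops to zero in this simple model.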
The only way to minimize this effect is to get a CPU with at least 1MB of cache, like the Xeons. I run SETI on my dual Xeon III 500 1MB and generally get the same times running two processes as when running one. In fact, I've found that if I put all the critical stuff onto CPU0 under 2K AS by setting the processor affinity, I can run the SETI process on CPU1 at "Realtime", which is impossible to do on CPU0... and it'll run faster over there than the SETI process on CPU0 (which obviously has to be set to nothing higher than "below normal" or things will crawl)! But again, this is because of the Xeon's goods.
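For anyone wondering what "setting the processor affinity" boils down to: Windows treats affinity as a bitmask, with bit 0 for CPU0, bit 1 for CPU1, and so on. The little Python sketch below just builds those masks for the split described above; `affinity_mask` is a made-up helper for illustration, and it doesn't touch any real process (in practice you'd tick the boxes in Task Manager's Set Affinity dialog).

```python
def affinity_mask(cpus):
    """Build an affinity bitmask: bit n set means the process may run on CPU n."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers start at 0")
        mask |= 1 << cpu
    return mask


# Hypothetical split matching the setup above.
CRITICAL_MASK = affinity_mask([0])  # critical stuff pinned to CPU0
SETI_MASK = affinity_mask([1])      # the "Realtime" SETI process pinned to CPU1

print(f"critical tasks: {CRITICAL_MASK:#04b}")
print(f"SETI on CPU1:   {SETI_MASK:#04b}")
```

The two masks don't overlap, which is the whole point: the Realtime process can't starve anything that lives on CPU0.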
[EDIT: Fingers has the quicker fingers...lol!!!]