barbary - if you are referring to the S@H page saying that work units don't transfer: yes, that's what it says, but it doesn't actually work that way. Work units will transfer to the team you are currently on, even though that contradicts what the S@H pages say. Trust me, that's how it works - we've both lost and gained many work units as members left and joined the team.
As for why performance barely improves when the clock rate goes up, it basically works like this: think of memory bandwidth as a shared commodity (because in most architectures it basically is). The S@H client's working data doesn't fit in the L1 and L2 caches (unlike some other distributed programs), so it goes out to main memory a lot, and often. Whatever bandwidth one processor uses is bandwidth the other CPUs don't get. On P3-based SMP systems (and Celeron-based ones are even worse), memory bandwidth is very limited to begin with, so the CPUs end up waiting on memory instead of computing, and a faster clock doesn't help with that. Kinda like downloading two huge files from a fast server over a 56k modem.
<< I was a little paranoid after finding that often increasing processor speeds had little or no effect if the FSB was left the same. My dual 800/100 produces not many more than my dual 400/100. Likewise my dual 1000/133 produces no more than my dual 800/133. >>
That right there is experimental evidence of what I was talking about above 😉
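If it helps, here's a rough back-of-the-envelope sketch of the shared-bandwidth idea in Python. It's a toy model, not a measurement: the function names are made up for illustration, the bus is assumed to be 64 bits (8 bytes) wide and running at its peak theoretical rate, and the bytes_per_cycle figure is a purely hypothetical number I picked to show the effect, not something measured from the S@H client.

```python
def bus_bandwidth_mb_s(fsb_mhz, bus_width_bytes=8):
    """Peak theoretical bandwidth of the shared front-side bus, in MB/s.

    Assumes a 64-bit (8-byte) wide bus; real-world throughput is lower.
    """
    return fsb_mhz * bus_width_bytes


def total_work_rate(clock_mhz, n_cpus, fsb_mhz, bytes_per_cycle=1.0):
    """Combined work rate of all CPUs, in millions of useful cycles per second.

    bytes_per_cycle is a HYPOTHETICAL figure: how much main-memory traffic
    the program generates per cycle of useful work. It is not a measured
    value for any real client.
    """
    # Ceiling if memory were infinitely fast: every CPU runs flat out.
    compute_limit = n_cpus * clock_mhz
    # Ceiling set by the bus that all CPUs share: MB/s divided by
    # bytes-per-cycle gives millions of cycles the bus can feed per second.
    memory_limit = bus_bandwidth_mb_s(fsb_mhz) / bytes_per_cycle
    # The machine can go no faster than the lower of the two ceilings.
    return min(compute_limit, memory_limit)


# The four dual-CPU configurations from the quote: (clock MHz, FSB MHz).
for clock, fsb in [(400, 100), (800, 100), (800, 133), (1000, 133)]:
    print(f"dual {clock}/{fsb}: {total_work_rate(clock, 2, fsb):.0f}")
```

With that made-up traffic figure, the dual 800/100 hits the same bus ceiling as the dual 400/100, and the dual 1000/133 matches the dual 800/133, which is the same pattern as in the quote. Drop bytes_per_cycle to something small (a program that lives in cache) and the compute ceiling takes over, so clock speed starts to matter again.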
I could go into more detail if you'd like, but that's the basic idea. The P4 architecture has a LOT more main memory bandwidth, so this bottleneck is somewhat less of an issue there.