I believe it still boils down to whether the application is written for multiple processors or not. I have a couple dual boxes (1gigs, 933s, and 600 Katmais), mostly for video and rendering.
A couple of the video editing programs use both processors and the extra RAM (Adobe automatically grabs half the available RAM). It makes a huge difference in rendering times compared to the single-processor boxes I've tried (1gig and 933).
There is at least one program out that lets you assign processor affinity (either processor or both), and it hasn't made much of a difference when I've taken single-processor programs (like SETI) and assigned them to both processors.
My understanding and impression is that the administrative time the operating system spends assigning and queuing the process adds to the overall processing time, and may actually cause the (single-processor) program to run slower.
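For anyone who wants to try the affinity experiment themselves, here's a rough sketch in Python. It is not the program I mentioned above, and it rests on an assumption: os.sched_setaffinity only exists on Linux, so on Windows or anything else without it the helper just returns None. The name pin_to_cpus is made up for illustration.

```python
import os


def pin_to_cpus(cpus):
    """Try to restrict the current process to the given CPU numbers.

    Returns the set of CPUs the process may run on afterwards, or None
    where the platform has no os.sched_setaffinity (assumption: Linux only).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)   # CPUs this process may use right now
    wanted = set(cpus) & allowed        # never request a CPU we cannot use
    if not wanted:
        return allowed                  # nothing usable requested; leave as-is
    os.sched_setaffinity(0, wanted)     # pid 0 means "the calling process"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("Now allowed on CPUs:", pin_to_cpus([0, 1]))
```

Time a single-threaded job once pinned to CPU 0 only and once allowed on both, and compare. If what I've seen holds, the two runs should come out about the same.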
For Windows programs written for the multi-processor environment, the overhead is reduced because some/most/all of the priority and affinity assignments are coded into the program, reducing or eliminating some/most/all of the executive management overhead.
I believe it's a different story for the *nix boxes because the threading architecture of the OS is completely different, as are the structures of the applications running on 'em.
I haven't been on the software end of the stick for a long time, so I could certainly be wrong on some of the details (maybe even the whole thing).
I think the end result is that if you have multiprocessor-aware applications and/or tend to execute many applications concurrently, then the MP will help... or at least, not hurt (always assuming that the OS is MP-capable). If you don't have an MP-capable application, and you tend to run one or two things at once (even if they're real crunchers), the MP will not help, and might even be a tad slower.
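To put rough numbers on that, here's a small sketch using Amdahl's law, which is the usual textbook estimate and not something I measured. The 0.9 parallel fraction in the usage line is an invented figure for illustration, and the formula ignores the OS scheduling overhead I described, so it can show "no help" but not "a tad slower".

```python
def amdahl_speedup(parallel_fraction, processors):
    """Ideal speedup when only part of a program can use extra processors.

    parallel_fraction: share of the run time that can be split up (0.0 to 1.0)
    processors: number of processors available (1 or more)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time; the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # A program with no MP support gains nothing from the second processor.
    print(amdahl_speedup(0.0, 2))
    # Hypothetical renderer that is 90% parallel (made-up figure).
    print(round(amdahl_speedup(0.9, 2), 2))
```

With a parallel fraction of zero the speedup is exactly 1, which is the single-processor-program case. A mostly parallel renderer gets close to, but never quite reaches, 2x on a dual box.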
FWIW
Scott