Have you heard of the "megahertz myth"? A higher clock speed is not always faster. You can only compare clock speeds within the same architecture (e.g. P4 to P4 or C2D to C2D). As soon as you compare across processor types, many other factors come into play:
Amount of cache memory
FSB (front-side bus) speed
Multiplier
Pipeline length
etc.
Assuming a single-threaded application, a lower-clocked Core 2 Duo will annihilate a Pentium 4. The C2D has a shorter pipeline and more cache memory, and it gets more work done on each clock cycle. The Pentium 4 gets less done per cycle and has a much longer pipeline. The long pipeline lets it reach higher clock rates, but it also wastes more work whenever the pipeline has to be flushed and refilled (for example, after a mispredicted branch). To compensate, it has to run at a far higher clock rate.
There are also heat limitations: you can only clock a processor so fast before it starts to overheat. Since the C2D does more work on each clock cycle, it doesn't need to be clocked as quickly to achieve the same results.
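To put rough numbers on the idea, here is a small sketch in Python. It assumes performance is roughly "instructions per clock (IPC) times clock speed", which is a simplification, and the IPC and clock figures are made up purely for illustration. They are not measurements of any real P4 or C2D, and the function names are my own.

```python
def instructions_per_second(ipc, clock_ghz):
    """Rough throughput: instructions per cycle times cycles per second."""
    return ipc * clock_ghz * 1e9


def clock_needed_ghz(target_ips, ipc):
    """Clock speed (GHz) a chip with the given IPC needs to hit target_ips."""
    return target_ips / ipc / 1e9


# Hypothetical figures, chosen only to illustrate the point (not benchmarks).
p4 = instructions_per_second(ipc=1.0, clock_ghz=3.6)   # long pipeline, high clock
c2d = instructions_per_second(ipc=2.0, clock_ghz=2.4)  # more work per cycle, lower clock

print(f"P4-like chip:  {p4 / 1e9:.1f} billion instructions/s")
print(f"C2D-like chip: {c2d / 1e9:.1f} billion instructions/s")
print(f"Clock the C2D-like chip needs to match: {clock_needed_ghz(p4, 2.0):.1f} GHz")
```

With these invented figures, the lower-clocked chip still comes out about a third faster, and it would only need half the other chip's clock speed to match it. Real chips are messier (cache misses, memory speed and the workload itself all change the effective IPC), but this is the basic reason clock speed alone tells you so little.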
You're spot on about multi-core CPUs being better at multitasking. They also shine in individual applications that are written to use multiple cores (i.e. multithreaded programs).
This is very much a layperson's explanation, as I'm not an engineer.