<< suppose you made a P4 that's twice as big as a normal one -like just doubling the amount of everything... and then you used supercooling via peltier or cryogenics or whatever... wouldn't you be close to the performance of a 5ghz chip? >>
I'm not exactly clear on what you're saying, but...
If you doubled the transistor count by doubling everything in the P4 (the number of execution units, the number of instructions issued and retired per cycle, the reorder buffer, the register file, and the trace, L1, and L2 caches), you would probably lose performance. First, the clock rate would probably drop: assuming the same transistor technology and the same number of pipeline stages, a larger register file and reorder buffer with more ports lengthen the critical path. Second, x86 offers limited exploitable instruction-level parallelism, largely because its variable instruction size (1 to 15 bytes) and large number of addressing modes make it hard to decode many instructions per cycle. Beyond 3-way superscalar issue, x86 processors quickly hit diminishing returns, so a 6-way issue CPU would not be much faster than a 3-way issue CPU despite a drastic increase in transistor count.
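To make the diminishing-returns argument concrete, here is a rough Python sketch of a toy scheduler. It is not a model of the P4 or of x86 decoding: it assumes unit latency, perfect branch prediction, an unlimited instruction window, and a made-up synthetic program in which each instruction depends on up to two of the six instructions before it. The names cycles_needed, synthetic_program, and ipc_table are mine, purely for illustration. The point is only that once the issue width exceeds the parallelism the code actually contains, extra width stops buying extra instructions per cycle.

```python
import random

def cycles_needed(deps, width):
    """Cycles to issue every instruction on an idealized width-wide machine.

    deps[i] lists the earlier instructions that instruction i waits for.
    Assumptions: unit latency, oldest-first issue, unlimited window.
    """
    done_at = {}                       # instruction index -> issue cycle
    pending = list(range(len(deps)))
    cycle = 0
    while pending:
        cycle += 1
        issued = []
        for i in pending:
            if len(issued) == width:
                break
            # Ready only if every producer issued in an earlier cycle.
            if all(done_at.get(d, cycle) < cycle for d in deps[i]):
                issued.append(i)
        for i in issued:
            done_at[i] = cycle
        pending = [i for i in pending if i not in done_at]
    return cycle

def synthetic_program(n, seed=0):
    """Hypothetical workload: 0 to 2 dependencies on the 6 previous instructions."""
    rng = random.Random(seed)
    deps = []
    for i in range(n):
        k = min(i, rng.randint(0, 2))
        lo = max(0, i - 6)
        deps.append(rng.sample(range(lo, i), k) if k else [])
    return deps

def ipc_table(deps, widths=(1, 2, 3, 4, 6)):
    """Instructions per cycle achieved at each issue width."""
    return {w: len(deps) / cycles_needed(deps, w) for w in widths}

if __name__ == "__main__":
    for width, ipc in ipc_table(synthetic_program(2000)).items():
        print(f"{width}-way issue: IPC = {ipc:.2f}")
```

The numbers depend entirely on the synthetic dependency pattern, so treat them as an illustration of the shape of the curve, not as a prediction for any real chip.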
A larger transistor budget only improves performance if it is used efficiently. x86 processors already use as many RISC techniques as possible, such as pipelining, register renaming, and out-of-order superscalar execution. Besides raising clock speed, the best bet for increasing x86 performance lies in thread-level parallelism, such as on-chip multiprocessing and multithreading.
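As a rough illustration of why thread-level parallelism helps where a wider core does not, here is a toy Python sketch under the same idealized assumptions (unit latency, oldest-first issue, no cache or branch effects). The worst case for a wide core is a serial dependency chain: each instruction needs the result of the previous one, so a single thread never issues more than one instruction per cycle no matter how wide the machine is. Two such threads sharing the same 2-wide core can keep both issue slots busy. The helper names (cycles_needed, serial_chain, two_threads) are hypothetical.

```python
def cycles_needed(deps, width):
    """Cycles to issue all instructions on an idealized width-wide core."""
    done_at = {}                       # instruction index -> issue cycle
    pending = list(range(len(deps)))
    cycle = 0
    while pending:
        cycle += 1
        issued = []
        for i in pending:
            if len(issued) == width:
                break
            # Ready only if every producer issued in an earlier cycle.
            if all(done_at.get(d, cycle) < cycle for d in deps[i]):
                issued.append(i)
        for i in issued:
            done_at[i] = cycle
        pending = [i for i in pending if i not in done_at]
    return cycle

def serial_chain(n, offset=0):
    """n instructions, each depending on the one before it (no ILP)."""
    return [[] if i == 0 else [offset + i - 1] for i in range(n)]

def two_threads(n):
    """Two independent serial chains: thread A is 0..n-1, thread B is n..2n-1."""
    return serial_chain(n) + serial_chain(n, offset=n)

if __name__ == "__main__":
    n, width = 1000, 2
    one = serial_chain(n)
    two = two_threads(n)
    print("one thread :", len(one) / cycles_needed(one, width), "IPC")
    print("two threads:", len(two) / cycles_needed(two, width), "IPC")
```

Real workloads sit somewhere between fully serial and fully parallel, so the gain from a second thread or core is usually smaller than in this extreme case.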