<< Why is it beneficial to pack as many transistors into a cpu as possible? >>
Processor logic is implemented as transistors. The processing units, pipelines, and cache are all made up of them, so a bigger transistor budget leaves room for more of each.
<< why aren't CPU cores the size of floppy disks, with hundreds of millions of transistors? >>
Because they would be horrendously expensive, that's why. Consider the size of a processor now, and how much it costs. Then multiply that out to the size of a floppy. With technology as it is, a larger die also means a larger probability that a defect lands somewhere on it and ruins the whole chip.
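To put a rough number on the defect point, here is a minimal sketch using a simple Poisson yield model, where the chance that a die has zero defects is exp(-area x defect density). The defect density of 0.5 per square centimetre is a made-up figure for illustration, not data from any real fab, and the two die areas (1.5 and 60 square centimetres) are rough assumptions for a current core and a floppy-sized one.

```python
import math

def poisson_yield(area_cm2, defects_per_cm2):
    """Expected fraction of defect-free dies under a simple Poisson model."""
    return math.exp(-area_cm2 * defects_per_cm2)

# Hypothetical defect density, chosen only to illustrate the trend.
DEFECT_DENSITY = 0.5  # defects per square centimetre

small = poisson_yield(1.5, DEFECT_DENSITY)    # ordinary-sized core
floppy = poisson_yield(60.0, DEFECT_DENSITY)  # floppy-sized core

print(f"1.5 cm^2 die: {small:.1%} of dies defect-free")
print(f"60 cm^2 die:  {floppy:.1e} of dies defect-free")
```

Under these assumptions a bit under half of the small dies come out clean, while a clean floppy-sized die is essentially never produced. Real fabs use more refined models, but the trend is the same: yield drops off far faster than area grows.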
In general, the more transistors a CPU has, the more processing power it has. Of course, architectural differences can hide this: a Pentium 4 1.6GHz has more transistors than an Athlon XP 2000+ (which runs at 1.67GHz), yet the 2000+ just smacks it silly in terms of performance. As for why CPU cores aren't the size of floppy disks, remember that silicon is VERY expensive. Just think: if a tiny 1.5 square centimetre CPU core (as found on the Pentium 4) can cost $200, something the size of a floppy disk (about 60 square centimetres, or 40 times larger) would cost $8000, assuming cost scales linearly with area. I don't know about you, but I can't afford $8K for a processor, regardless of how fast it is.
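The arithmetic above, spelled out as a quick sketch. The function name is just for illustration, and the linear cost-to-area scaling is the same simplifying assumption made in the post, not a real pricing model.

```python
def linear_cost(base_cost, base_area_cm2, target_area_cm2):
    """Scale a chip's cost in direct proportion to its die area."""
    return base_cost * (target_area_cm2 / base_area_cm2)

# Figures from the post: a $200 core of 1.5 cm^2 vs. a 60 cm^2 floppy-sized die.
scale = 60.0 / 1.5
cost = linear_cost(200.0, 1.5, 60.0)

print(f"{scale:.0f}x the area -> ${cost:,.0f}")
```

If anything, the linear assumption is generous: bigger dies are more likely to contain a defect, so the real cost per working chip would climb faster than the area does.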
Another reason why cores are so small is heat: the more transistors, the more heat. A floppy-sized CPU core would use incredible amounts of power and would require a MASSIVE cooling solution to prevent a meltdown. Needless to say, no commercially available computer cooling solution exists for such a theoretical processor.