Having hardware T&L moves more of the rendering pipeline onto the graphics card and off of the CPU. The less graphics code the CPU has to process, the more free time it has for physics, AI and the rest of the game code.
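To give a rough idea of the per-vertex work being talked about, here is a minimal sketch of "transform" and "lighting" done in software. The function names are made up for illustration and are not from any game or driver. It assumes a row-major 4x4 matrix, unit-length normals that are already in the same space as the light direction, and a single directional light with plain diffuse shading. A real engine does a lot more than this for every vertex, every frame.

```python
def transform(matrix, vertex):
    """Apply a 4x4 row-major matrix to a point (x, y, z), treating w as 1."""
    x, y, z = vertex
    return tuple(
        matrix[row][0] * x + matrix[row][1] * y + matrix[row][2] * z + matrix[row][3]
        for row in range(3)
    )


def diffuse(normal, light_dir):
    """Diffuse (Lambert) term: N dot L clamped to [0, 1], for unit vectors."""
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, min(1.0, n_dot_l))


def transform_and_light(vertices, normals, matrix, light_dir):
    """Software T&L loop: this runs on the CPU once per vertex."""
    return [
        (transform(matrix, v), diffuse(n, light_dir))
        for v, n in zip(vertices, normals)
    ]


# Hypothetical data: move one vertex by (1, 2, 3) and light it head-on.
translate = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
result = transform_and_light(
    [(0.0, 0.0, 0.0)], [(0.0, 0.0, 1.0)], translate, (0.0, 0.0, 1.0)
)
print(result)
```

With hardware T&L this loop is what gets handed to the card, so the CPU time it used to eat goes back to the game code.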
"It can't be video quality, because my V5 looks better than my Geforce 2 GTS did in a wide variety of games, old and new."
Do you use FSAA? If not then that is a bit odd. If so it makes sense. It can be video quality, but very few games use the power of T&L to enhance visuals; they mainly rely on it for a speed boost. MDK2, Giants and Sacrifice look better in hardware T&L mode than they do running in software. In MDK2 you can choose to run hardware T&L on any video card; it swaps over to the hardware-based lighting engine, which does look better (more accurate than the "hack" used for software-based lighting; all the older games that I can think of use software lighting). In Giants, if you are running a V5 and have the proper drivers installed, you can also enable hardware T&L (it uses SSE/3DNow! to complete the same tasks that a hardware T&L unit would), but from what I've seen it looks a bit different (may just be LOD bias settings).
"Most of us are running 1GHz plus these days, wouldn't the cpu have more firepower to do this than a video card based engine?"
Click here. GHz CPUs are not even close to the power level of dedicated T&L hardware. Think of it kind of like a dedicated MPEG2 decoder (a DVD add-in board). Because it is built to do a specific task, it does that task extremely quickly, unlike your general-purpose CPU. Unlike with an MPEG2 decoder, though, game developers can keep adding polys and cranking up the strain, and even current hardware T&L wouldn't be fazed too much. Those MDK2 scores linked above are nowhere near straining the hardware T&L of the GF/Radeon boards; the T&L unit is processing all of the data and "twiddling its thumbs" waiting for the CPU to handle the game code. Because of this, developers could increase the polygon complexity, and hence detail, of models by a large margin for those with hardware T&L, though those who are still relying on CPU power may have to turn the detail down a bit.
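The "twiddling its thumbs" point can be put into a toy model. This is only a sketch: every number in it is hypothetical, not a benchmark result, and the function name is made up. It assumes that in software mode the CPU does game code and T&L back to back, while with hardware T&L the two run in parallel and the slower one sets the frame time.

```python
def frame_ms(game_ms, tl_ms, hardware_tl):
    """Toy frame-time model: serial on the CPU, parallel with a T&L unit."""
    if hardware_tl:
        return max(game_ms, tl_ms)
    return game_ms + tl_ms


# Hypothetical per-frame costs in milliseconds, NOT measurements.
GAME_MS = 10.0              # assumed cost of physics, AI and game code
SW_TL_MS_PER_KPOLY = 0.5    # assumed CPU cost per 1000 polygons
HW_TL_MS_PER_KPOLY = 0.05   # assumed T&L unit cost per 1000 polygons

for kpolys in (10, 40, 160):
    sw = frame_ms(GAME_MS, kpolys * SW_TL_MS_PER_KPOLY, hardware_tl=False)
    hw = frame_ms(GAME_MS, kpolys * HW_TL_MS_PER_KPOLY, hardware_tl=True)
    print(f"{kpolys}k polys: software {sw:.1f} ms, hardware {hw:.1f} ms")
```

Under these made-up numbers the hardware frame time stays pinned at the game-code cost as the polygon count climbs, while the software frame time grows with every extra polygon. That is the headroom developers could spend on detail.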
The GeForce3 takes this even further, taking over more of the tasks that the CPU has had to do in the past, even on GF/Radeon-level hardware. It's likely that we won't see games using this hardware potential until at least shortly after the launch of the X-Box, though.