The Pentium 4 does have its own 3DNow!. It's called SSE, was introduced with the P-III and differs in some details, but serves the very same purpose. AMD could then expand their 3DNow! to support SSE, which they did, and I would assume non-game software developers took that as an invitation to program for one target.
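For anyone who hasn't seen what that shared purpose looks like in code: both extensions operate on several single-precision floats at once (two per register with 3DNow!, four with SSE). Here is a rough sketch of a 4x4 matrix times vector transform using SSE intrinsics, with a plain C fallback where SSE isn't available. The function name transform4 and the column-major layout are my own assumptions for illustration, not taken from any particular engine.

```c
#include <assert.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_mul_ps, ... */
#endif

/* Hypothetical helper: out = M * v, with M stored column-major
   (m[0..3] is column 0, and so on). out must not alias v. */
void transform4(const float m[16], const float v[4], float out[4])
{
    assert(m && v && out);
#if defined(__SSE__)
    /* Each step scales one whole column (4 floats) by one component
       of v and accumulates it, four lanes per instruction. */
    __m128 r = _mm_mul_ps(_mm_loadu_ps(m + 0), _mm_set1_ps(v[0]));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 4),  _mm_set1_ps(v[1])));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 8),  _mm_set1_ps(v[2])));
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(m + 12), _mm_set1_ps(v[3])));
    _mm_storeu_ps(out, r);
#else
    /* Scalar fallback: same arithmetic, one lane at a time. */
    for (int i = 0; i < 4; i++)
        out[i] = m[i] * v[0] + m[4 + i] * v[1]
               + m[8 + i] * v[2] + m[12 + i] * v[3];
#endif
}
```

With a translation matrix (identity plus 5, 6, 7 in the last column) and the point (1, 2, 3, 1), this gives (6, 8, 10, 1). A 3DNow! version would do the same work in two-float halves.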
3DNow!/SSE goes much further than "compensating for poor FPU performance". And actually, the K6 had a 'faster' FPU than the Pentium II, in the sense that a single operation completed with lower latency. The difference was that the Pentium II's FPU was pipelined, which meant that it handled a succession of non-dependent FP operations faster. That happens to be exactly the case with FPU benchmarks... and with the matrix operations of 3D transformations.
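To make the dependent vs. non-dependent distinction concrete, here is a minimal sketch in C. The function names are made up for illustration, and whether a given compiler keeps the loops in this shape is an assumption. The first loop is one long chain where each add needs the previous result, so a pipeline has nothing to overlap. The second splits the work into four independent chains, which is the kind of code a pipelined FPU rewards.

```c
#include <assert.h>
#include <stddef.h>

/* One accumulator: every add depends on the one before it, so the
   latency of a single FP add bounds the whole loop. */
float sum_dependent(const float *x, size_t n)
{
    assert(x != NULL || n == 0);
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Four accumulators: the four adds in each iteration do not depend
   on each other, so a pipelined FPU can have them in flight at once. */
float sum_independent(const float *x, size_t n)
{
    assert(x != NULL || n == 0);
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
        a3 += x[i + 3];
    }
    for (; i < n; i++)   /* leftover elements when n is not a multiple of 4 */
        a0 += x[i];
    return (a0 + a1) + (a2 + a3);
}
```

Both return the same sum for inputs where rounding doesn't come into play; in general the different order of additions can change the last bits of the result. On a low-latency but unpipelined FPU the first form loses little, which is roughly the K6 situation described above.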
I believe, however, that the POV-Ray versions of those days ran faster on the K6 (at least that's what I've heard). As for the FP benchmarks of those days, they were often simply misleading people about the suitability of the K6 for FP applications.
Then K7 got pipelined FP units, and that discussion got old.
MMX has nothing to do with this.
Chipsets have nothing to do with this.
SSE2 is another major extension, one that goes wider to take advantage of increased hardware FP resources. AMD took their time to implement it, which meant that AMD suffered for a long time on encoding apps that took advantage of SSE2 but didn't bother much with 3DNow!.
SSE3 is a rather minor, almost insignificant addition, and cannot be compared in scope with the previous MMX, SSE and SSE2 extensions. I'm tempted to suspect it's just a desperate attempt from Intel at some benchmark one-upmanship, in the criminal tradition of Microsoft: "extend and exclude".
AMD's architectures have tended to be much more 'generally' fast and tolerant of previous code styles, while Intel's have required lots of special tweaks and optimizations to get code running fast on new architectures. That has worked out fine for Intel on synthetic benchmarks, but has also forced Intel to take an active role in applying pressure on coders to optimize for the latest Intel kludge.
How insidious and destructive this activity really is, and what purpose it actually serves, is maybe realized when contemplating Intel's compiler which apparently intentionally sabotages the code's performance on AMD.
Intel's purpose is simply to use their market lead to 'exclude' AMD. There was never any possibility of Intel supporting 3DNow!. It was not because they didn't have any use for it, as wrongly suggested previously in this thread. It was because it had to be 'something else', to exclude AMD for a while. That something else became SSE. AMD is put at a disadvantage by having to follow the market leader, and that disadvantage is increased by AMD's more limited resources.
I'm sure Intel were thinking very long and hard about what they could do with EM64T to f' things up. ...And we may not have heard the last of that.
I don't really have any real insight into the hardcore PC-gaming and enthusiast market. But a couple of component retailers that I'm acquainted with hardly even bother stocking Intel CPUs and MBs anymore. No one "in the know" buys Intel any longer. I would thus presume that serious game developers do indeed optimize for AMD, since that seems to be a viable market. But there is no point in pushing '3DNow!'. MMX/SSE/SSE2 is obviously the way forward.