The PhysX looks cool, but it doesn't really change the enjoyment of the game for me.
Which is basically the problem. Some games have used PhysX for physics, but most use it primarily for eye candy. In addition, it's been perfectly fast on console CPUs, using AltiVec (note that Jaguar has AVX--not quite as good in general, but maybe good enough). Games with non-configurable use of PhysX, because they're doing gameplay-integral physics with it, are the minority, and they aren't what gives PhysX its bad rep.
Physics can be tightly coupled to other computational needs of the engine, so waiting some 10^5 to 10^7 cycles on another device is just not acceptable. Both previous-gen consoles had the same sort of latency issues as our video cards do, even though the XB360 had a unified RAM setup.
The solution to physics is better CPUs, and better usage of them. On PCs, AVX2 holds far more promise than anything over on the GPU. Eventually, the CPU will be a fully heterogeneous device, with SISD and MIMD side by side, sharing an ISA (AVX2 being the first practical step for x86). For now, with GPU and CPU as separate entities, it's better to improve things on the CPU side, so that the major limits are cache and RAM access only. As a rough scale, not even counting any software overhead: LLC could be 50 cycles, RAM 100-200 cycles, and the GPU 1000+ cycles. Actually pushing to or pulling from GPU RAM, which physics will need to do, takes much longer than that, and longer still over PCI-e, though pinning down a particular value would be difficult.
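To put those rough latencies in per-frame terms, here is a minimal sketch. The 3 GHz clock and 60 FPS target are assumptions picked for illustration, the names are made up for this example, and the cycle counts are just the ballpark figures quoted above, not measurements.

```python
# Illustrative numbers only: the clock speed and frame rate are assumptions,
# and the latencies are the rough figures quoted in the post, not measurements.
CLOCK_HZ = 3.0e9   # assumed CPU clock
TARGET_FPS = 60    # assumed frame rate target

LATENCY_CYCLES = {
    "LLC": 50,
    "RAM": 200,
    "GPU access": 1_000,
    "device wait (low)": 1e5,
    "device wait (high)": 1e7,
}

def frame_budget_cycles(clock_hz=CLOCK_HZ, fps=TARGET_FPS):
    """CPU cycles available for one frame."""
    return clock_hz / fps

def budget_fraction(stall_cycles, stalls_per_frame=1):
    """Fraction of one frame's cycles spent stalled, assuming nothing overlaps."""
    return stall_cycles * stalls_per_frame / frame_budget_cycles()

if __name__ == "__main__":
    for name, cycles in LATENCY_CYCLES.items():
        print(f"{name:>20}: {budget_fraction(cycles):.6%} of a frame per stall")
```

Under these assumptions a frame is 5e7 cycles, so a single 10^7-cycle wait eats a fifth of the frame, while it takes a thousand LLC misses to cost a tenth of a percent.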
If they can do it on the GPU this time around (quite possible, since the latency for a direct implementation will only be a bit more than that of RAM accesses), and actually benefit from that in some given game (for some games, it will hurt FPS, and the CPU will get used anyway--8 CMP cores with 128-bit AVX support should not be scoffed at), don't expect that to transfer back to your PC. If it even could, it would only do so for a limited number of AMD APUs. Amdahl's law only works trivially, as described, if all operations, including all data accesses, take the same amount of time, and if you don't care how long it takes to complete a work unit (see Gunther's Law). If you end up waiting on the other device for longer than is reasonable to set up rendering for a given frame, no amount of GFLOPS is good enough--it could complete the work instantly, from your POV, and still be too slow.
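A minimal sketch of that last point, comparing textbook Amdahl with a frame-time model that adds a fixed wait on the other device. All names and millisecond figures are hypothetical, and the model assumes the wait cannot be overlapped with other work.

```python
def amdahl_speedup(parallel_fraction, accel_speedup):
    """Textbook Amdahl's law: ignores latency and data movement entirely."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / accel_speedup)

def frame_time_ms(serial_ms, physics_ms, accel_speedup, sync_latency_ms):
    """Frame time when physics is offloaded and the CPU must wait for results.

    Assumption: sync_latency_ms is a fixed, non-overlapped wait per frame.
    """
    return serial_ms + physics_ms / accel_speedup + sync_latency_ms

def fits_budget(frame_ms, fps=60):
    """True if the frame completes within the per-frame time budget."""
    return frame_ms <= 1000.0 / fps

if __name__ == "__main__":
    # Hypothetical workload: 6 ms of other engine work, 8 ms of physics on the CPU.
    cpu_only = frame_time_ms(6.0, 8.0, 1.0, 0.0)
    # Infinitely fast accelerator, but a hypothetical 12 ms wait to get results back.
    offloaded = frame_time_ms(6.0, 8.0, float("inf"), 12.0)
    print(cpu_only, fits_budget(cpu_only))
    print(offloaded, fits_budget(offloaded))
```

With these made-up numbers the CPU-only frame fits a 60 FPS budget and the offloaded one does not, even though the accelerator finishes the physics in zero time.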