I am not sure which would be the better idea: putting a second chip on the video card (which people already do for SLI-on-a-card solutions), or adding some extra hardware on the front end so the GPU's pipelines can do the calculations. Look at ATI's next-generation cards with 48 or so unified shader pipelines; maybe in addition to doing vertex, geometry, and pixel shaders, they could also do physics calculations. Sure, you are going to need a bigger die and more memory for everything to work smoothly, but it's gotta be cheaper and more efficient than a two-card solution. In a room with little physics going on, a separate physics card just sits idle, but with a combined solution that just means you have more RAM and ALU bandwidth open for the visual calculations. On the flip side, when a big explosion or something happens, you can borrow some bandwidth for a while. If you look at the tech demos, they show tons of explosions all at the same time.
Also, since the geometry information is already being worked on in the GPU's memory, the CPU never even has to get involved.
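
To put rough numbers on the idle-card argument, here is a toy sketch in Python. Everything in it is an assumption for illustration: the function names, the 32 + 16 split, and the workload sizes are made up, and it ignores memory bandwidth, scheduling overhead, and everything else a real card has to deal with. It only compares two dedicated pools of units against one shared pool of 48.

```python
# Toy model only: all numbers are invented, not measured from any real card.

def frame_time_split(gfx_work, phys_work, gfx_units, phys_units):
    """Two dedicated pools: each workload can only run on its own units,
    so the frame waits for whichever pool finishes last."""
    return max(gfx_work / gfx_units, phys_work / phys_units)


def frame_time_unified(gfx_work, phys_work, total_units):
    """One shared pool: any unit can take either kind of work."""
    return (gfx_work + phys_work) / total_units


# Hypothetical setup: 48 units total, either shared or split 32 graphics + 16 physics.
scenes = {
    "quiet room": (960, 0),       # (graphics work, physics work), arbitrary units
    "big explosion": (960, 960),
}

for name, (gfx, phys) in scenes.items():
    split = frame_time_split(gfx, phys, 32, 16)
    unified = frame_time_unified(gfx, phys, 48)
    print(f"{name}: split={split:.1f} unified={unified:.1f}")
```

In this made-up setup the shared pool wins both ways: in the quiet room the units that would have been the physics card help with graphics, and during the explosion the physics work spills over onto the graphics units instead of bottlenecking on 16 of them.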