No new info there; all of that has been said for some time.
One thing I can say is that the SuperPi test at 16M is a bit deceiving. Of course the 16M test won't show much difference, because of its size (yes, I know 16M doesn't mean 16 MB). If you try the lower ones like 1M and 2M, you will see differences, since they are a more simplified test. We saw this with the P4 versus the A64s back in the day, due to Intel's larger caches.
The advantage goes to apps that fit entirely in the cache, or that have portions that fit entirely in the cache. Apps with redundancy, such as simple calculating apps, also do well.
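To put the "fits in the cache" idea in rough terms, here is a minimal sketch. The hot-data sizes are made-up numbers for illustration, not measurements of any real app, and `fits_in_cache` is just a hypothetical helper name.

```python
MB = 1024 * 1024

def fits_in_cache(hot_bytes, cache_bytes):
    """True if the frequently used data (the 'hot' portion) fits in the cache."""
    return hot_bytes <= cache_bytes

# Hypothetical hot-data sizes, chosen only to illustrate the point.
small_app = int(1.5 * MB)
medium_app = 3 * MB
huge_app = 12 * MB

for cache_mb in (2, 4):
    cache = cache_mb * MB
    for name, hot in (("small", small_app), ("medium", medium_app), ("huge", huge_app)):
        verdict = "fits" if fits_in_cache(hot, cache) else "spills to RAM"
        print(f"{cache_mb} MB cache, {name} app: {verdict}")
```

In this toy model the small app fits either way and the huge one fits in neither, so only the medium one cares whether the cache is 2 MB or 4 MB.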
I tested this and posted the results on this forum for you guys before.
I showed that 2 chips at identical speed, an E6400 (2 MB cache) at 3.4 GHz versus an E6600 (4 MB cache) at 3.4 GHz, performed identically when running a certain series of Folding@Home work units with one instance running. **Note: this preceded the use of the SMP program for F@H.**
When I started a 2nd instance, the E6400's time per WU doubled. Basically it took twice as long to get 2 units done, leaving the same points per unit of time.
However, the E6600 with a second instance was able to run both instances at the same time per frame it managed with one instance. Ultimately I was able to double my output with an E6600 at the same overall clock speed, as long as I was getting a certain series of units.
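The arithmetic behind that, as a quick sketch. Times are normalized so that one instance takes 1.0 per WU on either chip. These are not my actual frame times, just the ratios described above, and `throughput` is a hypothetical helper.

```python
def throughput(instances, time_per_wu):
    """Work units finished per unit of time with all instances running at once."""
    return instances / time_per_wu

t = 1.0  # normalized time per WU with a single instance (same on both chips)

# E6400 (2 MB): a second instance doubles the time per WU.
e6400_one = throughput(1, t)
e6400_two = throughput(2, 2 * t)

# E6600 (4 MB): a second instance leaves the time per WU unchanged.
e6600_one = throughput(1, t)
e6600_two = throughput(2, t)

print("E6400 gain from 2nd instance:", e6400_two / e6400_one)
print("E6600 gain from 2nd instance:", e6600_two / e6600_one)
```

So the second instance bought nothing on the E6400 and doubled the output on the E6600, at the same clock speed.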
So cache can make a HUGE difference with the right application.