Bulldozer was designed for multithreaded loads, core count, and performance per watt. If you compare Piledriver to Core i7 Sandy Bridge (both second-generation 32nm products), you will find that PD is faster in MT loads and its performance per watt is acceptable (faster, while consuming more power).
http://www.anandtech.com/bench/Product/697?vs=287
I have said this before: the problem with AMD is that they lag behind in manufacturing, not in design.
Yes, that is because after all this time, since the first dual-core desktop CPU, desktop workloads are only just starting to become multithreaded. We were stagnating at a 10% IPC increase per year and a 15-20% increase in performance, because the focus was on single-threaded gains.
I'm not against single-thread increases, but everyone knows that is the most difficult thing to do. It is very easy to run into negative returns trying to get even 1% more IPC.
That was one of the most important reasons to go multicore and multithreaded.
Fusion, through HSA, is still their future and they are committed to it. The only problem I see here is that it is taking longer than expected to give them what they were aiming for.
Yes, they spent $5 billion and this investment hasn't paid off yet, but I believe we are seeing the first steps toward that future with Llano and Trinity.
I don't want to play IPC advocate for BD, but as you and I and a lot of us here know, IPC is application driven. You can have an IPC increase in one application and a decrease, or no change, in another.
Granted, IPC has fallen in the majority of legacy desktop code with Bulldozer, but it has increased in SIMD and other workloads.
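To put the "IPC is application driven" point in concrete terms, here is a rough Python sketch. IPC is just instructions retired divided by core cycles for a given run, so it has to be measured per application. The counter values and the application names below are made up purely for illustration; they are not measurements of Bulldozer or any other real CPU.

```python
def ipc(instructions_retired: int, core_cycles: int) -> float:
    """Instructions per cycle for one run of one application."""
    if core_cycles <= 0:
        raise ValueError("core_cycles must be positive")
    return instructions_retired / core_cycles


def ipc_change(old_ipc: float, new_ipc: float) -> float:
    """Relative IPC change going from the old core to the new core."""
    return (new_ipc - old_ipc) / old_ipc


# Hypothetical (instructions, cycles) pairs, NOT real benchmark data.
runs = {
    "legacy_scalar_app": {"old": (8_000_000, 5_000_000), "new": (8_000_000, 6_400_000)},
    "simd_heavy_app": {"old": (6_000_000, 5_000_000), "new": (6_000_000, 4_000_000)},
}

for app, counters in runs.items():
    old = ipc(*counters["old"])
    new = ipc(*counters["new"])
    print(f"{app}: old IPC {old:.2f}, new IPC {new:.2f}, change {ipc_change(old, new):+.1%}")
```

With these invented numbers the same new core loses IPC on the scalar run and gains it on the SIMD run, which is why a single "IPC went up" or "IPC went down" figure says little without naming the application.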