Bulldozer was designed for throughput: the entire module has more execution units (INT and FPU) than a single Intel core with HT.
Just because the BD architecture has a narrower INT execution unit, it doesn't mean AMD cannot create a wider INT core. Stars (Llano) had a wider core than BD, and yet BD has higher ST performance.
If they go with wider INT cores (3-4 ALUs vs. 2 ALUs in the BD uarch) in order to implement SMT within the module, then IPC and ST performance will increase tremendously. Throughput and efficiency will also skyrocket, but the module die size will be bigger.
That will be compensated for by the higher density and lower power consumption of the 14nm FinFET node.
Kaveri has 4 ALUs per module. Haswell has 4 per core as well. Haswell also has much more FP power thanks to its wider units.
That said, adding more units brings diminishing returns.
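To make the per-thread picture concrete, here is a rough sketch of how many ALUs one thread gets on average in each design. It assumes a Kaveri module statically splits its 4 ALUs between its two integer cores, while a Haswell core shares all 4 between its HT threads. The function name and the simple averaging are my own simplification, not a real model of either scheduler.

```python
def alus_per_active_thread(total_alus, hw_threads, active_threads, shared):
    """Average ALUs per running thread (toy model, ignores scheduling detail).

    shared=True  -> SMT-style: active threads split one common ALU pool.
    shared=False -> CMT-style: each hardware thread owns a fixed slice.
    """
    if shared:
        return total_alus / active_threads
    return total_alus / hw_threads


if __name__ == "__main__":
    for active in (1, 2):
        kaveri = alus_per_active_thread(4, 2, active, shared=False)
        haswell = alus_per_active_thread(4, 2, active, shared=True)
        print(f"{active} thread(s): Kaveri module {kaveri:.1f}, "
              f"Haswell core {haswell:.1f} ALUs per thread")
```

With both threads busy the two come out even, but a lone thread only gets the full 4 ALUs on the shared design, which is the ST gap being discussed.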
I'm not saying AMD is going to magically pull a 25-30% IPC gain out of the new arch...but personally I'd think of anywhere between 10-15% as realistic...and thus competitive if the price is right. About +10% on Excavator IPC at a 3.8/4GHz clock, plus an iGPU strong enough to master 1080p gaming? To me that would be an all kill.
Sure...Intel will still be heaps ahead in raw CPU power...but for the mid-to-low-range gamer (as one example of a consumer base) this is unimportant, as their budget wouldn't allow them to buy the fattest Intel CPUs anyway if they still need a decent GPU.
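To put rough numbers on those percentages: if you treat single-thread performance as roughly IPC times clock (a simplification that ignores memory, turbo behaviour and workload mix), the scenarios work out as in this sketch. The 3.8GHz baseline and the function name are placeholders of mine, not measured figures.

```python
def relative_st_perf(ipc_gain, base_clock_ghz, new_clock_ghz):
    """Single-thread speedup over the baseline, assuming perf ~ IPC * clock."""
    return (1.0 + ipc_gain) * (new_clock_ghz / base_clock_ghz)


if __name__ == "__main__":
    base_clock = 3.8  # hypothetical baseline clock in GHz
    for ipc_gain in (0.10, 0.15, 0.25, 0.30):
        for clock in (3.8, 4.0):
            ratio = relative_st_perf(ipc_gain, base_clock, clock)
            print(f"IPC +{ipc_gain:.0%} @ {clock} GHz -> {ratio:.2f}x baseline")
```

Under this model a clock bump from 3.8 to 4GHz adds only about 5% on top of whatever the IPC gain is, so most of the difference between the scenarios comes from IPC.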
Agree.