Where in the world did you come up with 8C/8T = 6C/12T? Based on what data? Scaling doesn't change with more threads if the program actually uses 8 threads. A hypothetical 8-core i5 760 would be 90-100% faster than a regular i5 760 in a program that uses 8 threads instead of 4. If you argue that few programs use 8 threads, then I agree, which is why BD being slower in 1-4 threads is critical.
Simple maths:
Speedup, given perfect scaling:
1 to 2 cores: 100%
2 to 4 cores: 100%
4 to 6 cores: 50%
6 to 8 cores: 33%
If an application scales almost as well as it possibly can, you'll see a speedup of 45-50% going from 4 to 6 cores, but scaling past 6 cores has almost always been worse, so a speedup of around 20% from 6 to 8 is more typical. Take into account that HyperThreading adds about 20% on average in multi-threaded applications, and there you go. Even if we assume perfect scaling, that's 33% (8/6), so only somewhat more than putting HT on a six-core chip.
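To make the arithmetic easy to check, here is a rough sketch in Python. It assumes perfect scaling across cores and takes the 20% average HyperThreading gain quoted above as given; the function names are made up for illustration, and none of this is measured data. By this arithmetic the 6-to-8 step comes to 8/6, roughly 33%.

```python
def step_speedup(cores_before, cores_after):
    """Percentage gain from adding cores, assuming perfect scaling."""
    return (cores_after / cores_before - 1) * 100


def relative_throughput(cores, ht_gain=0.0):
    """Throughput in units of one core; ht_gain is an assumed HT uplift."""
    return cores * (1 + ht_gain)


# The four steps from the list above.
for before, after in [(1, 2), (2, 4), (4, 6), (6, 8)]:
    print(f"{before} to {after} cores: {step_speedup(before, after):.0f}%")

# 8C/8T versus 6C/12T, with HT assumed to add 20% (the figure from the post).
eight_no_ht = relative_throughput(8)
six_with_ht = relative_throughput(6, ht_gain=0.20)
lead = (eight_no_ht / six_with_ht - 1) * 100
print(f"8C/8T lead over 6C/12T: {lead:.1f}%")
```

Under those assumptions the two configurations land fairly close together, and any real-world drop in scaling past 6 cores narrows the gap further.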
i7-990X is not an SB architecture and it only has 6 cores. So how would it be 2x faster than a 4 core SB? Your comparison makes no sense.
How does my comparison against another Nehalem chip, the only difference being that it has HyperThreading, not make any sense? How does yours make more sense? Mine makes perfect sense. On a perfectly multi-threaded application, a Core i7-990X would be 50% faster than a Core i5 750 at the same clock speed.