PCgameshardware : Bulldozer? Please. Intel Confirms 8 Core SB-E For Q3


HW2050Plus

Member
Jan 12, 2011
168
0
0
Hmm.

8 module/16 core Bulldozer or 8 core/16 thread Sandy Bridge?

I wonder clock for clock which will be better.
There are two problems with your question. First, "better" in what respect? Price? Power consumption? Usability? Performance? And second, why are you asking about "clock for clock"?

Anyway. It is clear that an 8-module/16-core Bulldozer will perform significantly better than an 8-core/16-thread Sandy Bridge, without any question.
 

exar333

Diamond Member
Feb 7, 2004
8,518
8
91
There are two problems with your question. First, "better" in what respect? Price? Power consumption? Usability? Performance? And second, why are you asking about "clock for clock"?

Anyway. It is clear that an 8-module/16-core Bulldozer will perform significantly better than an 8-core/16-thread Sandy Bridge, without any question.

If the AMD 16-core BD is not significantly faster (>50%) than an 8-core SB there is something seriously wrong.
 

HW2050Plus

Member
Jan 12, 2011
168
0
0
It's a natural question to ask, IMO.

Personally I'm interested in knowing more about the IPC aspects of any given microarchitecture as well, always have been.
The problem is that this parameter is outdated in today's CPUs.

So what do you mean by IPC? IPC per socket, right? Since that is largely determined by the core/thread count, the core/thread count replaces the outdated IPC figure, especially as IPC depends more on the software you run than on the CPU you run it on.

And which program do you intend to use to determine IPC?
 

HW2050Plus

Member
Jan 12, 2011
168
0
0
If the AMD 16-core BD is not significantly faster (>50%) than an 8-core SB there is something seriously wrong.
Exactly!
8 BD cores + 8 CMT cores, where CMT gives a ~80% increase
vs.
8 SB cores + 8 SMT cores, where SMT gives a ~15% increase
That should be quite clear.
 

bandgit

Member
Mar 7, 2011
36
0
66
This is standard policy for all of our launches. It's just that people usually don't do this much speculation and rumor.

Well, again with all due respect, but AMD does not have the most spotless reputation in these cases. There is no point to keep bringing up the Catalonian capital city, but the performance expectations vs. reality turned out to be a train wreck unlike any other in memory. If there is speculation and rumor it can be largely attributed to the possibility that BD will be a very effective pure number cruncher, but will be another epic phail when it comes to the "real benchmarks" that are relevant to the vast majority of PC powerusers.

I'm not an Intel fanboi as I sincerely would be happy to lay my money where my mouth is and buy the first available Bulldozer but I have to know that I'm not buying a pig in a poke. :)
 

Accord99

Platinum Member
Jul 2, 2001
2,259
172
106
Exactly!
8 BD cores + 8 CMT cores whereas CMT gives ~80% increase
vs.
8 SB cores + 8 SMT cores whereas SMT gives ~15% increase
That should be quite clear.
Unless the SB core has 55% more throughput, which is quite a reasonable assumption given how much more powerful SB is as compared to current AMD cores and how much more power-efficient it is.
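
As a rough sanity check on that 55% figure, here is a minimal sketch of the break-even arithmetic. It assumes throughput per module (or per core) is simply a base per-core throughput times (1 + multithreading gain), with equal module and core counts (8 vs. 8). The function name `break_even_ratio` is made up for illustration, and the 80%, 15% and 30% gains are the numbers thrown around in this thread, not measurements.

```python
def break_even_ratio(cmt_gain: float, smt_gain: float) -> float:
    """Per-core throughput ratio (SMT chip / CMT chip) at which both chips tie.

    Assumes equal module/core counts and that the second thread adds
    `cmt_gain` or `smt_gain` on top of one base core (a simplification).
    """
    return (1.0 + cmt_gain) / (1.0 + smt_gain)


# Hypothetical inputs taken from the thread: 80% for CMT, 15% or 30% for SMT.
for smt_gain in (0.15, 0.30):
    ratio = break_even_ratio(0.80, smt_gain)
    print(f"SMT gain {smt_gain:.0%}: SB core needs {ratio - 1:.1%} more throughput to tie")
```

Under these assumptions the SB core would need roughly 57% more throughput to tie if SMT gives 15%, and roughly 38% more if SMT gives 30%, so the 55% figure is in the right ballpark for the first case.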
 

Elixer

Lifer
May 7, 2002
10,371
762
126
Well, if you are paying $16K for a "small workload server" then you are in the extreme minority. Average sales price for 2P servers is ~$5.

JFAMD, $5 ? Hook me up, I can send you $100 for 20 of these bad boys.! ;)
[/me starts the stopwatch to see how long before JFAMD corrects that typo...] ():)
 

HW2050Plus

Member
Jan 12, 2011
168
0
0
Unless the SB core has 55% more throughput, which is quite a reasonable assumption given how much more powerful SB is as compared to current AMD cores and how much more power-efficient it is.
Reasonable? First, regarding current cores, you are obviously forgetting that Sandy Bridge has SMT while the current AMD cores do not. So you have to strip out the SMT gain before you compare cores. And the SMT gain is even larger on SB thanks to its unique instruction trace cache for loops, which helps SMT by reducing decoder stalls, still a big problem for SB.

In addition, the Bulldozer cores are faster than those of the current Stars lineup.

So if, in a comparison of Bulldozer 8M/16C vs. Sandy Bridge 8C/16T, the Bulldozer does not bulldoze the Sandy Bridge part on performance, then AMD must have failed catastrophically.

The bigger problem is that Intel will also release the high-priced extreme Sandy Bridge EN parts with 8 cores in the consumer market, and those Intel Extreme parts will then compete with AMD Zambezi with 4 modules/8 cores. There the Sandy Bridge EN will be clearly superior. Since AMD has confirmed that it does not plan an 8M/16C part for the consumer market, AMD will either lose the performance crown it just gained or prepare another part (e.g. 6M/12C) to contest it, which is unknown for now.

So, as clear as it is that an 8M/16C part will be much faster than Sandy Bridge EN, this will likely only matter in the server market, as there will be no 8M/16C client part from AMD.

So at the beginning of 2012 we will likely see Intel Extreme (aka Sandy Bridge EN) as the top performer at a high price, followed by AMD parts with lower performance at lower prices, and Intel 1156 parts with even lower performance and even lower prices. Only in the server market will things look different.

The comparisons between SB 4C/8T and BD 3M/6C, or the corresponding doubles (SB EN 8C/16T and BD 6M/12C), will be the really interesting performance comparisons, with the outcome still unclear.

All the other comparisons, with either Intel or AMD having more modules/cores, are quite clear. At the same count, AMD wins thanks to the huge CMT vs. SMT advantage. When one side has double the count, that side quite obviously wins.

But SB 4C/8T vs. BD 3M/6C will be nasty: 2 fewer threads for AMD, but on the other hand the CMT advantage.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Exactly!
8 BD cores + 8 CMT cores whereas CMT gives ~80% increase
vs.
8 SB cores + 8 SMT cores whereas SMT gives ~15% increase
That should be quite clear.

Not exactly fair. 80% is a theoretical number. SMT is supposed to be up to 30%.

And the SB SMT gain even increased with the SB unique feature of instruction trace cache for loops which helps to improve the gains from SMT because it reduces decoder stalls which is still a big problem of SB.

Whatever few % it'll gain from that will be lost on AVX codes.
 

HW2050Plus

Member
Jan 12, 2011
168
0
0
Not exactly fair. 80% is a theoretical number. SMT is supposed to be up to 30%.
The 80% gain from CMT is an average over several benchmarks that ran on simulated BD hardware, according to AMD. The 30% gain from SMT is the most you can get from SMT, not the average. As you know, there are also applications that run slower with SMT enabled, meaning SMT can also have a negative gain. That those benchmarks have been removed from the AnandTech benchmark sets does not mean they no longer exist.

Sure, my "15%" is just my number, and feel free to put in any other number you like, e.g. "30%"; and yes, SB especially can reach this 30% gain from SMT in some applications. That will not change the overall picture, btw.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
When discussing theoretical things, it's good to keep even the numbers theoretical. There's no point in agonizing over little details because they'll easily be offset by something else.

That said, 30% is quite common on server workloads. Client is different, but client workloads also gain less from 2x the cores.

As you know there are also applications which run slower if SMT is enabled, means that SMT can also have a negative gain.

Like 2% in lightly threaded apps, in about 1 in 1000 applications. SMP systems like the Core 2 Quad 9775 showed degradation compared to non-SMP systems too.


Comparing SMT vs. 50% more cores (only counting applications that show a clear gain from multi-threading, to weed out Turbo and clock/cache effects; also subtracting 3-5% for the 2500 vs. 2600 comparison):
http://www.anandtech.com/bench/Product/288?vs=287

http://www.anandtech.com/bench/Product/102?vs=203

X264 HD Encode Test 2nd pass:
SMT: 28%
1.5x: 47.5%

3DSMax R9 CPU Test:
SMT: 10%
1.5x: 20.8%

Cinebench MT bench:
SMT: 10%
1.5x: 34%

POV-Ray 3.7 Beta 23 SMP Benchmark:
SMT: 40%
1.5x: 43.9%

Par2 Multithreaded par2cmdline 0.4:
SMT: 10%
1.5x: 37.5%

Blender 2.48a Character Render
SMT: 12%
1.5x: 30.4%

Microsoft Excel 2007 SP1 - Monte Carlo Simulation
SMT: 35%
1.5x: 46.3%

Sorenson Squeeze 5 - Flash Video Creation
SMT: 21%
1.5x: 23.2%

WinRAR 3.8 Compression 300MB Archive
SMT: 15%
1.5x: 19.1%

x264 HD Benchmark - 2nd Pass
SMT: 21%
1.5x: 41.9%

7-zip Benchmark:
SMT: 32%
1.5x: 44.3%

Microsoft Excel SP1
SMT: 25%
1.5x: 43.8%

SMT average gain normalized for clock: 21%
1.5x cores gain: 36.1%

Out of 12 benchmarks:
- 2 applications show SMT giving nearly as high a gain as having 50% more cores
- 6 additional applications show SMT giving more than 50% of the gain from 1.5x the cores
- The remaining 4 benchmarks show 0.25-0.35x the gain of having 1.5x the cores
- Overall, Hyperthreading gives gains that are 60% of what's possible with 1.5x the cores, or equal to having 1/3 more physical cores
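
The averages and bucket counts above can be re-derived from the listed per-benchmark numbers. A minimal sketch, using only the figures posted above: the 0.9 and 0.5 cut-offs for "nearly as high" and "more than 50%" are assumed thresholds that the post does not state, and the raw SMT mean lands slightly above the quoted 21%, presumably because of the clock normalization mentioned.

```python
from statistics import mean

# (benchmark, SMT gain in %, 1.5x-cores gain in %), copied from the list above
results = [
    ("x264 HD Encode Test 2nd pass", 28, 47.5),
    ("3DSMax R9 CPU Test", 10, 20.8),
    ("Cinebench MT", 10, 34.0),
    ("POV-Ray 3.7 Beta 23 SMP", 40, 43.9),
    ("Par2 Multithreaded par2cmdline 0.4", 10, 37.5),
    ("Blender 2.48a Character Render", 12, 30.4),
    ("Excel 2007 SP1 Monte Carlo", 35, 46.3),
    ("Sorenson Squeeze 5", 21, 23.2),
    ("WinRAR 3.8 Compression", 15, 19.1),
    ("x264 HD Benchmark 2nd Pass", 21, 41.9),
    ("7-zip Benchmark", 32, 44.3),
    ("Microsoft Excel SP1", 25, 43.8),
]

smt_avg = mean(smt for _, smt, _ in results)
cores_avg = mean(cores for _, _, cores in results)

# Fraction of the 1.5x-cores gain that SMT delivers, per benchmark
ratios = [smt / cores for _, smt, cores in results]

# Assumed cut-offs (not from the post): >= 0.9 counts as "nearly as high",
# between 0.5 and 0.9 counts as "more than 50% of the gain".
near_equal = sum(r >= 0.9 for r in ratios)
above_half = sum(0.5 < r < 0.9 for r in ratios)
rest = len(ratios) - near_equal - above_half

print(f"SMT average gain:        {smt_avg:.1f}%")
print(f"1.5x cores average gain: {cores_avg:.1f}%")
print(f"SMT / 1.5x cores:        {smt_avg / cores_avg:.2f}")
print(f"buckets: {near_equal} near-equal, {above_half} above half, {rest} below")
```

With those cut-offs the split comes out as 2, 6 and 4, matching the summary, and the ratio of the two averages is close to the 60% quoted.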

In servers, the same websites that used to recommend turning Hyperthreading off now advise the reverse. Still, a lot of server users make the dumb assumption and disable it.

http://www.anandtech.com/show/2774/10

In some of the server comparisons, having Hyperthreading disabled would by itself have put it even below the Opteron, whereas with it enabled it could have been significantly ahead.
 

HW2050Plus

Member
Jan 12, 2011
168
0
0
@IntelUser2000
That is strange, though that SMT shines in servers is quite clear to me. Server workloads generally scale nearly linearly.

To sum up your findings: you got this 21% gain from SMT. But that already includes the scaling issues of the applications you used.

I don't know for sure whether AMD's 80% number already includes scaling issues, but apparently it doesn't.

So okay, you convinced me; I revise my figure from 15% to 30% (but for Sandy Bridge only). Looks like the loop trace cache works really well in Sandy Bridge.

But as I said, that will not change the picture we will see with FX8000 vs. 2600, as 80% vs. 30% is still a large enough advantage for CMT over HT.

Maybe it would be a good idea for the next Bulldozer generation to add such a loop trace cache to get even more from CMT.
 

JFAMD

Senior member
May 16, 2009
565
0
0
If the AMD 16-core BD is not significantly faster (>50%) than an 8-core SB there is something seriously wrong.

You are not looking at it the right way. You need to look per socket, not by core. If the $800 Opteron is faster than the $800 Xeon, then AMD has a winner. Looking at things on a "per core" basis is not a good way to look at it because that is not how customers buy. Even as a consumer you probably say "I have $300 in my budget, what is the highest performance processor that I can get for the apps I run for those dollars."


Well, again with all due respect, but AMD does not have the most spotless reputation in these cases. There is no point to keep bringing up the Catalonian capital city, but the performance expectations vs. reality turned out to be a train wreck unlike any other in memory. If there is speculation and rumor it can be largely attributed to the possibility that BD will be a very effective pure number cruncher, but will be another epic phail when it comes to the "real benchmarks" that are relevant to the vast majority of PC powerusers.

I'm not an Intel fanboi as I sincerely would be happy to lay my money where my mouth is and buy the first available Bulldozer but I have to know that I'm not buying a pig in a poke. :)

So, let's address Barcelona. First, keep in mind that the 40% number people seem to throw around a lot was connected to server, NOT desktop (Randy was my boss and he was in charge of the server business at the time, not the desktop business.) Barcelona was 40% faster than Xeon in many benchmarks (STREAM, FP, most 4P benchmarks.)

Since Barcelona we introduced the following:

Shanghai, about a quarter early, ~200MHz over anticipated clock speed
Istanbul, about 2 quarters early, ~2-300MHz over anticipated clock speed
Magny Cours, about a quarter early, ~200MHz over anticipated clock speed

JFAMD, $5 ? Hook me up, I can send you $100 for 20 of these bad boys.! ;)
[/me starts the stopwatch to see how long before JFAMD corrects that typo...] ():)

Yeah, welcome to the internet. Nobody is perfect, my wife just pointed that fact out to me this morning. It is $5K. But if it was $5, you'd need to get in line behind me.
 

exar333

Diamond Member
Feb 7, 2004
8,518
8
91
You are not looking at it the right way. You need to look per socket, not by core. If the $800 Opteron is faster than the $800 Xeon, then AMD has a winner. Looking at things on a "per core" basis is not a good way to look at it because that is not how customers buy. Even as a consumer you probably say "I have $300 in my budget, what is the highest performance processor that I can get for the apps I run for those dollars."



So, let's address Barcelona. First, keep in mind that the 40% number people seem to throw around a lot was connected to server, NOT desktop (Randy was my boss and he was in charge of the server business at the time, not the desktop business.) Barcelona was 40% faster than Xeon in many benchmarks (STREAM, FP, most 4P benchmarks.)

Since Barcelona we introduced the following:

Shanghai, about a quarter early, ~200MHz over anticipated clock speed
Istanbul, about 2 quarters early, ~2-300MHz over anticipated clock speed
Magny Cours, about a quarter early, ~200MHz over anticipated clock speed



Yeah, welcome to the internet. Nobody is perfect, my wife just pointed that fact out to me this morning. It is $5K. But if it was $5, you'd need to get in line behind me.

So essentially you are saying that every current AMD cpu is a complete waste of money because a $200 SB will wipe the floor with it every time?
 

Vette73

Lifer
Jul 5, 2000
21,503
9
0
So essentially you are saying that every current AMD cpu is a complete waste of money because a $200 SB will wipe the floor with it every time?


No. What if someone does not want to spend $200 on JUST the CPU? For $200 I can get a board, quad-core AMD CPU, video card, and RAM.

Yeah, you will be faster in benchmarks. But in everyday use, for most people the only difference will be that the AMD person has more money in their pocket.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
No what if someone does not want to spend $200 on JUST the cpu? For $200 I can get a board, Quad AMD cpu, video card, and Ram.

Yeah, you will be faster in benchmarks. But in everyday use, for most people the only difference will be that the AMD person has more money in their pocket.

Yea, and for $100 you can get a complete P4 system off ebay, and save even more money. And it will run Office and surf the internet just fine.

Your argument is flawed as a person can always buy a cheap system.
 

Vette73

Lifer
Jul 5, 2000
21,503
9
0
Yea, and for $100 you can get a complete P4 system off ebay, and save even more money. And it will run Office and surf the internet just fine.

Your argument is flawed as a person can always buy a cheap system.


The system I was speaking of has SATA 6Gb/s, USB 3.0, etc... everything the Intel one would have, or more.
Your P4 comment is worthless let alone outdated.
 

piesquared

Golden Member
Oct 16, 2006
1,651
473
136
You are not looking at it the right way. You need to look per socket, not by core. If the $800 Opteron is faster than the $800 Xeon, then AMD has a winner. Looking at things on a "per core" basis is not a good way to look at it because that is not how customers buy. Even as a consumer you probably say "I have $300 in my budget, what is the highest performance processor that I can get for the apps I run for those dollars."




So, let's address Barcelona. First, keep in mind that the 40% number people seem to throw around a lot was connected to server, NOT desktop (Randy was my boss and he was in charge of the server business at the time, not the desktop business.) Barcelona was 40% faster than Xeon in many benchmarks (STREAM, FP, most 4P benchmarks.)

Since Barcelona we introduced the following:

Shanghai, about a quarter early, ~200MHz over anticipated clock speed
Istanbul, about 2 quarters early, ~2-300MHz over anticipated clock speed
Magny Cours, about a quarter early, ~200MHz over anticipated clock speed



Yeah, welcome to the internet. Nobody is perfect, my wife just pointed that fact out to me this morning. It is $5K. But if it was $5, you'd need to get in line behind me.

Yeah, but I would bet large that that will be Intel's pitch from their propaganda machine. They appear to already be admitting they will be way behind on performance/$ and performance/socket, so they're just trying to move the goal posts again. Business as usual. I doubt that anyone falls for their tired old tactics though; in fact I'm guessing they are alienating many informed readers of tech forums who are knowledgeable enough to see through the BS.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
The system I was speaking of has SATA 6gb, usb3.0, etc... everything, or more, than Intel one would have.
Your P4 comment is worthless let alone outdated.

This thread is about BD and SB-E systems, which are high end. Your comments about a sub-$200 system were meaningless. There is always going to be something cheaper out there, and I do not understand why people always bring that up in threads discussing high-end systems.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Anandtech results showed less gain on Sandy Bridge from SMT than on Nehalem, that of course depends on the application, but on average it was smaller.

http://www.computerbase.de/artikel/prozessoren/2011/test-intel-sandy-bridge/47/

Given the fundamental reason why SMT shows any benefit to begin with, anything Intel does to improve the microarchitecture (and compilers, for that matter) such that pipeline stalls occur less frequently will result in decreased efficacy of SMT itself.

The value proposition of SMT diminishes in the limit of architecture improvements. It lives in a bubble so to speak, and the bubble is collapsing.

This is why CMT is viewed to be an approach with far more "opportunity to grow" as there are not fundamental limits to scaling in a CMT design, instead you have a continuously variable tradeoff to make between hardware sharing (resource contention) and die-size with the ultimate limit being indistinguishable from a simple CMP implementation.
 

JFAMD

Senior member
May 16, 2009
565
0
0
My point is about the size of the aggregate market, not the market that reads this board.

I spent $5K on a mountain bike, but the group of riders at my level represents well less than 1% of the total aggregate mountain bike market.

It would be just as pointless for me to say that unless someone has a 140MM Fox Vanilla RLC on their bike it is worthless and they have lost.

People here are making statements about the aggregate market in general using a proof point that applies to less than 1% of the total market. If people were to say "unless company X can win in benchmark 123 they will never break into the application Y leadership" then I wouldn't have an issue.

But the market is huge, with millions of PCs per quarter being sold, and there is no silver bullet that drives the whole market as one.

I would also bet that cost is a far bigger driver than performance. If this were not the case, everyone on this board would be running a $1000 Intel CPU, and I am guessing that even in a highly performance-minded world like this, more people are driven by performance for the money (buying the best performance for their budget) vs. just buying the highest raw performance out there.
 

Mopetar

Diamond Member
Jan 31, 2011
8,532
7,795
136
JFAMD, $5 ? Hook me up, I can send you $100 for 20 of these bad boys.! ;)
[/me starts the stopwatch to see how long before JFAMD corrects that typo...] ():)

Dude, you don't want a $5 server. Trust me. The last time I tried to buy one I got a hobo with an abacus in the mail. The FP ability of those things is crap.
 

Vette73

Lifer
Jul 5, 2000
21,503
9
0
This thread is about BD and SB-E systems, which are high end. Your comments about a sub-$200 system were meaningless. There is always going to be something cheaper out there, and I do not understand why people always bring that up in threads discussing high-end systems.


Maybe you should have read who and what I was replying to. He said "every current AMD cpu is a complete waste of money because a $200 SB". I replied back with information about a current AMD chip since HE brought it up showing his blanket statement is false for the majority of people.

Reading skills, you ever learn them? :hmm: