Bulldozer has 33% less ALUs than K10

Cogman

Lifer
Sep 19, 2000
10,286
147
106
Yes, exchange is one of those "niche applications" that could be impacted.

http://blogs.amd.com/work/2010/01/21/it’s-all-about-the-cores/

I will not argue the fact that there are places that SMT can bring a throughput increase. I will, however, argue about whether it is the best way to deliver more throughput. Anytime I see an enterprise software application recommend turning off a feature in order to provide more stability or accuracy, it's a big red flag for me.
I don't know if it is the best way. More, it is a way.

HT benefits primarily from different threads using different areas of the CPU (SIMD instructions, FPU instructions, ALU instructions, and movement instructions, to name a few). The reason some programs won't benefit from it is that most programs don't come close to using the full instruction set of a CPU.

It doesn't make sense to break up an application that primarily uses one part of the CPU in a hyper-threading environment. It will still only use that one part of the CPU, causing collisions galore (best case, losing no performance; worst case, seeing a decrease in speed). Most business applications fall into that category. The FPU is simply rarely touched in a business environment (unless they are doing stock predictions).

Now, if you have an app that uses, fairly heavily, multiple modules of the CPU, then threading it in a hyper-threading situation makes sense. Each thread will be doing different things, giving a good chance that one thread will start using the FPU while the other is using the ALU.

I do agree with you though, a properly coded multi-threaded application shouldn't suffer stability issues with or without hyperthreading. It may, however, see no performance benefits, and even a performance decrease.
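
To make the collision argument concrete, here is a toy Python sketch. It is not a model of any real Intel or AMD core: it assumes an imaginary core with exactly one ALU port and one FPU port, two hardware threads, and one instruction attempted per thread per cycle. The function name simulate_smt and the port labels are made up for illustration.

```python
def simulate_smt(stream_a, stream_b, cycles=1000):
    """Toy SMT model (an assumption for illustration, not real hardware).

    The imaginary core has one ALU port and one FPU port. Each cycle, each of
    the two hardware threads tries to issue its next instruction; if the port
    it needs is already taken this cycle, that thread stalls.
    Returns the average number of instructions issued per cycle.
    """
    streams = (stream_a, stream_b)
    pos = [0, 0]          # index of the next instruction for each thread
    issued = 0
    for cycle in range(cycles):
        busy = set()      # ports already claimed during this cycle
        # Alternate which thread picks first so neither one starves.
        order = (0, 1) if cycle % 2 == 0 else (1, 0)
        for t in order:
            port = streams[t][pos[t] % len(streams[t])]
            if port not in busy:
                busy.add(port)
                pos[t] += 1
                issued += 1
    return issued / cycles


if __name__ == "__main__":
    # Both threads hammer the ALU: they collide on every cycle.
    print("ALU vs ALU:", simulate_smt(["ALU"], ["ALU"]))
    # One integer thread and one floating-point thread: no collisions.
    print("ALU vs FPU:", simulate_smt(["ALU"], ["FPU"]))
```

In this toy setup, two ALU-only threads share a single port and get no combined gain over one thread, while an ALU thread paired with an FPU thread issues from both ports every cycle. Real cores have many more ports, out-of-order scheduling, and shared caches, so treat this only as an illustration of the idea.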
 

Stefan Payne

Senior member
Dec 24, 2009
253
0
0
Is this a problem or not ?
No, 'cause there are 2 ALU Cores per cluster.

The problem is how to define, e.g., a quad-core:
Should we compare a two-cluster 'Bulldozer' CPU with an actual quad-core K10 or not?
Especially since the two-cluster Bulldozer is about the size that a 2-core K10 (with L3) would be.

Another question is what the single-thread performance of that upcoming AMD architecture will be.
Then why do some server software vendors recommend turning HT off?
For the same reason they switch off the turbo modes.

The performance isn't predictable, so they switch it off, so they know what would happen.

Oh and btw: SMT doesn't always gain performance; sometimes it's lost, especially with the Intel implementation, which just seems to fill gaps in the pipeline.
 

evolucion8

Platinum Member
Jun 17, 2005
2,867
3
81
AMD's latest architecture has always been very competitive against Intel's latest architecture in terms of FPU performance, and since 90% of applications use integer instead of FPU, it makes a lot of sense to add more ALU cores than FPU cores.
 

DrMrLordX

Lifer
Apr 27, 2000
23,232
13,321
136
AMD's latest architecture has always been very competitive against Intel's latest architecture in terms of FPU performance, and since 90% of applications use integer instead of FPU, it makes a lot of sense to add more ALU cores than FPU cores.

Two points:

1). Back in the K6/Pentium II days, that most certainly was not the case. AMD lagged in FPU performance all the way until the release of the Athlon. Intel exploited that fact to the hilt, especially in games.

2). If Intel regains the huge FPU lead they had in the pre-Athlon days, either in raw x87 performance or SSEx/SIMD performance, you can expect that they will be doing everything they can to manipulate software development back in the direction of FPU-dependence.
 

Stefan Payne

Senior member
Dec 24, 2009
253
0
0
Yeah but nowadays the FPU performance doesn't seem to matter much.
That's what GPUs are for, so Integer matters more.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Should we compare a two cluster 'Bulldozer' CPU with an actual quad K10 or not?

You make price/performance comparisons with the apps that are relevant to you.

I don't care if they compare virtual core count to bulldozer module count, or cores vs cores, or threads vs threads...it all comes down to what performance at what cost (and cost can include platform costs as well as power consumption costs if those metrics are of relevance to the consumer as well).

1). Back in the K6/Pentium II days, that most certainly was not the case. AMD lagged in FPU performance all the way until the release of the Athlon. Intel exploited that fact to the hilt, especially in games.

Having owned a K6-2 chip back in those days I would actually argue it was AMD that was exploiting the fact that many synthetic desktop benchmark programs at the time were heavily integer dependent and so AMD performance-rated their K6 chips based almost entirely on integer performance while their FPU sucked wind.

Yeah, I got burned on that one when my program of relevance (Gaussian94 at the time) performed 1/2 as well as on an equivalent-cost Pentium rig...my upgrade from the K6-2 was to a PII (and then an Athlon after that, for the reasons you mentioned).
 

Accord99

Platinum Member
Jul 2, 2001
2,259
172
106
Exchange server, for one.
Going back through AnandTech's IT reviews that compare HT and non-HT scores, we see that enabling HT changes throughput as follows:

SAP: +31.6% (estimated)
OLTP Dell DVD store: +20.7%
OLTP Oracle Calling Circle: +35.0%
Decision Support SQL Server 2005: +22.0%
Website MCS eFMS 9.2: -13.2% (dual 5570s); +14.6% (single 5570)
3DS Max 2008: +18.9%
MS Exchange 2007: -12.6%
ESX 3.5: +17.4%


And adding two cores, going from the 2384 to the 2435, for the benchmarks in common:

OLTP Oracle Calling Circle: +22.1%
Decision Support SQL Server 2005: +41.5%
Website MCS eFMS 9.2: -6.9%
3DS Max 2008: +1.5%
ESX 3.5: +24.4%

So, we can see that HT generally gives a measurable increase in throughput. For MCS eFMS, the 8-thread limit causes performance decreases both with HT and with additional cores. MS Exchange suffers a loss, but there are no further tests to determine whether the cause is HT itself or something similar to the MCS eFMS limit. The throughput benefit from adding cores is also highly dependent on the application.
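
For anyone who wants the two lists side by side, here is a small Python sketch that lines up the benchmarks they have in common. The percentages are the ones quoted above; the dictionary and function names are made up, and using the dual-5570 figure for MCS eFMS is my assumption. Keep in mind the HT numbers and the extra-core numbers come from different platforms, so this only illustrates how application-dependent both are.

```python
# Percent throughput change from enabling HT (figures quoted in the post above).
ht_gain = {
    "OLTP Oracle Calling Circle": 35.0,
    "Decision Support SQL Server 2005": 22.0,
    "Website MCS eFMS 9.2": -13.2,  # assumption: use the dual-5570 figure
    "3DS Max 2008": 18.9,
    "ESX 3.5": 17.4,
}

# Percent throughput change from two extra cores (2384 -> 2435), as quoted above.
two_more_cores_gain = {
    "OLTP Oracle Calling Circle": 22.1,
    "Decision Support SQL Server 2005": 41.5,
    "Website MCS eFMS 9.2": -6.9,
    "3DS Max 2008": 1.5,
    "ESX 3.5": 24.4,
}


def compare(ht, cores):
    """Return (benchmark, HT %, extra-core %, HT minus extra-core) rows
    for every benchmark that appears in both dictionaries."""
    return [
        (name, ht[name], cores[name], ht[name] - cores[name])
        for name in ht
        if name in cores
    ]


if __name__ == "__main__":
    for name, ht_pct, core_pct, diff in compare(ht_gain, two_more_cores_gain):
        print(f"{name:34s} HT {ht_pct:+6.1f}%   +2 cores {core_pct:+6.1f}%   diff {diff:+6.1f}")
```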
 

Accord99

Platinum Member
Jul 2, 2001
2,259
172
106
I will not argue the fact that there are places that SMT can bring a throughput increase. I will, however, argue about whether it is the best way to deliver more throughput.
Nobody argues that it is; however, SMT is clearly the most cost-effective method of increasing throughput in terms of die size and power usage. Otherwise, it wouldn't be found in so many server CPUs and even in console CPUs.
 

Schmide

Diamond Member
Mar 7, 2002
5,791
1,101
126
Intel owns a compiler, they can do anything that they want.

I would disagree. To me they have a conflict of interest by licensing the technology (MMX/SSE etc) then ignoring it when it's convenient; thus making their product artificially superior to their competition. It just happens that they do have a superior product at the moment, but the margins could be less if they played fair.

With regard to HT and performance, I often wonder whether the threads sharing the L1/L2 caches provides some level of benefit beyond keeping the execution pipelines full. It seems Bulldozer's division of resources around the L2/decode would provide some of this theoretical benefit as well.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,227
126
Otherwise, it wouldn't be found in so many server CPUs and even in console CPUs.

What console CPUs support SMT? The Power-something chip in the Xbox360 doesn't have SMT, does it? It has three real cores.

The Cell CPU in the PS3 doesn't have SMT either.
 

Accord99

Platinum Member
Jul 2, 2001
2,259
172
106
What console CPUs support SMT? The Power-something chip in the Xbox360 doesn't have SMT, does it? It has three real cores.
The Xbox 360's Xenon CPU is a triple core and each core supports two simultaneous threads:

http://en.wikipedia.org/wiki/Xenon_(processor)
http://www.anandtech.com/show/1719/3

The Cell CPU in the PS3 doesn't have SMT either.
The Cell's single PPE also supports two simultaneous threads.

http://www.anandtech.com/show/1647/3
 

JFAMD

Senior member
May 16, 2009
565
0
0
I do agree with you though, a properly coded multi-threaded application shouldn't suffer stability issues with or without hyperthreading. It may, however, see no performance benefits, and even a performance decrease.

You hit the nail on the head.

The performance benefit of SMT can help cover up the sins of bad code. I just don't think anyone understands to what degree.
 

JFAMD

Senior member
May 16, 2009
565
0
0
Nobody argues that it is; however, SMT is clearly the most cost-effective method of increasing throughput in terms of die size and power usage. Otherwise, it wouldn't be found in so many server CPUs and even in console CPUs.

So if it is the most cost-effective, why are Intel processors priced above AMD?

Either it is less cost-effective or Intel is sticking it to the customers. Take your pick, neither is a pleasant outcome.
 

dmens

Platinum Member
Mar 18, 2005
2,275
965
136
So if it is the most cost-effective, why are Intel processors priced above AMD?

Either it is less cost-effective or Intel is sticking it to the customers. Take your pick, neither is a pleasant outcome.

I really hope AMD is not subsidizing your trolling because that post is plain stupid.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
You make price/performance comparisons with the apps that are relevant to you.

I don't care if they compare virtual core count to bulldozer module count, or cores vs cores, or threads vs threads...it all comes down to what performance at what cost (and cost can include platform costs as well as power consumption costs if those metrics are of relevance to the consumer as well).

I'm sorry, but I have to say this paragraph questions the intelligence of your post. Do you not remember the prices when the Athlon 64 first came out?: http://anandtech.com/show/1164/5

Also prices before the Core 2 Duo release?

The pricing is almost entirely determined by competitive position. AMD is not pricing the top Phenom II X6 SKU at $300 because they want to, or because "they want to be nice to the customers"; they know that's what makes the chip competitive with the i7s.

Thuban = 346 mm²
Nehalem = 263 mm²

You are PERFECTLY aware that bigger die sizes generally result in greater defects and greater costs.

Though it's questionable whether SMT on Nehalem is cost-effective compared to other MT solutions, since cores don't take up too much of the die, the truth is SMT is one of the most effective ways of increasing performance.
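
As a rough illustration of why die area matters for cost, here is a Python sketch using the textbook Poisson yield model (yield = exp(-area x defect density)). Only the two die areas come from the figures above. The wafer size and the defect density are made-up assumptions, the same defect density is assumed for both chips even though they come from different fabs and processes, and edge loss is ignored, so the absolute numbers mean nothing; only the trend does.

```python
import math

WAFER_DIAMETER_MM = 300.0   # assumption: 300 mm wafers
DEFECTS_PER_CM2 = 0.5       # assumption: made-up defect density, same for both chips


def gross_dies(die_area_mm2, wafer_diameter_mm=WAFER_DIAMETER_MM):
    """Crude upper bound on dies per wafer: wafer area / die area, no edge loss."""
    wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2
    return int(wafer_area_mm2 // die_area_mm2)


def poisson_yield(die_area_mm2, defects_per_cm2=DEFECTS_PER_CM2):
    """Poisson yield model: probability that a die has zero defects."""
    die_area_cm2 = die_area_mm2 / 100.0
    return math.exp(-die_area_cm2 * defects_per_cm2)


def good_dies(die_area_mm2):
    """Expected number of defect-free dies per wafer under the assumptions above."""
    return gross_dies(die_area_mm2) * poisson_yield(die_area_mm2)


if __name__ == "__main__":
    for name, area in (("Thuban", 346.0), ("Nehalem", 263.0)):
        print(f"{name}: {gross_dies(area)} gross dies, "
              f"yield {poisson_yield(area):.1%}, "
              f"about {good_dies(area):.0f} good dies per wafer")
```

Under these assumptions the bigger die loses twice: fewer candidates fit on the wafer, and a smaller fraction of them come out defect-free, so the gap in good dies is wider than the gap in area alone.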
 

Jovec

Senior member
Feb 24, 2008
579
2
81
So if it is the most cost-effective, why are Intel processors priced above AMD?

Either it is less cost-effective or Intel is sticking it to the customers. Take your pick, neither is a pleasant outcome.

Let's be real.

Intel is winning the IPC war, so they can use clock speed, turbo, and HT to be competitive (and in many cases superior - see gaming) with fewer real cores. Look at something like the Core i3 530 versus the Athlon II X4 620. 4 real cores are barely staying ahead in parallel tasks like media encoding. AMD is selling 4 cores on the desktop for under $100 because they don't currently have the IPC to compete with Intel's 2c/4t chips in that price bracket. You can be sure that if AMD had better IPC they'd be selling 2c/4t to save die space, allowing them to either boost profits or lower prices (assuming there aren't license or patent issues).

If AMD had better IPC Intel would likely counter by moving 2c/4t to all entry chips and providing 4 real cores (and 4c/8t as appropriate) on everything above the $100/$125 price point.

What's also missing from this thread is the elegance of HT as it addresses the realities of software development. Most consumer/desktop apps are not well optimized due to limited time and budget. When most apps are user-input limited, HDD limited, or bandwidth limited, the app just has to be fast enough to keep up. Outside of select things like media encoding apps, this likely isn't going to change much in the future. Limited resources are usually better spent on new features, bug fixes or even the next project. This is a large reason why HT does so well on mixed workloads and typical consumer apps.

The converse is true in the server space, where time equals money. Developers have spent and will continue to spend a lot of time on parallel code and optimization. Idle hardware is oftentimes wasted money, and software developers compete indirectly with hardware manufacturers for our upgrade dollars. This is why HT tends to see much smaller gains here. There is also more than simply IPC to server performance (more so than for desktop at least) and AMD has always done well here.

Now I'm not knocking AMD here at all. I've got a 955, 550, 620, 705e, and 2 240s, and I'm trying to talk myself out of buying a 1090t :), but the single thread performance difference (and brand recognition) is what allows Intel to push HT. AMD's pricing and core count strategy is a function of the performance of their CPUs.
 

JFAMD

Senior member
May 16, 2009
565
0
0
I am a server guy, I can't speak to the desktop market.
 

JFAMD

Senior member
May 16, 2009
565
0
0
I really hope AMD is not subsidizing your trolling because that post is plain stupid.

AMD Opteron 6174 = SPEC INT score of 386 @ $1165
Intel Xeon X5680 = SPEC INT score of 381 @ $1663

This is not a troll, it is a simple fact of life. Both of these processors are giving you 12 total threads, the AMD is giving you slightly higher performance. The Intel is 42% more expensive.

Plus, their 32nm should be less expensive, right?

There are two different design philosophies, big cores with HT or lots of smaller cores. Lots of smaller cores scale better under heavy loads. Bigger cores with HT is better if you have more cores than threads (i.e. single threaded apps.)

But you can't say that it is more cost-effective if the same level of performance is 42% more expensive. That just doesn't work.
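
The arithmetic behind that comparison is easy to check. The Python sketch below uses only the two SPEC INT scores and prices quoted above; the dictionary layout and function names are made up for illustration, and it says nothing about platform or power costs.

```python
# Scores and list prices exactly as quoted in the post above.
chips = {
    "AMD Opteron 6174": {"spec_int": 386, "price_usd": 1165},
    "Intel Xeon X5680": {"spec_int": 381, "price_usd": 1663},
}


def points_per_dollar(chip):
    """SPEC INT points bought per dollar of CPU price."""
    return chip["spec_int"] / chip["price_usd"]


def premium_percent(expensive, cheap):
    """How much more the first chip costs than the second, in percent."""
    return (expensive["price_usd"] / cheap["price_usd"] - 1) * 100


if __name__ == "__main__":
    amd = chips["AMD Opteron 6174"]
    intel = chips["Intel Xeon X5680"]
    print(f"Intel price premium: {premium_percent(intel, amd):.1f}%")
    for name, chip in chips.items():
        print(f"{name}: {points_per_dollar(chip):.3f} SPEC INT points per dollar")
```

The premium works out to between 42% and 43%, in line with the figure quoted, and the Opteron comes out ahead on points per dollar for this one benchmark.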
 

alyarb

Platinum Member
Jan 25, 2009
2,425
0
76
Somewhere somebody got the word "cost" confused with wasted die area and price, but the architectures and business models of these two companies are so different that you can't just say AMD is losing money by choosing to have a large native multicore without SMT simply because Intel says it saved them die space. It was space they already spent by having a robust, high-IPC core. AMD's core, altogether, is less complex and would probably be smaller.

We all know that a bigger CPU is more expensive to make than a smaller CPU, but this purely geometric comparison only works if they're on the same manufacturing process. It doesn't make sense to assume AMD is losing money because they chose to have a large die with 6 full cores at the price points they've chosen, and it doesn't make sense to compare 45nm SOI to 32nm high-k. In terms of the age of each process and the cost to implement them over time, it should be immediately clear that Thuban and Magny-Cours were many, many times cheaper to introduce than Gulftown and Clarkdale.

They are using an old process in an old facility probably staffed by old personnel, and yet they have a compelling product. AMD doesn't build a new fab every 5 years and update their process every other year like Intel. They have one-thousandth the resources Intel has and yet they have a product that was cheaper for them to make, cheaper for us to buy, and architecturally unique enough to bring definite advantages in well-threaded workloads.
 

Cogman

Lifer
Sep 19, 2000
10,286
147
106
AMD Opteron 6174 = SPEC INT score of 386 @ $1165
Intel Xeon X5680 = SPEC INT score of 381 @ $1663

This is not a troll, it is a simple fact of life. Both of these processors are giving you 12 total threads, the AMD is giving you slightly higher performance. The Intel is 42% more expensive.

Plus, their 32nm should be less expensive, right?

There are two different design philosophies, big cores with HT or lots of smaller cores. Lots of smaller cores scale better under heavy loads. Bigger cores with HT is better if you have more cores than threads (i.e. single threaded apps.)

But you can't say that it is more cost-effective if the same level of performance is 42% more expensive. That just doesn't work.

Intel can charge whatever they want. Pricing isn't really based so much on how much it costs them to produce a CPU; it is based more on historical pricing and how much they can get out of people. (AMD does the same thing. The reason they release similar performance at lower prices is that they have to give people a good reason to switch over.)

Intel's Top-o-the-line cpu costs just as much to produce as the bottom-o-the-line cpu (using the same architecture).
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
So if it is the most cost-effective, why are Intel processors priced above AMD?

Either it is less cost-effective or Intel is sticking it to the customers. Take your pick, neither is a pleasant outcome.

Intel charges what they can for their processors. Are you trying to claim AMD doesn't?
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
I agree. That post crossed the line from marketing to trolling / flame baiting.
I would respectfully disagree. This is most likely a very subjective issue, but I believe he is being simply passionate about it, especially being one of the faces of AMD.

I am not taking sides, but the conversation would have been better had "stupid" not entered the thread and if a wrong point was made, refuted with something better than calling it "trolling" or "stupid", which does little to improve the conversation or enlighten other readers. I am no expert, and cannot take sides, but the way the thread has unravelled, JFAMD seems to cite what he presents as facts. If it is wrong, a rebuttal such as the one most recently made by Cogman is good enough, something within that line.

I do agree that pricing considerations go beyond performance and cost, since the best pricing strategy is to get the most that the consumers are willing to buy it for, which is what led to so many varied market segmentation techniques.

But the point of JFAMD seems to be that AMD is outclassed in both resources and process tech, yet are able to remain competitive - and if the quotes of JFAMD are to be believed (SPEC data), then they do seem to be at parity, or even in the favor of AMD in some cases. So he said that despite being outclassed in resources (Intel being a behemoth) and process tech (32nm vs 45nm), AMD still maintains to be somewhat at parity and at a lower cost, and then asks why Intel can't do the same, especially if 32nm gives them such benefits, unless they keep all the benefits to themselves and don't share it with customers.

As I see it, it is a fair enough argument, and one that is not impossible at all to answer, as Cogman has done.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,411
16,266
136
Exactly. Truth is a valid defense.