[HARDOCP] AMD gives some answers* regarding FX

ed29a

Senior member
Mar 15, 2011
212
0
0
Nothing more than marketing speak. Sorry, but it all sounds like a big load of male cow feces.
 

Ferzerp

Diamond Member
Oct 12, 1999
6,438
107
106
I did not see any answers there. ed29a's comment seems spot on to me. I saw a bunch of words, but no answers.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
"It is also important to note that the "Bulldozer" architecture is configured and optimized for server throughput. The two integer execution cores present in Bulldozer are designed to deliver area- and power-efficient multi-threaded throughput."

- Considering they were being asked about desktop FX parts I find this the most interesting part of the PR response.
 

Ferzerp

Diamond Member
Oct 12, 1999
6,438
107
106
Except we saw that while less unattractive there, it is still unattractive. They also claim power-efficiency. So, I think they are just smoking crack.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
"It is also important to note that the "Bulldozer" architecture is configured and optimized for server throughput. The two integer execution cores present in Bulldozer are designed to deliver area- and power-efficient multi-threaded throughput."

How much area and power benefit does Bulldozer really give?

I was quite surprised when the 8 core Bulldozer was announced at 315mm2 @ 32nm.

Lisbon (a hexcore) was 346mm2 @ 45nm (with each CPU core having its own floating point)

What am I missing here?

Surely a 32nm die shrink would have produced an eight-core Phenom II much smaller than 315mm2?
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
How much area and power benefit does Bulldozer really give?

I was quite surprised when the 8 core Bulldozer was announced at 315mm2 @ 32nm.

Lisbon (a hexcore) was 346mm2 @ 45nm (with each CPU core having its own floating point)

What am I missing here?

Surely a 32nm die shrink would have produced an eight-core Phenom II much smaller than 315mm2?

346 * 8/6 (assuming they just slap on 2 more cores) = 461mm2
315/461 = 68% area scaling

It doesn't sound super bad or anything...
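For anyone who wants to check the arithmetic, here is a minimal sketch of the comparison above. The die areas are the figures quoted in this thread; scaling the whole die linearly with core count is the same simplification the post makes, and the function name is purely illustrative.

```python
# Back-of-the-envelope area comparison using the figures quoted in the thread.
LISBON_AREA_MM2 = 346.0     # six-core Lisbon at 45nm
BULLDOZER_AREA_MM2 = 315.0  # eight-core Bulldozer at 32nm


def scale_core_count(area_mm2, cores_from, cores_to):
    """Naively scale the whole die area in proportion to the core count."""
    return area_mm2 * cores_to / cores_from


# Hypothetical eight-core Lisbon, still at 45nm ("just slap on 2 more cores").
hypothetical_8core_45nm = scale_core_count(LISBON_AREA_MM2, 6, 8)

# Bulldozer's area as a fraction of that hypothetical die.
area_ratio = BULLDOZER_AREA_MM2 / hypothetical_8core_45nm

print(f"Hypothetical 8-core Lisbon at 45nm: {hypothetical_8core_45nm:.0f} mm2")
print(f"Bulldozer / hypothetical: {area_ratio:.0%}")
```

The linear scaling overstates the hypothetical die somewhat, since shared blocks such as the memory controller would not grow with the core count.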
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
346 * 8/6 (assuming they just slap on 2 more cores) = 461mm2
315/461 = 68% area scaling

It doesn't sound super bad or anything...

Yep, then if we shrink that 461mm2 down to 32nm don't we get roughly half that?

~230.5mm2 die size for an octa-core Phenom II shrink?

In Bulldozer's defense it does include the Northbridge in that 315mm2 32nm die.

But how much area does that really take up:

Bulldozer_Die_size.png


What else could be contributing to this large 315mm2 die size? I'm confused :(
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Yep, then if we shrink that 461mm2 down to 32nm don't we get roughly half that? ~230.5mm2 die size for an octa-core Phenom II shrink?

In Bulldozer's defense it does include the Northbridge in that 315mm2 32nm die.

But how much area does that really take up:

Bulldozer_Die_size.png


What else could be contributing to this large 315mm2 die size?

Well, according to geometry, if you shrink all dimensions by 0.71 (32/45) you SHOULD get a 50% area reduction, but since nothing is really 45nm and nothing is really 32nm nowadays, who knows what you'll get. Using the same fakey math, Nehalem to Westmere was 60%, and that's as close as you get to a pure process shrink (very few features were added).

So it will be an accomplishment to get 50% total die reduction (not impossible) but maybe someone can show me some examples of it. I'm too lazy to go beyond a sample size of one. :)
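The ideal geometric shrink described above can be sketched as follows. This assumes every feature scales with the node name, which the post itself points out is not true of real processes, so treat the result as a lower bound and not a prediction. The 461mm2 starting point is the hypothetical eight-core 45nm die from earlier in the thread.

```python
def ideal_area_scale(old_node_nm, new_node_nm):
    """Area scale factor if every dimension shrank with the node name."""
    linear_scale = new_node_nm / old_node_nm
    return linear_scale ** 2


linear_scale = 32 / 45
area_scale = ideal_area_scale(45, 32)

# Hypothetical eight-core 45nm die from earlier in the thread (346 * 8 / 6).
hypothetical_8core_45nm = 346 * 8 / 6
ideal_8core_32nm = hypothetical_8core_45nm * area_scale

print(f"Linear scale: {linear_scale:.2f}")
print(f"Ideal area scale: {area_scale:.1%}")
print(f"Ideal 8-core die at 32nm: {ideal_8core_32nm:.0f} mm2")
print(f"Bulldozer (315 mm2) vs ideal shrink: {315 / ideal_8core_32nm:.2f}x")
```

Under this idealized math the shrunk die comes out a little above the "roughly half" estimate, because (32/45) squared is slightly more than 0.5.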
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
346 * 8/6 (assuming they just slap on 2 more cores) = 461mm2
315/461 = 68% area scaling

It doesn't sound super bad or anything...

Two 32nm Llano cores w/L2$ are nearly the exact same mm^2 as one bulldozer module w/L2$.

You could easily make the case that you could take a zambezi, swap out each bulldozer module with 2 Llano cores while keeping the L2$ the same size and the L3$ the same, all same HT, IMC, etc, and the resultant octo-core chip would be the same size as Zambezi.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Two 32nm Llano cores w/L2$ are nearly the exact same mm^2 as one bulldozer module w/L2$.

You could easily make the case that you could take a zambezi, swap out each bulldozer module with 2 Llano cores while keeping the L2$ the same size and the L3$ the same, all same HT, IMC, etc, and the resultant octo-core chip would be the same size as Zambezi.

That is interesting that two Llano cores are the same size as a Bulldozer module. (Yet, the old Llano design is actually faster and more efficient!)

As far as the rest of the die size goes,

This xbit article suggests "automated tools" may be a cause of the excessive xtors:

Ex-AMD Engineer Explains Bulldozer Fiasco: Lack of Fine Tuning.

Engineer: AMD Should Have Hand-Crafted Bulldozer to Ensure High Speed
[10/13/2011 11:21 PM]
by Anton Shilov

The performance that Advanced Micro Devices' eight-core processor demonstrated in real-world applications is far from impressive, as the chip barely outperforms competing quad-core central processing units from Intel. The reason why the performance of the long-awaited Bulldozer was below expectations is not only because it was late, but because AMD had adopted design techniques that did not allow it to tweak performance, according to an ex-AMD engineer.

According to Cliff A. Maier, an AMD engineer who left the company several years ago, the chip designer decided to abandon the practice of hand-crafting various performance-critical parts of its chips and rely completely on automatic tools. While the use of tools that automatically implement certain technologies in silicon speeds up the design process, they cannot ensure maximum performance and efficiency.

Automated Design = 20% Bigger, 20% Slower

"The management decided there should be such cross-engineering [between AMD and ATI teams within the company], which meant we had to stop hand-crafting our CPU designs and switch to an SoC design style. This results in giving up a lot of performance, chip area, and efficiency. The reason DEC Alphas were always much faster than anything else is they designed each transistor by hand. Intel and AMD had always done so at least for the critical parts of the chip. That changed before I left - they started to rely on synthesis tools, automatic place and route tools, etc.," said Mr. Maier in a forum post noticed by Insideris.com web-site.

Apparently, automatically generated designs are 20% bigger and 20% slower than hand-crafted designs, which results in increased transistor count, die space and cost, and in reduced power efficiency.

"I had been in charge of our design flow in the years before I left, and I had tested these tools by asking the companies who sold them to design blocks (adders, multipliers, etc.) using their tools. I let them take as long as they wanted. They always came back to me with designs that were 20% bigger, and 20% slower than our hand-crafted designs, and which suffered from electro-migration and other problems," the former AMD engineer said.

Inefficiencies in Design?

While it is unknown whether AMD used automatic design flow tools for everything, there are certain facts that point to some inefficient pieces of design within Bulldozer. Officially, AMD claims that the Zambezi/Orochi processor consists of around 2 billion transistors, which is a very large number.

AMD publicly said that each Bulldozer dual-core CPU module with 2MB unified L2 cache contains 213 million transistors and is 30.9mm2 in size. By contrast, the die size of one processing engine of the Llano processor (11-layer 32nm SOI, K10.5+ micro-architecture) is 9.69mm2 (without L2 cache), which indicates that AMD has succeeded in minimizing elements of its new micro-architecture so as to maintain the small size and production cost of the novelty.

As a result, all four CPU modules with L2 cache within the Zambezi/Orochi processor consist of 852 million transistors and take 123.6mm2 of die space. Assuming that 8MB of L3 cache (6 transistors per cell) consists of 405 million transistors, that leaves a whopping 800 million or so transistors for various input/output interfaces, the dual-channel DDR3 memory controller, and various logic and routing inside the chip.

800 million transistors - which take up a lot of die space - is an incredibly high number for various I/O, memory, logic, etc. For example, Intel's Core i-series "Sandy Bridge" quad-core chip with integrated graphics consists of 995 million.
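The transistor budget in the article can be re-derived with a short sketch. The totals are the article's own figures; the six-transistor SRAM cell and the decision to ignore tag and ECC overhead are assumptions made here for illustration, not something AMD has confirmed.

```python
# Figures quoted in the article.
TOTAL_TRANSISTORS = 2_000_000_000   # "around 2 billion" for Zambezi/Orochi
MODULE_TRANSISTORS = 213_000_000    # one module including 2MB L2
MODULE_AREA_MM2 = 30.9
MODULE_COUNT = 4
L3_ARTICLE_ESTIMATE = 405_000_000   # the article's figure for 8MB of L3

# Assumption: plain 6T SRAM cells, no tags or ECC counted.
L3_BYTES = 8 * 1024 * 1024
TRANSISTORS_PER_BIT = 6

module_total = MODULE_TRANSISTORS * MODULE_COUNT
module_area = MODULE_AREA_MM2 * MODULE_COUNT
l3_cells = L3_BYTES * 8 * TRANSISTORS_PER_BIT
remainder = TOTAL_TRANSISTORS - module_total - L3_ARTICLE_ESTIMATE

print(f"Four modules: {module_total / 1e6:.0f}M transistors, {module_area:.1f} mm2")
print(f"L3 data array (6T cells only): {l3_cells / 1e6:.0f}M transistors")
print(f"Left for I/O, memory controller, other logic: {remainder / 1e6:.0f}M")
```

With these inputs the remainder works out to roughly 743 million, so the article's "800 million" reads as a generous rounding of its own numbers.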

While it cannot be confirmed, it looks like AMD Orochi/Zambezi has several hundred million transistors that are a result of heavy reliance on automated design tools.

The Result? Profit Drop!

As a consequence of inefficient design and relatively low performance, AMD has to sell its eight-core FX series processors (315mm2 die size) for up to $245 in 1000-unit quantities. By contrast, Intel sells hand-crafted Core i-series "Sandy Bridge" quad-core chips (216mm2 die size) for up to $317 in 1000-unit quantities. Given the fact that both microprocessors are made using 32nm process technology [and thus have comparable per-transistor/per square mm die cost], the Intel one carries much better profit margin than AMD's microprocessor.
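As a rough illustration of the margin argument, the list price per square millimetre of die can be compared directly. This sketch uses only the prices and die sizes quoted in the article; it ignores yield, packaging, binning, and R&D costs, so it indicates the direction of the gap and is not an actual margin figure.

```python
# 1000-unit list prices and die sizes quoted in the article.
AMD_PRICE_USD, AMD_DIE_MM2 = 245.0, 315.0      # eight-core FX
INTEL_PRICE_USD, INTEL_DIE_MM2 = 317.0, 216.0  # quad-core Sandy Bridge


def price_per_mm2(price_usd, die_mm2):
    """List price earned per square millimetre of die."""
    return price_usd / die_mm2


amd = price_per_mm2(AMD_PRICE_USD, AMD_DIE_MM2)
intel = price_per_mm2(INTEL_PRICE_USD, INTEL_DIE_MM2)

print(f"AMD FX:       ${amd:.2f} per mm2")
print(f"Sandy Bridge: ${intel:.2f} per mm2")
print(f"Intel earns about {intel / amd:.2f}x as much per mm2 at list price")
```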
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Two 32nm Llano cores w/L2$ are nearly the exact same mm^2 as one bulldozer module w/L2$.

You could easily make the case that you could take a zambezi, swap out each bulldozer module with 2 Llano cores while keeping the L2$ the same size and the L3$ the same, all same HT, IMC, etc, and the resultant octo-core chip would be the same size as Zambezi.

So 8 llano cores = 8 bulldozer cores in area? Sounds about right.
 

Rvenger

Elite Member, Super Moderator, Video Cards
Apr 6, 2004
6,283
5
81
Are they even still employed by AMD? I know they cleaned house quite a bit in the meantime, right?
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
Isn't the uncore on Bulldozer quite a hefty portion of the die? What portion of die space would you need to make 8 Llano cores communicate well? How much larger would they need to be to support AVX and FMA?
 

Rifter

Lifer
Oct 9, 1999
11,522
751
126
More marketing crapspeak.

If they want to regain any respect they need to own up to their lies, and reduce pricing to compete with SB (i.e. top BD SKU priced with the 2500K).
 

bradley

Diamond Member
Jan 9, 2000
3,671
2
81
This is a great question, "Why would I buy a $275 Bulldozer cpu when the $170 1090t seems to equal its performance or actually do better at every benchmark and game we've seen?" :)

Though the more I look at Bulldozer, it's not that bad for certain applications. With all the Intel segmentation, I still think AMD gives the best bang for the buck.

Either you accept Intel holding features hostage amongst their platforms and AMD's latest underperforming based on expectations, or you accept neither. The 2500k might not exist if AMD posed a greater threat; the FX-8150 almost needs to exist and prosper for AMD to survive.

What I do not appreciate, however, is being lied to about any progress -- whether as a consumer or an investor. Therefore, it's nice to see AMD attempting a more straightforward approach and answering enthusiasts' questions, even though it will take far more for AMD to regain enthusiasts' trust.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
So 8 llano cores = 8 bulldozer cores in area? Sounds about right.

Weird coincidence though, right? For all the change-ups they did to the microarchitecture they ended up with six of one and a half dozen of the other.
 

-Slacker-

Golden Member
Feb 24, 2010
1,563
0
76
Sigh

7. Why would I buy a $275 Bulldozer cpu when the $170 1090t seems to equal its performance or actually do better at every benchmark and game we've seen?

Adam Kozak, Product Marketing Manager, AMD - We understand our customers make purchase decisions based on how they use their PCs, and in many cases our AMD Phenom™ II processors are a great (purchase).


This is insulting. Not a great way to go about things, especially since the questions were asked by savvy enthusiasts, who can smell bull%&#@ from a mile away - it's not like giving a more humble, honest and informative answer would have put off the rest of the market, since the rest of the market doesn't concern itself with what goes on on tech forums, and would not have been aware of any pessimistic admission by AMD...


I would have answered that question much better, even without portraying the product in any sort of negative way.
 

frostedflakes

Diamond Member
Mar 1, 2005
7,925
1
81
We are also working with Microsoft on a scheduler update for Windows 7 that will be available soon.
Maybe this is common knowledge, I haven't really been keeping up with all the latest Bulldozer drama and so this is the first I've heard about it. But this is good news, nice to know that people will not have to wait for Windows 8 to get the scheduler improvements with BD.

The rest of it doesn't seem to be anything we didn't already know or guess, though.
 

Concillian

Diamond Member
May 26, 2004
3,751
8
81
Except we saw that while less unattractive there, it is still unattractive. They also claim power-efficiency. So, I think they are just smoking crack.

In the server benches it was actually competing okay with the Xeons in terms of performance per watt with some software. I think there is some merit to the marketing drivel in that respect, BUT:
1) discussion was desktop chip
2) Desktop configurations (high clockspeeds to increase single threaded performance) are much worse for power performance compared to servers, where wide and slow is a good option.

This has been CLEARLY a server chip they're trying to dump into the desktop market since well before its launch. That's fine, they'll hook a few undereducated consumers and make a profit off them. All the hardware MFRs do that, they just often do it in the lower end of their product lines (Celeron, low end 1GB+ 64 bit memory video cards, etc...)

Let the suckers pay extra for their garbage, it helps subsidize low prices for those of us who know better.
 

toyota

Lifer
Apr 15, 2001
12,957
1
0
So people had to wait a ridiculous amount of time to get Bulldozer. Now they have to wait for the OS and apps to be better suited for it? Basically it's always a waiting game as far as their CPUs are concerned. I just scratch my head when I see ignorant people actually "upgrade" to this CPU from a Phenom X6 or X4, especially for gaming.
 

LoneNinja

Senior member
Jan 5, 2009
825
0
0
In that I see this.

They give almost no answers and claim it's designed for energy efficiency, yet it's not. And they tell us how future operating systems and software will allow it to perform better, while they should be able to ramp up clock speeds.

Basically it is a turd that they want to polish and convince us to buy in hopes that tomorrow it performs better while Ivy Bridge will only further the performance gap between Intel/AMD. LMAO

I suppose the FX 8120 is somewhat useful for very limited workloads, but it isn't a price/performance winner.