Some Bulldozer and Bobcat articles have sprung up


Kuzi

Senior member
Sep 16, 2007
572
0
0
Of course none of this matters either.

What matters is all those things we don't know, like price, consumption and performance.

For sure. And die size/yield will be important factors too, how big will a 4-Module Bulldozer be? What about 4-Core SandyBridge?
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
The die size of the 4-core Sandy Bridge has been known since the first leak. It's generally accepted that the mainstream 4-core Sandy Bridge with the full 12 EUs integrated has a die size of 22x mm².
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
I'm glad I'm not an insider so can speculate all I want. I'll bet sometimes it's hard to keep quiet for you guys.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
How about less execution units running at twice the speed, similar to the P4? Intel did that like 5 or 6 years ago, can AMD do it?

Pentium 4 didn't exactly excel at IPC. In fact, it had the worst IPC since the Pentium Classic.
But Pentium 4 had incredible clockspeeds for the time.
From what I read, BD will run at speeds of about 2.75 GHz. So that will be in the same ballpark as current Intel and AMD processors, unlike the Pentium 4, which had up to a GHz clockspeed advantage over competing architectures.

For the rest, Idontcare seems to have covered that nicely already.

However, since we're on the subject of Pentium 4... there's another issue I'd like to point out.
Part of the Pentium 4's problems with performance were because software was generally optimized for Pentium or Pentium Pro architectures. Pentium 4 had a very different architecture, and required very different types of optimizations. So a lot of real-world applications didn't run as well as they could on Pentium 4.
Bulldozer could suffer that same problem. For the past 10+ years, software has been optimized for CPUs with 3 ALUs. It could be that this works against BD, just like Pentium 4.
Let's be realistic here... if Pentium 4 wasn't 'important' enough for all developers to recompile their code with Pentium 4-optimizations, BD doesn't stand much of a chance (aside from the fact that unlike Intel, AMD doesn't offer an optimizing compiler in the first place).
 

Scali

Banned
Dec 3, 2004
2,495
0
0
no comment on the "real world" analogy?

Well no, I think 30% is a good real-world average (note that this is going from 1 to 2 threads... if you try to measure it over an entire multi-core chip, your measurements will be skewed a lot because of inherent scaling problems, memory bottlenecks and that sort of thing... even with real cores, scaling generally doesn't go linearly, but more like an exponential drop-off. That holds true for HT, BD as well as purely physical cores).
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Words are cheap, facts are what matters and the fact is you cannot compare a Deneb Core with a Bulldozer Module the way you compared them.

Of course none of this matters either.

What matters is all those things we don't know, like price, consumption and performance.

For sure. And die size/yield will be important factors too, how big will a 4-Module Bulldozer be? What about 4-Core SandyBridge?

I remember when Nehalem came out that I created a bunch of needless angst for myself by not clearly denoting in my posts those occasions where I cared about performance for the sake of performance as a consumer versus those occasions where I cared about performance for the sake of critiquing/understanding design decisions as a geeky arm-chair circuit engineer.

I could be wrong about this, but my perception here is that we have a lot of both going on in this thread and it is the source of some of the misunderstandings.

Some folks are curious to make performance comparisons for really no other reason than to critique the architecture decisions and to better understand the trade-offs that were made irrespective of how the final product actually performs and competes. Others among us are looking at all these BD numbers and speculations with an eye for understanding how the chip will compete with Intel when it comes to performance/$ or performance/W.

All sides are relevant, but you got to make sure you know which perspective your counterpart in the discussion is thinking they are talking about, otherwise your message will seem to contradict other posts.

Just my opinion. IntelUser and I went round and round for a while on Nehalem until we realized we were just talking past each other on different subjects; we actually had a lot we agreed upon without even realizing it at the time.
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
I remember when Nehalem came out that I created a bunch of needless angst for myself by not clearly denoting in my posts those occasions where I cared about performance for the sake of performance as a consumer versus those occasions where I cared about performance for the sake of critiquing/understanding design decisions as a geeky arm-chair circuit engineer.

I could be wrong about this, but my perception here is that we have a lot of both going on in this thread and it is the source of some of the misunderstandings.

Some years back while studying towards my Computer Engineering degree, I found that Digital Circuits, Computer Architecture, and Computer Hardware in general were subjects of great interest to me, that's why I am always thrilled and grateful to be exchanging knowledge and learning from insightful people such as yourself and a bunch of others on these forums and other sites around the net.

As a computer enthusiast, performance is very important to me, besides that I'd like to understand/comprehend the discussed architecture and realize what type of strengths/weaknesses it has.

Some folks are curious to make performance comparisons for really no other reason than to critique the architecture decisions and to better understand the trade-offs that were made irrespective of how the final product actually performs and competes. Others among us are looking at all these BD numbers and speculations with an eye for understanding how the chip will compete with Intel when it comes to performance/$ or performance/W.

In my case, I look at what Intel did in starting with an impressive architecture as a "base" for future designs, that base was Core/Core2 and they improved from there. AMD is taking a more radical approach with BD, it's actually a risk for them, especially considering they are the underdog and they are behind on pure performance and performance/watt.

That's why I find BD more interesting as an architecture: we already know Intel is ahead, but we want to see whether AMD can catch up, or at least close the gap in some area. One area I'm pretty positive about is that BD may improve performance/watt tremendously compared to current K10.5 designs.

All sides are relevant, but you got to make sure you know which perspective your counterpart in the discussion is thinking they are talking about, otherwise your message will seem to contradict other posts.

This is more likely to happen in my case, because English is a second language for me and I can't explain my thoughts as well as I should. That's why in most cases I prefer to lurk but some subjects/threads pull me in like a magnet, I feel compelled to post something there :)
 

GaiaHunter

Diamond Member
Jul 13, 2008
3,700
406
126
In my case, I look at what Intel did in starting with an impressive architecture as a "base" for future designs, that base was Core/Core2 and they improved from there. AMD is taking a more radical approach with BD, it's actually a risk for them, especially considering they are the underdog and they are behind on pure performance and performance/watt.

That's why I find BD more interesting as an architecture: we already know Intel is ahead, but we want to see whether AMD can catch up, or at least close the gap in some area. One area I'm pretty positive about is that BD may improve performance/watt tremendously compared to current K10.5 designs.



This is more likely to happen in my case, because English is a second language for me and I can't explain my thoughts as well as I should. That's why in most cases I prefer to lurk but some subjects/threads pull me in like a magnet, I feel compelled to post something there :)

Agreed on both counts.

I find BD more interesting because it seems a high-risk move in some areas compared to what we know works for Intel. On the other hand, AMD isn't Intel, so it has to play the game in a different way or Intel will just beat it on experience and simply massive amounts of resources.

It will be easy to bash AMD if BD's choices turn out to be all wrong, and it will be surprising if AMD does have a great product.

Guess we have to respect the attitude of taking a risk and trying to forge a new path when you are the smaller player.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Damn you and the elephant. I was halfway through my "biggest bulldozer misconceptions" blog and I started it with the elephant story. You beat me to it.

heh, what can I say, great minds think alike! :D

In all seriousness though, it's not like I invented the concept, and your blog is likely to reach >1000x more readers than any post of mine, so if the shoe fits then I would certainly hope you use it for your blog.

I just feel sorry for that elephant, enduring all that poking and prodding and for what? Just to listen to the blind guys squabble about them afterwards? How droll. :p
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
yes, but if bulldozer ends up being the shit then you can at least come back and say "I told you so". and if it's not, then you will already know that so you'll be able to find cover in plenty of time. either way, you win!
 

Scali

Banned
Dec 3, 2004
2,495
0
0
It will be easy to bash AMD if BD's choices turn out to be all wrong, and it will be surprising if AMD does have a great product.

I think the problem here is about what people think BD is, and what it isn't.
"A great product" can mean many things to many people.
BD is simply a trade-off between single-threaded and multi-threaded performance. Something very common in the server world (best example: Sun Niagara architecture).
So BD is not going to be the single-threaded performance king... Does that make it a bad product? No, because that's just not what BD is about.
As long as it gives you enough cores with enough performance for a decent price and power envelope (which is *exactly* what AMD is concentrating on), BD can be a great product. It just targets something other than the current Intel lineup.

I think the AMD camp needs to come to grips with the fact that AMD is no longer the gaming CPU of choice. AMD has been leaning towards the server market for a long time now, and since Athlon64 they haven't had the advantage in gaming anymore. They also have not actively been pursuing that anymore (sharing the FPU/SSE unit between the two cores of a module isn't exactly going to improve CPU-based physics, or photo/video editing/encoding, for example).
BD isn't aiming at single-threaded performance, so it will likely not be the best gaming choice either. It can make for an excellent server/workstation product though.
It's just something that most people here probably aren't that interested in.

So don't get all worked up when someone says it's not going to have class-leading IPC (which it won't) or that it's not going to be the best choice for gaming (which it won't). That doesn't mean it won't be a great product. If you think that's what it means, you need to broaden your horizon.

I think that BD can be a great product, although I don't think it will be the best performer. I don't think AMD can withstand Sandy Bridge's bruteforce attack. However, I think that although the absolute performance won't be there, AMD will probably be able to have competitive performance/watt, and they WILL have the better pricing.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
I think the AMD camp needs to come to grips with the fact that AMD is no longer the gaming CPU of choice. AMD has been leaning towards the server market for a long time now, and since Athlon64 they haven't had the advantage in gaming anymore. They also have not actively been pursuing that anymore (sharing the FPU/SSE unit between the two cores of a module isn't exactly going to improve CPU-based physics, or photo/video editing/encoding, for example).
BD isn't aiming at single-threaded performance, so it will likely not be the best gaming choice either.
Given that games are using more threads, lately, BD's FPU should be at least as good as a PII's, and Intel still hasn't bothered to compete at low prices (I'm sure they will when SB hits, though)...I'm inclined to disagree.

AMD has great gaming CPUs at this very instant, if your budget won't easily fit $300+ or so for a CPU and mobo. They don't need better performance than Intel. They need to be competitive up to Intel's midrange, in desktop, servers, and notebooks, and make money at it, while branching out into other markets (netbooks, tablets, low-performance servers, embedded x86, what-have-you). Despite their poorer performance, making money selling CPUs has been a much bigger problem*. If things like BD can give them an edge, in terms of margins, without sacrificing even the low/mid to Intel, that would be a very good move for AMD, and not put them in a bad place wrt desktops.

* I realize the two are dependent, but as long as Intel remains worried that AMD might have a chance at besting them, AMD has no way to 'fix' the poorer single-thread performance.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
Given that games are using more threads, lately, BD's FPU should be at least as good as a PII's, and Intel still hasn't bothered to compete at low prices (I'm sure they will when SB hits, though)...I'm inclined to disagree.

The FPU may be at least as good as a PII's, problem is, it is now shared by two cores... so it takes a hit in multithreaded environments.
So we have two 'weaknesses' here, one being single-threaded performance, the other being FPU performance in multithreaded environments. Two of the pillars that gaming performance is built upon.
By the way, Intel just halved the price of the Core i7 950.

AMD has great gaming CPUs at this very instant, if your budget won't easily fit $300+ or so for a CPU and mobo.

You mean they are 'good enough', as long as your standards aren't that high.
The *best* gaming CPUs come from Intel. Especially if we get into multi-GPU configurations. I hate the AMD camp always bringing budget into the equation, when clearly I was discussing performance here.
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
31,516
167
106
The FPU may be at least as good as a PII's, problem is, it is now shared by two cores... so it takes a hit in multithreaded environments.
So we have two 'weaknesses' here, one being single-threaded performance, the other being FPU performance in multithreaded environments. Two of the pillars that gaming performance is built upon.
Well do we have confirmation on what the retail Bulldozer products are going to be like? My impression is that BD parts are going to have as many modules as existing Phenoms have cores, so there would be just as many FPUs as on Phenom. In which case there shouldn't be any kind of FPU weakness.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
Well do we have confirmation on what the retail Bulldozer products are going to be like? My impression is that BD parts are going to have as many modules as existing Phenoms have cores, so there would be just as many FPUs as on Phenom. In which case there shouldn't be any kind of FPU weakness.

I believe the 4-module is going to be top-dog (codename Zambezi)... so that's not as many modules as the Phenom II X6 has cores.
In which case you'd be working with only 4 FPUs, rather than 6.

If there will be a 6-module BD, it will likely have a considerably higher transistor count, so we'll have to see how that pans out in terms of price, clockspeed scaling, TDP and all that.
 

Cogman

Lifer
Sep 19, 2000
10,286
145
106
The FPU may be at least as good as a PII's, problem is, it is now shared by two cores... so it takes a hit in multithreaded environments.
So we have two 'weaknesses' here, one being single-threaded performance, the other being FPU performance in multithreaded environments. Two of the pillars that gaming performance is built upon.
By the way, Intel just halved the price of the Core i7 950.

While I agree with you that most likely AMD will suffer in multithreaded FPU operations, this is really just speculation at this point. We have no idea how good or bad AMD's FPU will be until someone runs some benchmarks on it. It is, after all, a brand new architecture. Those often have a tendency not to perform the way we expect them to.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
begone with your logically steadfast refusal to condemn that which you have yet to study! I'll have none of it, none of it I say, ya hear!
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
I think the AMD camp needs to come to grips with the fact that AMD is no longer the gaming CPU of choice

I believe this is true if you play at low resolutions, below 1680x1050. At high resolutions (1920x1080 and up) with AA + AF on, the video card is the main factor most of the time, and both companies' high-end CPUs, like Phenom X6 and Core i7, are equal.

The FPU may be at least as good as a PII's, problem is, it is now shared by two cores... so it takes a hit in multithreaded environments.

We know Bulldozer FP unit (2 x FMACs) will do 2 x 128-bit so where’s the problem with multithreads???
 

Scali

Banned
Dec 3, 2004
2,495
0
0
While I agree with you that most likely AMD will suffer in multithreaded FPU operations, this is really just speculation at this point. We have no idea how good or bad AMD's FPU will be until someone runs some benchmarks on it. It is, after all, a brand new architecture. Those often have a tendency not to perform the way we expect them to.

It's all an educated guess, based on the gains that Intel and AMD have historically managed to get from new micro-architectures.
If we take the 4:6 FPU ratio, then we'd have to conclude that the FPUs have to perform 50% better in order to break even with a current 6-core Phenom.
Historically, a 50% improvement in performance has rarely occurred (maybe 486 -> Pentium... but otherwise? Can't think of any).
K10 barely improved IPC over K8... perhaps just 5% or so.
Nehalem over Conroe? Also not that spectacular in IPC... what was it, 20% perhaps?
Sandy Bridge over Nehalem? Apparently about 15% according to Anand's preview.

So... well, theoretically there still is a possibility that AMD's new architecture can get us the 50% improvement that is required to break-even with Phenom X6 (and that is ignoring the REAL competition from Intel, which is WELL above X6 performance levels).
But they'd need a lot of pixie dust and magic unicorns to get there. I've seen no evidence of such in the information they disclosed on the architecture so far.
It's not 'just speculation'. By extrapolating from the information that AMD released, and other empirical data from Intel and AMD, I think the probability can be reduced to near-zero.
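
The break-even arithmetic above can be sketched in a few lines of Python. This is only an illustration: the generational gains are the approximate figures quoted in this post rather than measurements, and the function name is made up for the example.

```python
# Rough sketch of the break-even arithmetic; all figures are the
# approximate ones quoted in the post, not measured data.

def required_gain(old_units: int, new_units: int) -> float:
    """Per-unit speedup needed so new_units match old_units in total throughput."""
    return old_units / new_units - 1.0

# Approximate generational IPC gains as quoted in the post
quoted_gains = {
    "K8 -> K10": 0.05,
    "Conroe -> Nehalem": 0.20,
    "Nehalem -> Sandy Bridge": 0.15,
}

# 6 FPUs in a Phenom II X6 vs 4 FPUs in a 4-module Bulldozer
needed = required_gain(old_units=6, new_units=4)
print(f"needed per-FPU gain: {needed:.0%}")
for name, gain in quoted_gains.items():
    print(f"{name}: quoted {gain:.0%}, needed gain is {needed / gain:.1f}x that")
```

Note that this treats FPU count as the only variable; clock speed, memory bandwidth and scheduling are all ignored.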
 

Scali

Banned
Dec 3, 2004
2,495
0
0
We know Bulldozer FP unit (2 x FMACs) will do 2 x 128-bit so where’s the problem with multithreads???

Exactly there:
Phenom II can also do 2x128 bit per core. That is 4x128 bit per 2 cores/threads, vs 2x128 bit per module/2 cores/threads for BD.
See: http://en.wikipedia.org/wiki/File:AMD_K10_Arch.svg
Note that there are 3 FPU execution ports, two capable of 128-bit SIMD, and one additional FMISC, which also isn't present on BD.
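
As a quick sanity check of the per-thread comparison, here is a small Python sketch. It only counts peak 128-bit pipes using the numbers stated in this post; the helper name is hypothetical, and it says nothing about how well either design keeps those pipes busy.

```python
# Peak 128-bit SIMD pipes available to two threads, using the counts
# stated in the post (2 per Phenom II core, 2 per Bulldozer module).

PIPES_PER_PHENOM_CORE = 2   # as stated in the post
PIPES_PER_BD_MODULE = 2     # shared FPU, 2 x 128-bit FMAC
THREADS_PER_BD_MODULE = 2   # two integer cores share one FPU

def pipes_for_threads(threads: int, pipes_per_unit: int, threads_per_unit: int) -> int:
    """128-bit pipes backing `threads` threads, packed onto as few units as possible."""
    units = -(-threads // threads_per_unit)  # ceiling division
    return units * pipes_per_unit

phenom = pipes_for_threads(2, PIPES_PER_PHENOM_CORE, threads_per_unit=1)
bd = pipes_for_threads(2, PIPES_PER_BD_MODULE, THREADS_PER_BD_MODULE)
print(f"Phenom II: {phenom} x 128-bit, Bulldozer: {bd} x 128-bit for two threads")
```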
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
It's all an educated guess, based on the gains that Intel and AMD have historically managed to get from new micro-architectures.
If we take the 4:6 FPU ratio, then we'd have to conclude that the FPUs have to perform 50% better in order to break even with a current 6-core Phenom.
Historically, a 50% improvement in performance has rarely occurred (maybe 486 -> Pentium... but otherwise? Can't think of any).
K10 barely improved IPC over K8... perhaps just 5% or so.
Nehalem over Conroe? Also not that spectacular in IPC... what was it, 20% perhaps?
Sandy Bridge over Nehalem? Apparently about 15% according to Anand's preview.

Nehalem was 20%, but most was due to Hyperthreading. It would get big gains like 30-40% with multi-threaded apps, but then single thread apps would get maybe 10%.

I think you are blurring the line between "per execution unit" performance and "Instructions Per Cycle". Apps that use a lot of FP code that aren't bandwidth bound by whatever memory interface they'll have with Bulldozer will get closer to 50% than those that aren't.

Big gains with multi-threading are probably possible as well. While having fewer execution units might lose per-core performance, the gain won't be anywhere near linear because that's not the only thing that determines chip performance. While at worst they might lose 50% due to fewer execution resources, the theoretical gain by having double the amount will be closer to 2x. In the worst case, they can do 33% (2x/1.5x).
 

JFAMD

Senior member
May 16, 2009
565
0
0
While I agree with you that most likely AMD will suffer in multithreaded FPU operations, this is really just speculation at this point. We have no idea how good or bad AMD's FPU will be until someone runs some benchmarks on it. It is, after all, a brand new architecture. Those often have a tendency not to perform the way we expect them to.

Actually, in a multithreaded FPU environment, we would have an advantage.

In AVX we will have 8 256-bit units
In non-AVX we will have 16 128-bit units.

Based on everything I have seen on the server Sandy Bridge, they will have 8 256-bit AVX units, so we are generally tied on AVX code, but on non-AVX code they will only have 8 128-bit units, or half the FP capability.

Remember that most apps will not be recompiled to take advantage of AVX right away, so we have an advantage.

Also, unless they have changed their scheduler, they have 1 that covers 2 integer threads and the FPU. We have one for each integer thread plus one for the FPU, so in a multithreaded environment I would bet on Bulldozer.
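
For anyone who wants to see the unit-count comparison as numbers, here is a minimal Python sketch. The counts are exactly the ones claimed in this post for unreleased server parts, so treat them as claims rather than verified specs; the table layout and function name are just for illustration.

```python
# Peak FP width per cycle from the unit counts claimed in the post.
# These are the post's claims about unreleased server parts, not verified specs.

claimed_units = {
    # (chip, code type): (number of units, width in bits)
    ("Bulldozer", "AVX"): (8, 256),
    ("Bulldozer", "non-AVX"): (16, 128),
    ("Sandy Bridge", "AVX"): (8, 256),
    ("Sandy Bridge", "non-AVX"): (8, 128),
}

def peak_bits(chip: str, code: str) -> int:
    """Total FP bits that can be processed per cycle across all claimed units."""
    count, width = claimed_units[(chip, code)]
    return count * width

for code in ("AVX", "non-AVX"):
    bd = peak_bits("Bulldozer", code)
    sb = peak_bits("Sandy Bridge", code)
    print(f"{code}: Bulldozer {bd} bits/cycle, Sandy Bridge {sb} bits/cycle, ratio {bd / sb:.1f}")
```

This only compares peak width; clock speed and how well each scheduler keeps the units fed are left out.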
 

Scali

Banned
Dec 3, 2004
2,495
0
0
I think you are blurring the line between "per execution unit" performance and "Instructions Per Cycle".

I don't think I do, can you elaborate as to what makes you think that?

Apps that use a lot of FP code that aren't bandwidth bound by whatever memory interface they'll have with Bulldozer will get closer to 50% than those that aren't.

This seems to go by the assumption that they will be able to GET that 50% in the first place, best case.
My point is that it is highly unlikely that they can improve efficiency enough (50%) to compensate for the lower number of execution units.

Big gains with multi-threading are probably possible as well.

Yes, but I was mainly talking about single-threaded performance and gaming.
While games are more multi-threaded these days, they aren't exactly on the best terms with Amdahl's Law, if you know what I mean.
This is painfully demonstrated by Phenom II X6, whose two extra cores cannot compensate for Intel's 4 faster cores, even without HT (such as the i5 760).
In multithreaded applications that DO scale well, X6 is faster, but games aren't among those.
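
The Amdahl's Law point can be illustrated with a short Python sketch. The 1.25x per-core advantage and the parallel fractions below are made-up illustrative values, not benchmark results for any real chip.

```python
# Amdahl's Law sketch: 6 slower cores vs 4 faster cores.
# The 1.25x per-core advantage and the parallel fractions are made-up
# illustrative values, not benchmark results.

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only parallel_fraction of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

def throughput(per_core_speed: float, parallel_fraction: float, cores: int) -> float:
    """Relative performance: single-core speed times the Amdahl speedup."""
    return per_core_speed * amdahl_speedup(parallel_fraction, cores)

for p in (0.5, 0.8, 0.99):
    six_slow = throughput(1.0, p, 6)
    four_fast = throughput(1.25, p, 4)
    winner = "6 slower cores" if six_slow > four_fast else "4 faster cores"
    print(f"parallel fraction {p:.2f}: {winner} ahead")
```

With these assumed numbers, the four faster cores stay ahead until the workload is almost entirely parallel, which is the pattern described above for games versus applications that scale well.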