Can we start calling Bulldozer a 4 core CPU?


LOL_Wut_Axel

Diamond Member
Mar 26, 2011
4,310
8
81
If AMD called BD a quad core, then at least they can say their quad is somewhat close to Intel's quad. As it stands now, their 8 core is getting beat by Intel's quad.

But I agree with everyone else here. This is more of an 8 core than it is a quad core. But AMD can call it whatever they like. It doesn't change the performance.

The problem for Bulldozer is that it's a too forward-thinking architecture that could've been made a lot more efficient and smaller. It's an architecture that would've made sense in workloads perhaps five years from now, and even if everything was multi-threaded today it'd still have the problem of horrible performance/watt. It has eight WEAK integer cores and can only keep up with a four integer core/eight thread CPU from Intel. Cost of manufacturing would be much higher for AMD, too, and Intel could easily lower prices while keeping high margins--AMD couldn't.
 

GrumpyMan

Diamond Member
May 14, 2001
5,780
266
136
You gave up too quickly! :D

...if you were to remove that piece of silicon, grind off a few microns of copper wire mesh and carbon-doped glass insulator, you'd be looking at something like this:
bulldozer-die.jpg

http://hothardware.com/Reviews/AMD-FX8150-8Core-Processor-Review-Bulldozer-Has-Landed/?page=2

(you'll have to write the letters on it yourself though :p)



Very interesting, so the Modules are what is considered a "core" so to speak? And all the fancy "doodads and whatchamacallits" around it make it behave like an 8 "core"?
Kind of like a six cylinder engine with Turbos for more air intake and cylinders that have been bored out make an engine have twice the horsepower of an 8 cylinder, so to speak, with more giddy up and go? :confused:
I hope I haven't gotten too technical for ya.....:)
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
It's fun to play armchair processor designer/chip company/etc., but I think AMD should have just designed a chip specifically for server workloads.

Look at, for example, SPARC T4: 8 cores, 8 threads per core (64 threads total!), and it performs on par with or better than its competitors using fewer resources. I think AMD should have designed something similar and relegated it to servers, then either added two more cores to their Thuban design or connected two Stars dies into an MCM and sold those to the rest of us.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
I see a lot of "It's 8 core!" claims but not once has anyone reconciled how they can call it 8 core when executing a second thread (or even enabling the capability to execute a single thread) causes such a performance hit to the first thread. If it's 8 core, what's running on one or the other makes no difference. It will not slow down the other "core". (ignoring if you starve the IO)

In practice it does. It does not perform like 8 discrete cores.

Where did you see that ??
There is no hit in single-thread performance, but we do get 80% of a real dual core (CMP).

we don't count cores with performance figures :p
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
The problem for Bulldozer is that it's a too forward-thinking architecture that could've been made a lot more efficient and smaller. It's an architecture that would've made sense in workloads perhaps five years from now, and even if everything was multi-threaded today it'd still have the problem of horrible performance/watt.
No, it wouldn't be good. Even today, there are enough programs to test that use enough threads to give most or all of BD's cores a workout. Each core has the capability to do a good bit more every cycle than a Phenom II, yet application performance is only about as good as Stars--worse as often as better, and the better cases not being enough to offset the poorer cases.

Five years from now, do you think Intel is going to be sitting on their current core counts? Do you think Intel will let IPC and performance per Watt regress? I don't. Five years from now, a CPU like BD will just look worse than it does, today. Let's face it: AMD has clearly screwed up the first round.

The forward-looking CMT/SMT bit seems to work pretty well. It would most certainly look about as bad as pure CMP, or maybe even worse.

It has eight WEAK integer cores and can only keep up with a four integer core/eight thread CPU from Intel.
Each of its int cores should be plenty capable of performing faster than Stars. They appear to be weak, but they shouldn't be weak. Somewhere in there is where BD's lack of luster resides. 2 ALU, 2 AGU, unified OOO scheduling (v. schedulers per unit for K7-Stars), aggressive prefetchers that, by all we saw before launch, ought to rival Intel's, branch prediction decoupling (an Achilles heel of K8-Stars), a 128-bit vector FP unit per core for 99% of usage, and more than double the xtors of PhII X6...yet, even when it bests the Phenom II, it doesn't do it by much, and uses plenty of power in the process.

It's not too forward-looking. It's Barcelona 2.0.
 

exar333

Diamond Member
Feb 7, 2004
8,518
8
91
I work in an oracle shop. I write j2ee code. We write j2ee because we're migrating from mainframe to oracle db. Let me be clear. Oracle licenses are nothing compared to mainframe licenses. Your point is very knee-jerkish.

Furthermore, Java, specifically OpenJDK, should in the future have a lot more multi-threaded JVM performance than the vanilla free JDK you can get today, without having to buy a licensed JVM that has all the bells and whistles.

Even if it didn't, having 8 cores today on x86 or 16 on a 2-way ... 32 on a 4-way tomorrow, means that we could retire the CoolThreads servers downstairs along with the Sparc IIIi's that are still in production use. The BDs could even run OpenIndiana to retain most of the look and feel. I guarantee you a BD uses fewer watts for 8 threads than I can get out of a dual-CPU Sparc IIIi server.
Oracle may be big and they may be dicks, but at the end of the day I go home and play video games and watch TV. If Oracle makes my life easier, or microsoft, or Apple... it gets me home faster so let them fight it out and license their tech how they want.

8-core BD uses the same power as 2x4C/8T SB CPUs. I can tell you which one I would rather use...
 

EightySix Four

Diamond Member
Jul 17, 2004
5,122
52
91
Very interesting, so the Modules are what is considered a "core" so to speak? And all the fancy "doodads and whatchamacallits" around it make it behave like an 8 "core"?
Kind of like a six cylinder engine with Turbos for more air intake and cylinders that have been bored out make an engine have twice the horsepower of an 8 cylinder, so to speak, with more giddy up and go? :confused:
I hope I haven't gotten too technical for ya.....:)

Nope, each module is a single V (2 cylinders) in a V8 engine (2 int cores per module). It's an OHV engine though, with a pair of pushrods per cylinder (2x128bit capable FPU). Unfortunately, the fuel injectors and throttle body are way too small and unable to feed the motor, limiting its performance in normal driving situations.
 

Janooo

Golden Member
Aug 22, 2005
1,067
13
81
I work in an oracle shop. I write j2ee code. We write j2ee because we're migrating from mainframe to oracle db. Let me be clear. Oracle licenses are nothing compared to mainframe licenses. Your point is very knee-jerkish.

Furthermore, Java, specifically OpenJDK, should in the future have a lot more multi-threaded JVM performance than the vanilla free JDK you can get today, without having to buy a licensed JVM that has all the bells and whistles.

Even if it didn't, having 8 cores today on x86 or 16 on a 2-way ... 32 on a 4-way tomorrow, means that we could retire the CoolThreads servers downstairs along with the Sparc IIIi's that are still in production use. The BDs could even run OpenIndiana to retain most of the look and feel. I guarantee you a BD uses fewer watts for 8 threads than I can get out of a dual-CPU Sparc IIIi server.

Oracle may be big and they may be dicks, but at the end of the day I go home and play video games and watch TV. If Oracle makes my life easier, or microsoft, or Apple... it gets me home faster so let them fight it out and license their tech how they want.

Well, Oracle Enterprise Edition on 8 core BD is $190K.
4 core SB is $95K.
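
Both figures work out to the same price per counted core ($190K / 8 = $95K / 4), so the gap comes entirely from how many cores the chip is said to have. A rough sketch of that arithmetic, assuming a flat per-core charge; the function names are made up for illustration and this is not a real price list:

```python
def per_core_price(total_usd: float, counted_cores: int) -> float:
    """Price per counted core implied by a quoted total (flat per-core charge assumed)."""
    return total_usd / counted_cores


def license_cost(counted_cores: int, price_per_core: float) -> float:
    """Total cost if every counted core is charged the same amount."""
    return counted_cores * price_per_core


if __name__ == "__main__":
    # Figures quoted in the post above.
    bd_rate = per_core_price(190_000, 8)
    sb_rate = per_core_price(95_000, 4)
    print(bd_rate, sb_rate)

    # Hypothetical: the same BD chip billed as 4 modules instead of 8 cores.
    print(license_cost(4, bd_rate))
```

Under that assumption, the same silicon billed as 4 cores would cost the same as the SB quad.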
 

Munky

Diamond Member
Feb 5, 2005
9,372
0
76
I'm calling it an 8-core because it can simultaneously execute 8 threads using 8 execution units.
 

GrumpyMan

Diamond Member
May 14, 2001
5,780
266
136
Nope, each module is a single V (2 cylinders) in a V8 engine (2 int cores per module). It's an OHV engine though, with a pair of pushrods per cylinder (2x128bit capable FPU). Unfortunately, the fuel injectors and throttle body are way too small and unable to feed the motor, limiting its performance in normal driving situations.

I see, a good explanation for the layman. Thanks!
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Somewhere in there is where BD's lack of luster resides. 2 ALU, 2 AGU, unified OOO scheduling (v. schedulers per unit for K7-Stars), aggressive prefetchers that, by all we saw before launch, ought to rival Intel's, branch prediction decoupling (an Achilles heel of K8-Stars), a 128-bit vector FP unit per core for 99% of usage, and more than double the xtors of PhII X6...yet, even when it bests the Phenom II, it doesn't do it by much, and uses plenty of power in the process.

No it wasn't. The biggest problem with Bulldozer is not how it performs, but rather how everyone who wasn't AMD hyped the chip beyond the stars. AMD certainly didn't help by not doing much to quell the rumors, but that's different.

Speaking of so many enthusiast forums wanting "innovation" and "something different". I guess we all like crazy in our lives.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
No it wasn't. The biggest problem with Bulldozer is not how it performs, but rather how everyone who wasn't AMD hyped the chip beyond the stars. AMD certainly didn't help by not doing much to quell the rumors, but that's different.
Since when has JFAMD been not of AMD? I don't want to make a call-out, really, but the claims that IPC would improve, and that performance per watt would be good, came very much from AMD (and I expect they thought it would be a bit better, too, unless it does have some kind of server pixie dust). There are also the 80%/50% and 50%/33% performance numbers bandied about in presentations for ages, which look more like 33%/35% now. Crazy rumours like beating SB and such came from outside of AMD, sure.
 

Ferzerp

Diamond Member
Oct 12, 1999
6,438
107
106
Where did you see that ??
There is no hit in single-thread performance, but we do get 80% of a real dual core (CMP).

we don't count cores with performance figures :p

There have been lots of links on here to benchmarks showing that a single thread on a module performs 10-15% better when the other half of that module is disabled. It's indicative of too much shared to call it 2 cores.
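
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes "10-15% better" means the lone thread runs (1 + penalty) times as fast as it does with a busy sibling, and that both threads in a module are slowed equally. The function names are hypothetical and the model is not a benchmark.

```python
def module_throughput(penalty: float) -> float:
    """Combined throughput of two threads sharing a module.

    Normalized so that one thread running alone on the module scores 1.0.
    penalty is how much faster the lone thread is, e.g. 0.10 for 10%.
    """
    per_thread = 1.0 / (1.0 + penalty)  # each thread's speed with a busy sibling
    return 2.0 * per_thread


def fraction_of_two_full_cores(penalty: float) -> float:
    """Module throughput relative to two fully independent cores (which would score 2.0)."""
    return module_throughput(penalty) / 2.0


if __name__ == "__main__":
    for penalty in (0.10, 0.15):
        print(penalty,
              round(module_throughput(penalty), 2),
              round(fraction_of_two_full_cores(penalty), 2))
```

Under those assumptions a module with both halves busy lands at about 1.74 to 1.82 times a lone thread, i.e. roughly 87-91% of two independent cores.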
 

hooflung

Golden Member
Dec 31, 2004
1,190
1
0
Well, Oracle Enterprise Edition on 8 core BD is $190K.
4 core SB is $95K.


So what? You cannot get 8 threads of performance out of that SB. HT only goes so far. It cannot execute 2 threads at one time no matter how you slice it. It can put 8 in the queue, but can only execute 4 at any one time.

So if I need 8 concurrent threads I pay for... 8 threads.

Moreover, my options of running workstation development is also cheaper since I can get a BD AM3 class chip to simulate a Xen environment with VT-D, IOMMU for less money than I can a Xeon environment. The only SB chips that support VT-D are the non K series. Therefore, I need 2 computers to run 8 threads.

Any way you cut it, there are markets where AMD has built with clear brilliance. The unfortunate part of it all, which no one here is going to deny, is that the performance per watt is not exactly up to snuff. But the laws of physics demand that it will get better with the maturity of the manufacturing process.
 

hooflung

Golden Member
Dec 31, 2004
1,190
1
0
There have been lots of links on here to benchmarks showing that a single thread on a module performs 10-15% better when the other half of that module is disabled. It's indicative of too much shared to call it 2 cores.

No there isn't. You cannot disable the modules. The problem lies in the fact that the OS scheduler in Windows 7 cannot at this time put 2 threads on 1 module, and then there is the turbo situation where you want to maximize how turbo kicks in. The two ideologies compete for performance at low clocks (if you take into account that BD needs more MHz).

This has absolutely nothing to do with this chip being 8 cores or 4. Performance discrepancies have everything to do with how the chip overclocks itself, sir.
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,250
136
I call it the beginning....Now what that is I don't know yet :)

Could be something great to come, could be the end, could be........you get the point.

Let's just go with whatever AMD is calling it until time tells us what it really is.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,672
2,546
136
So what? You cannot get 8 threads of performance out of that SB. HT only goes so far. It cannot execute 2 threads at one time no matter how you slice it. It can put 8 in the queue, but can only execute 4 at any one time.

That's not how it works. Modern x86 CPUs have multiple independent execution units (SNB chips have 3 ALUs and 2 AGUs per core), and SB can, and normally does, execute instructions from both threads simultaneously, in different execution units. Ironically, the only part of the chip that cannot work on both of the threads at the same time is the frontend and decoders, so SB can execute 8 threads at any one time, but can only put instructions from 4 threads into the queue. (This isn't a problem because the frontend can pass on 4 macro-ops per clock, which is typically enough work for two clocks.)

But the big point is not the specifics of the implementation. It's how much raw throughput the chip actually has. If the throughput stays the same, more threads is actually a bad thing. I'd much rather have a single really fast thread that I can time-share when I have multithreaded loads, than the same throughput in 2 half-speed threads where I have the same throughput at multithreaded but am just sol when it comes to non-partitionable loads.
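
A toy model of that trade-off, assuming an idealized workload split into a serial part and a perfectly parallel part, with no scheduling or time-slicing overhead. The names and the model are illustrative assumptions, not measurements of any chip.

```python
def run_time(work: float, serial_fraction: float,
             threads: int, speed_per_thread: float) -> float:
    """Time to finish `work` units under a simple serial/parallel split.

    The serial part runs on one thread; the parallel part is spread
    evenly over all threads. Every thread runs at speed_per_thread.
    """
    serial = work * serial_fraction / speed_per_thread
    parallel = work * (1.0 - serial_fraction) / (threads * speed_per_thread)
    return serial + parallel


if __name__ == "__main__":
    for s in (0.0, 0.5, 1.0):
        one_fast = run_time(1.0, s, threads=1, speed_per_thread=1.0)
        two_slow = run_time(1.0, s, threads=2, speed_per_thread=0.5)
        print(f"serial={s}: one fast thread {one_fast}, two half-speed threads {two_slow}")
```

With the same total throughput, the two half-speed threads only tie when nothing is serial; any serial fraction makes them slower, up to twice as slow for a load that cannot be partitioned at all.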

BD is just a pitiful chip as presented. As Cerb put it earlier, it should be much faster than it is. After reading the RWT articles, looking at the chip on paper, it should perform better than it does. Hopefully, it's something simple enough that they can fix it.
 

ocre

Golden Member
Dec 26, 2008
1,594
7
81
Each of its int cores should be plenty capable of performing faster than Stars. They appear to be weak, but they shouldn't be weak. Somewhere in there is where BD's lack of luster resides. 2 ALU, 2 AGU, unified OOO scheduling (v. schedulers per unit for K7-Stars), aggressive prefetchers that, by all we saw before launch, ought to rival Intel's, branch prediction decoupling (an Achilles heel of K8-Stars), a 128-bit vector FP unit per core for 99% of usage, and more than double the xtors of PhII X6...yet, even when it bests the Phenom II, it doesn't do it by much, and uses plenty of power in the process.

It's not too forward-looking. It's Barcelona 2.0.

Amen!

Not many are seeing this. What about the flip-flop performance inconsistencies? Everyone is saying they are just really, really weak cores. Well, there are times these same cores are more powerful than Thuban. BD needs perfect code threading to keep the cores busy. Almost every case where BD looks bad is a case where the cores are just wasting energy. It is a big issue at times. AMD's huge cache and pipeline couldn't solve this. At times it's only a big mess. Many wasted cycles.

Intel cores are fed consistently. Intel's cores look so much more powerful, but much of that appearance comes from the many years Intel strived to improve performance just by eliminating wasted cycles. It's something Intel took very, very seriously; there was a lot to gain. Wasted cycles = wasted energy, and Intel's focus on this by itself has given them huge improvements in IPC.

Lastly, IntelUser2000:

"No it wasn't. The biggest problem with Bulldozer is not how it performs, but rather how everyone who wasn't AMD hyped the chip beyond the stars. AMD certainly didn't help by not doing much to quell the rumors, but that's different."


Bulldozer??? The world's fastest CPU???? These are the words AMD said. There are countless others, but just these two are enough to prove my point. I mean, come on, what comes to mind when you hear "Bulldozer, the world's fastest CPU, coming soon"? No one hyped up BD more than AMD. They created this hype in hopes that their words would reach the unknowing public and spread so that they become associated with their product. Before anyone could contest it, they were spreading propaganda. Once the communities get over the shock and it all dies down, AMD hopes there will be enough laymen out there to grow their planted seed. It's a crazy plan. I think AMD really, really are in trouble.

I said it before the launch: BD will be a success if AMD markets it cleverly. It's only because of their marketing tactics thus far that we see this huge uproar now. AMD should've been straight all along, and it wouldn't have felt so bad. Like betrayal.
 

BD231

Lifer
Feb 26, 2001
10,568
138
106
Very interesting, so the Modules are what is considered a "core" so to speak? And all the fancy "doodads and whatchamacallits" around it make it behave like an 8 "core"?
Kind of like a six cylinder engine with Turbos for more air intake and cylinders that have been bored out make an engine have twice the horsepower of an 8 cylinder, so to speak, with more giddy up and go? :confused:
I hope I haven't gotten too technical for ya.....:)

Bulldozer is the V4 rice rocket of cpus. Terrible design that cuts corners and comes up short no matter how u work it.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
I said it before the launch: BD will be a success if AMD markets it cleverly. It's only because of their marketing tactics thus far that we see this huge uproar now. AMD should've been straight all along, and it wouldn't have felt so bad. Like betrayal.

On the other hand, there, how many in AMD knew exactly what the results would be? It seems like if they knew what they were making, they would have changed tactics a good ways back, instead of just shutting up before the late Summer launch date. Towards that end, there isn't a good answer, either way: one option, they were trying to hype a dud to get sales before it was a real product; the other, that their organization was not prepared for the needs of producing a high performance x86 CPU, today, and they genuinely thought they had a winner, even as they were testing the first units, that just needed a little tweaking before the real launch.

I also wonder if they have some major improvements in the pipe, since Trinity will suck if it needs even 75% of the power BD does per clock, when Intel will have IB out and mature, by then. There is a point where being willing to sell products cheaper isn't good enough, if they are sufficiently inferior (unless AMD wants to start selling at Harbor Freight :)). If the replacements to Llano suck, they will be hurting worse than usual.
 

Janooo

Golden Member
Aug 22, 2005
1,067
13
81
So what? You cannot get 8 threads of performance out of that SB. HT only goes so far. It cannot execute 2 threads at one time no matter how you slice it. It can put 8 in the queue, but can only execute 4 at any one time.

So if I need 8 concurrent threads I pay for... 8 threads.

Moreover, my options of running workstation development is also cheaper since I can get a BD AM3 class chip to simulate a Xen environment with VT-D, IOMMU for less money than I can a Xeon environment. The only SB chips that support VT-D are the non K series. Therefore, I need 2 computers to run 8 threads.

Any way you cut it, there are markets where AMD has built with clear brilliance. The unfortunate part of it all, which no one here is going to deny, is that the performance per watt is not exactly up to snuff. But the laws of physics demand that it will get better with the maturity of the manufacturing process.
So what? Just that nobody will buy BD for Oracle. That's it, and it was supposed to be a server CPU.
They should have called it 4 core.
 

ocre

Golden Member
Dec 26, 2008
1,594
7
81
On the other hand, there, how many in AMD knew exactly what the results would be? It seems like if they knew what they were making, they would have changed tactics a good ways back, instead of just shutting up before the late Summer launch date. Towards that end, there isn't a good answer, either way: one option, they were trying to hype a dud to get sales before it was a real product; the other, that their organization was not prepared for the needs of producing a high performance x86 CPU, today, and they genuinely thought they had a winner, even as they were testing the first units, that just needed a little tweaking before the real launch.

I also wonder if they have some major improvements in the pipe, since Trinity will suck if it needs even 75% of the power BD does per clock, when Intel will have IB out and mature, by then. There is a point where being willing to sell products cheaper isn't good enough, if they are sufficiently inferior (unless AMD wants to start selling at Harbor Freight :)). If the replacements to Llano suck, they will be hurting worse than usual.


Truthfully, I am under the impression that AMD desperately wanted out. They were poised to sell, but every serious contender decided that it's not a good time to go up against Intel in a market that has peaked and has little growth year to year. This didn't stop AMD from believing they could sell. Most of this BD hype was desperate smoke from a company that was looking for an ejection button. If you really look at this company and follow every act, it's pretty clear.

I heard AMD was real close to selling a good while back, but cold feet interfered. BD was a huge concern. Since then, all AMD did was hype and ploy tactics. I find it terrible that the billion dollars Intel gave AMD never found its way into Bulldozer. As a matter of fact, it's pretty much the same disaster at 32nm as it was at 45. Oops.