AMD sheds light on Bulldozer, Bobcat, desktop, laptop plans


Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
deputc, given that we don't even make much of an attempt to characterize single-threaded IPC as a function of instruction-set mix (execution speed of some 700+ instructions!?) between any two competing microarchitectures, I doubt we will see much progress in the direction you want things to go once we further incorporate the effects of multiple threads operating simultaneously.

And really, why bother? We let price be our primary selection criterion anyways, and we rely on "performance by app category" (are you encoding? are you using Matlab? are you big on Cinebench? etc.) to guide us towards a price/performance decision point from there.

And let's be honest with ourselves, anything AMD (or Intel) did in the name of reducing the complex issue of performance to a single metric (as in performance ratings) would just be met with harsh criticism and abject skepticism anyways, as we are all cynics when it comes to gift horses offered to us by anything with a profit motive.

So really AMD (and Intel) are just better off leaving it up to the reviewers (the ones we view as trustworthy 3rd parties in this cabal) to evaluate the products and tell us how they perform within each given application category.
 

JFAMD

Senior member
May 16, 2009
565
0
0
The definition of a "core" is blurring, and it looks like the industry will now have the unenviable job of accurately portraying performance to the consumer without being able to use the term "core" in an apples-to-apples way, just as GHz became a clearly inadequate measure of performance earlier.

Will AMD and Intel have the honesty to come up with new metrics that the average consumer will understand, or will they exploit the blurred lines for less-than-100%-accurate marketing purposes? I sincerely hope that both choose the former.

For consumers, the AMD Vision campaign is designed to help do that.

For server customers, typically they get their hands on the hardware and run their apps on them.
 

Martimus

Diamond Member
Apr 24, 2007
4,488
152
106
Whatever AMD is doing here, at least we can all agree it is definitely the start of something different, and different can be good if change is what is needed.

What I wonder is whether the ratio of 2-to-1 integer cores to FP cores is the right ratio. I wonder if 3-1 or 4-1 or 3-2 would be a more efficient ratio. In this way, GPUs appear more advanced, as their ratio of processor cores seems fixed to a specific expected outcome, while this ratio seems more arbitrary. Of course it only seems arbitrary, since I don't know the reasoning behind it, but I do wonder where this ratio was conceived, and what the numbers used to calculate the most efficient ratio were. I think we will see more efficient ratios in the future after this has gone through a few cycles, just like the core ratios on GPUs have gotten far more efficient over the years.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Will these Bulldozer CPUs be able to run on existing AM3 boards?
 

Mothergoose729

Senior member
Mar 21, 2009
409
2
81
Will these Bulldozer CPUs be able to run on existing AM3 boards?

From what I have read bulldozer will be released with a new AM3 chipset that has thus far been referred to as Rev2. I think the plan is an intermediary update, like from AM2 to AM2+

On the topic of bulldozer, I think some people here are forgetting that this is AMD's first new architecture since Phenom. Phenom II doesn't count because all they did was increase L3 cache density, reduce cache latency, and increase clock speed (I suppose you could try to make a case for Athlon II, but that doesn't really count either because those feature Phenom II cores). AMD likely won't be able to sweep Sandy Bridge, but we might see them keep Intel's flagship parts within reach.

Bobcat to me looks like a slam dunk. I have not read anything from Intel about adding out-of-order processing, SSE extensions, or virtualization to any Atom derivative anywhere. Unless Intel can die-shrink a Core2Duo and achieve similar performance and power, Bobcat will be a vastly superior chip. Even at a 1GHz clock speed it will thoroughly trounce 2.0GHz Atoms in most cases, and the AMD IGPs are light-years ahead of Intel in terms of raw power and efficiency.
 
Last edited:

cbn

Lifer
Mar 27, 2009
12,968
221
106
From what I have read bulldozer will be released with a new AM3 chipset that has thus far been referred to as Rev2. I think the plan is an intermediary update, like from AM2 to AM2+

So the answer is no?

What confuses me even more is this idea of an integrated GPU. Will this integrated GPU be coming with all Bulldozers or just some of them?

I am wondering if a non-integrated GPU Bulldozer will be compatible with existing AM3 mainboards.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Bobcat to me looks like a slam dunk. I have not read anything from Intel about adding out-of-order processing, SSE extensions, or virtualization to any Atom derivative anywhere. Unless Intel can die-shrink a Core2Duo and achieve similar performance and power, Bobcat will be a vastly superior chip. Even at a 1GHz clock speed it will thoroughly trounce 2.0GHz Atoms in most cases, and the AMD IGPs are light-years ahead of Intel in terms of raw power and efficiency.

Bobcat is 1-10 watts single core?

Well that definitely is low power consumption. What kind of products will this be going into?
 
Last edited:

veri745

Golden Member
Oct 11, 2007
1,163
4
81
What I wonder is whether the ratio of 2-to-1 integer cores to FP cores is the right ratio. I wonder if 3-1 or 4-1 or 3-2 would be a more efficient ratio. In this way, GPUs appear more advanced, as their ratio of processor cores seems fixed to a specific expected outcome, while this ratio seems more arbitrary. Of course it only seems arbitrary, since I don't know the reasoning behind it, but I do wonder where this ratio was conceived, and what the numbers used to calculate the most efficient ratio were. I think we will see more efficient ratios in the future after this has gone through a few cycles, just like the core ratios on GPUs have gotten far more efficient over the years.

K10 cores currently have a 128-bit-wide FPU. The 256-bit FPU in the Bulldozer module looks to be just two of those FPUs, one for each core when both are doing FP arithmetic, except it has the advantage of being able to devote the whole thing to a single core if the other is not using it.

So it's really the same ratio as in their current architecture with a potential upside for single-threaded FPU performance.
 

Mothergoose729

Senior member
Mar 21, 2009
409
2
81
So the answer is no?

What confuses me even more is this idea of an integrated GPU. Will this integrated GPU be coming with all Bulldozers or just some of them?

I am wondering if a non-integrated GPU Bulldozer will be compatible with existing AM3 mainboards.

Yes, the answer is no. My understanding is that AMD's mainstream answer, Llano, will feature IGPs across the board, similar to the IGP used in Intel's Clarksdale CPUs. This will effectively eliminate the north bridge completely (although I am not sure if they will route the PCIe bus through the chip as well, like Intel did with P55). I believe, and don't quote me, that the high-end Zambezi parts will not have IGPs and instead will have increased core counts and possibly higher cache densities. Maybe something similar to Athlon II and Phenom II, with the Athlon II having the IGP.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Yes, the answer is no. My understanding is that AMD's mainstream answer, Llano, will feature IGPs across the board, similar to the IGP used in Intel's Clarksdale CPUs. This will effectively eliminate the north bridge completely (although I am not sure if they will route the PCIe bus through the chip as well, like Intel did with P55). I believe, and don't quote me, that the high-end Zambezi parts will not have IGPs and instead will have increased core counts and possibly higher cache densities. Maybe something similar to Athlon II and Phenom II, with the Athlon II having the IGP.

So if they are eliminating the northbridge maybe we will see some good offerings in mini-DTX along with some switchable graphics.
 
Last edited:

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Bobcat to me looks like a slam dunk. I have not read anything from Intel about adding out-of-order processing, SSE extensions, or virtualization to any Atom derivative anywhere. Unless Intel can die-shrink a Core2Duo and achieve similar performance and power, Bobcat will be a vastly superior chip. Even at a 1GHz clock speed it will thoroughly trounce 2.0GHz Atoms in most cases, and the AMD IGPs are light-years ahead of Intel in terms of raw power and efficiency.

Intel has had ULV Core2Duo parts for some time, though I'm not sure if they have bothered producing anything sub-10W yet. As with any post-Netburst s775 chip (or any derivative thereof), the real issue is going to be chipset power draw more so than processor power draw. Bobcat will, presumably, be able to do things on-die within its stated power envelope that a die-shrunk ULV Core2Duo could not, hence the need for an on-board northbridge and IGP with any Core2Duo laptop (and all the power draw that would come with such things).

If anything, I would expect Intel to fight back with a mobile i3 variant or . . . something. Atom is more intended to go head-to-head with ARM chips so I wouldn't expect Intel to use it against Bobcat.
 

Mothergoose729

Senior member
Mar 21, 2009
409
2
81
Intel has had ULV Core2Duo parts for some time, though I'm not sure if they have bothered producing anything sub-10W yet. As with any post-Netburst s775 chip (or any derivative thereof), the real issue is going to be chipset power draw more so than processor power draw. Bobcat will, presumably, be able to do things on-die within its stated power envelope that a die-shrunk ULV Core2Duo could not, hence the need for an on-board northbridge and IGP with any Core2Duo laptop (and all the power draw that would come with such things).

If anything, I would expect Intel to fight back with a mobile i3 variant or . . . something. Atom is more intended to go head-to-head with ARM chips so I wouldn't expect Intel to use it against Bobcat.

Atom cannot compete against ARM; their chips are already becoming inferior in every respect. In 2011, dual-core ARM chips with 1GHz+ clock speeds and TDPs of a quarter of a watt will be available, and for even less cost. Even Tegra is better than the HD4500 or any other Intel graphics. Bobcat in the higher-clocked version can compete against mobile Celerons, but I see a separation happening in the mobile market; people will buy their fully functional Core 2s (or i3s by then) or they will buy the ultra-mobiles. Now Intel will still get far more sales than AMD no matter how inferior their product, but if AMD can really deliver with this chip, there is no way the current Atom architecture can come close to a combination of competent CPU power coupled with a good, DX11 IGP.
 

JFAMD

Senior member
May 16, 2009
565
0
0
What I wonder is whether the ratio of 2-to-1 integer cores to FP cores is the right ratio. I wonder if 3-1 or 4-1 or 3-2 would be a more efficient ratio. In this way, GPUs appear more advanced, as their ratio of processor cores seems fixed to a specific expected outcome, while this ratio seems more arbitrary. Of course it only seems arbitrary, since I don't know the reasoning behind it, but I do wonder where this ratio was conceived, and what the numbers used to calculate the most efficient ratio were. I think we will see more efficient ratios in the future after this has gone through a few cycles, just like the core ratios on GPUs have gotten far more efficient over the years.

Yes, that is a very good question. In talking to customers, I find their applications typically fall into 2 camps. ~80-90% fall into the "mainly integer" camp where they have little or no FP in their code, so 2:1 is a definite improvement because they save power. Going to less than that (like 3:1 or 4:1) *might* add too much latency because the FPU could get more traffic lining up behind it. If the results for an integer thread rely on a pre-determined result from the FPU, everything would slow down (but I am not a designer, I am guessing here).

The other applications (the 10-20% or so) are heavily FPU, and those customers are actually more concerned about getting GPUs in their system for massive FP calculations. In this case they are also thinking 2:1, but 2 GPUs for every CPU.

Over time, as the software becomes more sophisticated, the desire for FPU inside the system could potentially shrink further in favor of the GPU doing that work. But we still have a while before that trend becomes mainstream.

And again, I am a marketing guy, but that is the typical conversation that I have with customers.
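
A back-of-envelope way to picture the "traffic lining up behind the FPU" worry is to treat the shared FPU as a single queue fed by N integer cores. The Python sketch below is a toy M/M/1 model with made-up rates (fp_fraction, issue_rate, and fpu_service_rate are illustrative assumptions, not AMD figures); it only shows how average FP-op latency climbs as more integer cores share one FPU.

Code:
def fpu_wait_time(cores_per_fpu, fp_fraction=0.15, issue_rate=1.0, fpu_service_rate=1.0):
    """Mean cycles an FP op spends queued + executing at one shared FPU (M/M/1)."""
    arrival_rate = cores_per_fpu * issue_rate * fp_fraction   # total FP ops/cycle (lambda)
    utilization = arrival_rate / fpu_service_rate             # rho
    if utilization >= 1.0:
        return float("inf")                                   # FPU saturated
    return 1.0 / (fpu_service_rate - arrival_rate)            # M/M/1 residence time

for ratio in (1, 2, 3, 4):
    print("%d:1 int cores per FPU -> avg FP-op residence time %.2f cycles (toy model)"
          % (ratio, fpu_wait_time(ratio)))

With the assumed 15% FP mix the model stays far from saturation at 2:1 but the queuing delay grows noticeably by 4:1, which is the kind of latency penalty being guessed at above.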
 

evolucion8

Platinum Member
Jun 17, 2005
2,867
3
81
Even Tegra is better than the HD4500 or any other Intel graphics.

How can that be possible? None of the Tegra series, including the APX 2x00 series or the Tegra 6x0 series, has a very powerful dedicated GPU, because that would kill its purpose as an ultra-efficient technology in terms of power consumption. It doesn't even support full-fledged OpenGL or Direct3D, only their mobile counterparts. I couldn't find information on which GPU derivative it has, but it surely isn't powerful enough to even outperform an HD 3200 GPU or similar nVidia cards.
 

deputc26

Senior member
Nov 7, 2008
548
1
76
deputc, given that we don't even make much of an attempt to characterize single-threaded IPC as a function of instruction-set mix (execution speed of some 700+ instructions!?) between any two competing microarchitectures, I doubt we will see much progress in the direction you want things to go once we further incorporate the effects of multiple threads operating simultaneously.

And really, why bother? We let price be our primary selection criterion anyways, and we rely on "performance by app category" (are you encoding? are you using Matlab? are you big on Cinebench? etc.) to guide us towards a price/performance decision point from there.

And let's be honest with ourselves, anything AMD (or Intel) did in the name of reducing the complex issue of performance to a single metric (as in performance ratings) would just be met with harsh criticism and abject skepticism anyways, as we are all cynics when it comes to gift horses offered to us by anything with a profit motive.

So really AMD (and Intel) are just better off leaving it up to the reviewers (the ones we view as trustworthy 3rd parties in this cabal) to evaluate the products and tell us how they perform within each given application category.

I fully agree with what you are saying; I didn't state my concern clearly, and in retrospect I was beating around the bush a bit.

When a company misrepresents, even a little, the performance of a product, they invariably profit in the short term but receive backlash from the reviewers (and opinion leaders, us), which has a long-term and very-difficult-to-quantify but very real effect not only on sales but also on the reputation of the company and, by extension, the industry. I tried to be brand-agnostic in my first post, but I am worried that AMD will advertise "more cores than Intel", which will fool the average consumer into thinking that this means "better than Intel" (and maybe it will be), while really each bulldozer "module" should *honestly* be referred to as just 1 really awesome core, or maybe 1.5 cores. I really do wish the best for AMD and I don't want to see them lose credibility with short-sighted marketing.
 
Last edited:

JFAMD

Senior member
May 16, 2009
565
0
0
Benchmarks with raw performance are always misleading; they are not exact and do not necessarily scale out to all configurations and all environments.

TPC-C, for instance, is supposed to show database performance. But the configurations are for ~1200 hard drives, each with a small stripe of data for performance only. Is that realistic? VMmark shows hundreds of virtual machines with less than 1GB of memory per machine when the average system is 8-10GB. Is that realistic?

I prefer looking at actual performance in actual systems. Plus you need to look at power and price.
 

Mothergoose729

Senior member
Mar 21, 2009
409
2
81
How can that be possible? None of the Tegra series, including the APX 2x00 series or the Tegra 6x0 series, has a very powerful dedicated GPU, because that would kill its purpose as an ultra-efficient technology in terms of power consumption. It doesn't even support full-fledged OpenGL or Direct3D, only their mobile counterparts. I couldn't find information on which GPU derivative it has, but it surely isn't powerful enough to even outperform an HD 3200 GPU or similar nVidia cards.

I meant in terms of efficiency. Despite being so low power and lacking a lot of features, it is still able to do 720p decoding, and future releases will have full 1080p playback. Without future updates it won't be long at all before the HD4500 is inferior to everything, let alone the 945GSE. I guess the point I was trying to make is that Intel's current Atom + 945GSE package is no longer going to cut it, even for the netbook crowd. Pineview might have a better IGP than current ITX chipsets, but I have to wonder how much power they can pack into an Intel chip with crappy Intel graphics technology.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
I fully agree with what you are saying; I didn't state my concern clearly, and in retrospect I was beating around the bush a bit.

When a company misrepresents, even a little, the performance of a product, they invariably profit in the short term but receive backlash from the reviewers (and opinion leaders, us), which has a long-term and very-difficult-to-quantify but very real effect not only on sales but also on the reputation of the company and, by extension, the industry. I tried to be brand-agnostic in my first post, but I am worried that AMD will advertise "more cores than Intel", which will fool the average consumer into thinking that this means "better than Intel" (and maybe it will be), while really each bulldozer "module" should *honestly* be referred to as just 1 really awesome core, or maybe 1.5 cores. I really do wish the best for AMD and I don't want to see them lose credibility with short-sighted marketing.

Yep, this will transition from a "core war" to a "thread war"...which is all it ever really was to begin with, but since cores were synonymous with threads in the x86 world until the advent of hyperthreading, there was no penalty for failing to make the distinction.

What hyperthreading and modules are going to do is push the vernacular away from "core performance" to "thread performance". The metrics remain the same, we just shift the labels appropriately so as to more correctly refer to what matters.

We've never really generalized the term "core" enough to begin with anyways for even the label "cores" to make much sense when comparing two disparate microarchitectures. We ended up with core vs. uncore, northbridge, IMC, shared L3$, etc...all these labels to try and make it more clear what wasn't in the "core" while at the same time never really getting down to brass tacks and spelling out explicitly what is in the core as a definition.

Is a 3-stage scheduler with a 2-stage decoder and single-fetch "core" worthy of the label "core" when compared to a 5-stage scheduler with a 3-stage decoder and a dual-stage fetch "core"? (I am simply making up stage counts here to make a point, no architecture really has these configs)

All these stages and issuers, etc., are just resources available to improve the sustainability of a given instruction stream's IPC within an executing thread. We care about thread count and the performance of that thread...knowing intimate details of the architecture that enables the ISA and the performance is nice, but at the end of the day we (the consumers) should never have allowed ourselves to get swept up in categorizing microarchitectural subset features as belonging to "cores".

Granted it doesn't help that the world+dog uses the vernacular "core" to describe sets of discrete functional hardware resources in a given microarchitecture, including the engineers themselves. But this vernacular is undoubtedly going to give way to the better descriptive label of "thread count" in place of "core count" IMO.
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
Bobcat is 1-10 watts single core?

Well that definitely is low power consumption. What kind of products will this be going into?

130nm Athlon XPs managed sub-10W power consumption...if you undervolted and underclocked them as far as they would go (300MHz, and they'll run passively cooled at that).
A low-power CPU isn't worth much without knowing its performance profile.

A single-core Athlon 64, undervolted and underclocked, will run at 800MHz and use around 8W iirc. An Athlon X2 roughly doubles that. There was actually an OQO-sized device that used an Athlon X2, undervolted and underclocked to 800MHz.
This http://cgi.ebay.com/Everun-Note-dua...aptops_Nov05?hash=item2ea92fbbe5#ht_962wt_927

AMD could probably produce sub 10W processors right now on its 45nm process, but the performance class might not warrant the cost to produce (though somehow Intel gets through it with its LV and ULV chips). To get down to an Atom level of power consumption (4W per core), they'd probably have to go down to an Atom level of performance, but at a much larger die size.
 

Martimus

Diamond Member
Apr 24, 2007
4,488
152
106
Yes, that is a very good question. In talking to customers, I find their applications typically fall into 2 camps. ~80-90% fall into the "mainly integer" camp where they have little or no FP in their code, so 2:1 is a definite improvement because they save power. Going to less than that (like 3:1 or 4:1) *might* add too much latency because the FPU could get more traffic lining up behind it. If the results for an integer thread rely on a pre-determined result from the FPU, everything would slow down (but I am not a designer, I am guessing here).

The other applications (the 10-20% or so) are heavily FPU, and those customers are actually more concerned about getting GPUs in their system for massive FP calculations. In this case they are also thinking 2:1, but 2 GPUs for every CPU.

Over time, as the software becomes more sophisticated, the desire for FPU inside the system could potentially shrink further in favor of the GPU doing that work. But we still have a while before that trend becomes mainstream.

And again, I am a marketing guy, but that is the typical conversation that I have with customers.

I appreciate your response. If you were the engineer in charge of this decision, I would have the answer I was looking for: it was an arbitrary change. However, since you are a self-proclaimed "marketing guy", it makes sense that you wouldn't know the actual reasoning behind the specific ratio chosen.

This change appears to be an obvious one to me, and considering how many architectures have had multiple cores, it is about the right time to start differentiating the types of processing that can be done in each. I am sure Intel will do something similar in the future as well, although I think AMD has the advantage of having experience with ATI in making these changes for this first go-around. This last reason is the one that leads me to believe this ratio is not arbitrary, but since it is in such small numbers I doubt it is as optimized as the engineers would like. (You can't make a 1-to-0.65 ratio, for instance, because you only have a small number of total cores.)
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Yep, this will transition from a "core war" to a "thread war"...which is all it ever really was to begin with, but since cores were synonymous with threads in the x86 world until the advent of hyperthreading, there was no penalty for failing to make the distinction.

What hyperthreading and modules are going to do is push the vernacular away from "core performance" to "thread performance". The metrics remain the same, we just shift the labels appropriately so as to more correctly refer to what matters.

Then the question becomes what is the computational power available per thread?

Something tells me it is more power efficient to divide total computational power into more and more threads (up to a point). If this is true then practical computational power (for a light user like me) becomes relatively low if the CPU is too big for my needs.

For this reason I almost wish AMD would release a dual core 32nm Bulldozer. With this 80% scaling hyperthreading such a CPU would almost act like a native quad core.
 
Last edited:

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
Then the question becomes what is the computational power available per thread?

Something tells me it is more power efficient to divide computational power into more and more threads (up to a point). If this is true then practical computational power (for a light user like me) becomes low unless the processor is relatively small.

Given the nature of the equations of the device physics that underlie power consumption, current CMOS-based xtor processors most certainly benefit from operating at the lowest clockspeed and lowest Vcc possible, while having more of those xtors working in parallel if necessary to get a particular calculation done per unit time.

Consider that for a given Vcc and clockspeed the power consumption scales linearly with xtor count (assuming replication of active architectural functions, of course), whereas increasing Vcc and frequency together results in a roughly cubic relationship to power consumption.

So if given the choice between doubling the number of cores (threads) to get 2x the work done per unit time (assuming we have a coarse-grained dual-threaded application in mind), or increasing the clockspeed and Vcc as needed (it will need to be >2x, as the fixed memory-subsystem latency hinders clockspeed-scaling efficiency) so we can execute and retire both threads serially in the same period of time on a single-core (single-thread) processor, we would always choose the higher thread/core-count processor if power consumption were our primary concern.

However, doubling the xtor count means the cost of production increases (and non-linearly so, to the disfavor of larger die), whereas doubling the operating frequency of the xtors does not necessarily incur a doubling of the production costs...so from a design and manufacturing viewpoint, simply making super-low-clockspeed CPUs the size of wafers for massively threaded applications isn't exactly the path to pots of gold either.
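
A minimal numeric sketch of that trade-off, assuming the textbook dynamic-power relation P ~ C*V^2*f and that Vcc has to rise roughly in step with frequency to stay stable (both simplifications; the numbers are normalized, not measurements):

Code:
def dynamic_power(cap, vcc, freq):
    # CMOS dynamic power: switched capacitance * Vcc^2 * frequency
    return cap * vcc**2 * freq

BASE_C, BASE_V, BASE_F = 1.0, 1.0, 1.0            # normalized single-core baseline
base = dynamic_power(BASE_C, BASE_V, BASE_F)

# Option A: double the cores (double switched capacitance), same Vcc and clock.
two_cores = dynamic_power(2 * BASE_C, BASE_V, BASE_F)

# Option B: one core at 2x clock; assume Vcc must rise ~linearly with frequency,
# so Vcc also doubles -> roughly cubic growth in power.
one_fast_core = dynamic_power(BASE_C, 2 * BASE_V, 2 * BASE_F)

print("baseline             : %.1fx" % base)           # 1.0x
print("2 cores, same clock  : %.1fx" % two_cores)       # 2.0x (linear)
print("1 core, 2x clock/Vcc : %.1fx" % one_fast_core)   # 8.0x (cubic)

On this toy model the doubled-core option costs 2x the baseline power while the doubled-clock option costs ~8x, which is the linear-versus-cubic gap described above.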

For this reason I almost wish AMD would release a dual core 32nm Bulldozer. With this 80% scaling hyperthreading such a CPU would almost act like a native quad core.

A dual-core Zambezi (that is, a CPU with one Bulldozer module) with 80% thread scaling means it almost acts like a native dual-core CPU.

The quad-core/quad-thread zambezi (2 BD modules) would be compared to a dual-core/quad-thread clarkdale (2 westmere cores with HT).

Likewise the octo-core/octo-thread zambezi (4 BD modules) would be compared to the quad-core/octo-thread sandy-bridge.