Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

Page 63
Status
Not open for further replies.

Rossini

Junior Member
Oct 21, 2010
22
0
0
Thinking about building a rig here in a couple of months.

Would it be worth waiting for BD? Or just stick with Sandy Bridge?
 

SickBeast

Lifer
Jul 21, 2000
14,377
19
81
Thinking about building a rig here in a couple of months.

Would it be worth waiting for BD? Or just stick with Sandy Bridge?
It's hard to tell. Bulldozer definitely looks interesting at this point. I would be surprised to see it perform worse than an i7, particularly when it comes to server workloads.
 

nonameo

Diamond Member
Mar 13, 2006
5,902
2
76
Thinking about building a rig here in a couple of months.

Would it be worth waiting for BD? Or just stick with Sandy Bridge?

While I don't particularly love Intel (I don't love AMD either), I think it is safe to say that you cannot go wrong (performance-wise) with a 2500K or 2600K. It's not like Bulldozer is going to be half the price and double the performance. At worst, you overspend a bit. But hey, high-end computer stuff devalues quickly anyway. The only reason I'd wait for Bulldozer is if you do things that are highly multithreaded.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
So what should a Bulldozer Version 2 look like to fix the upcoming problems of AMD?

1.) Consider how to fix CMT.
The only advantage I see in CMT is the sharing of a vector unit by two cores. So the options are either to get away from CMT or to extend it to two decoder units and two I-caches. But at that point, adding another vector unit would cost little more and would remove all the special handling needed because of CMT.
2.) Implement SMT (costs little/gains a lot)
3.) Fix integer SSE
4.) Add an ALU unit/pipe to the integer core (costs next to nothing, gains like hell)
5.) Reconsider the high-frequency design - is it really worth it?
6.) Get the abnormally high uncore die consumption fixed.

Some of which is probably coming w/BD II @ 22nm (whenever that will actually be available).
 

podspi

Golden Member
Jan 11, 2011
1,982
102
106
Given JFAMD's insistence that IPC increases, I'm still going to consider Stars' IPC to be the lower-bound on credible IPC estimations...


Even Phenom II @ 4.5ghz base would be impressive, though. That would be 5ghz minimum turbo under most conditions :D
 

nonameo

Diamond Member
Mar 13, 2006
5,902
2
76
Given JFAMD's insistence that IPC increases, I'm still going to consider Stars' IPC to be the lower-bound on credible IPC estimations...


Even Phenom II @ 4.5ghz base would be impressive, though. That would be 5ghz minimum turbo under most conditions :D

I don't know, Intel has had a lot of time to work on their HKMG and the 2600K is only running at a 3.4 GHz standard clock. I'd be surprised if Bulldozer had a much higher base clock.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
I don't know, Intel has had a lot of time to work on their HKMG and the 2600K is only running at a 3.4 GHz standard clock. I'd be surprised if Bulldozer had a much higher base clock.

2600K is a 95W TDP SKU.

AMD could roll out a 140W TDP SKU (they've had such a TDP tier for years) that outclocks Intel's 95W TDP SKU if they wanted.

The question I have is what response Intel would make if that were to occur.
 

HW2050Plus

Member
Jan 12, 2011
168
0
0
Given JFAMD's insistence that IPC increases, I'm still going to consider Stars' IPC to be the lower-bound on credible IPC estimations...


Even Phenom II @ 4.5ghz base would be impressive, though. That would be 5ghz minimum turbo under most conditions :D
This 4.5 GHz was the actual clock, meaning Turbo for all cores is included and not on top.

And despite JFAMD's insistence that IPC increases, I doubt that. Given the latest information from AMD, from the developer manual and other sources before it, I do not see any possibility for an IPC increase. However, they did a lot of good things to keep the IPC decrease to a minimum, and I already included that in the calculations.

Again, there is no problem in Bulldozer itself. The problem is die space consumption vs. the chosen design. Bulldozer could be great if it used 200 mm². It is also a TDP question. This is still open, though, as we really know nothing about TDP yet. The Bulldozer design could be a TDP issue as well. It could also turn out to be a stroke of genius regarding TDP; we will have to see.

2600K is a 95W TDP SKU.

AMD could roll out a 140W TDP SKU (they've had such a TDP tier for years) that outclocks Intel's 95W TDP SKU if they wanted.

The question I have is what response Intel would make if that were to occur.
The response is already in the pipeline and it is Sandy Bridge EN. This part will easily outperform the 4.5 GHz clocked Zambezi, and it needs only a modest clock to do so. The problem is that AMD needs a lot of clock to compensate for the slower cores.

AMD will do fine in the rest of 2011 and maybe in Q1/2012, but then it is already over and they will again struggle hard to stay competitive, at least in the non-APU market. In the APU market it looks great.

The Bobcat core looks well done and Llano will be another success. The reason for Llano's success will be that it does NOT use the Bulldozer core.
 

Riek

Senior member
Dec 16, 2008
409
15
76
This 4.5 GHz was the actual clock, meaning Turbo for all cores is included and not on top.

And despite JFAMD's insistence that IPC increases, I doubt that. Given the latest information from AMD, from the developer manual and other sources before it, I do not see any possibility for an IPC increase. However, they did a lot of good things to keep the IPC decrease to a minimum, and I already included that in the calculations.

Again, there is no problem in Bulldozer itself. The problem is die space consumption vs. the chosen design. Bulldozer could be great if it used 200 mm². It is also a TDP question. This is still open, though, as we really know nothing about TDP yet. The Bulldozer design could be a TDP issue as well. It could also turn out to be a stroke of genius regarding TDP; we will have to see.


The response is already in the pipeline and it is Sandy Bridge EN. This part will easily outperform the 4.5 GHz clocked Zambezi, and it needs only a modest clock to do so. The problem is that AMD needs a lot of clock to compensate for the slower cores.

AMD will do fine in the rest of 2011 and maybe in Q1/2012, but then it is already over and they will again struggle hard to stay competitive, at least in the non-APU market. In the APU market it looks great.

The Bobcat core looks well done and Llano will be another success. The reason for Llano's success will be that it does NOT use the Bulldozer core.

As long as the official point is IPC BD > Deneb core, your assumptions, calculations and whatever are worthless.

What the PDF showed (if that part was correct) is that one BD core has more execution resources than the previous design. Execution resources were the basis of the lower performance statements made before that was known.

(That is besides the load/store capabilities of one BD core being equal to those of one SB core.)


edit: You call this math????

BDPerf = PIIPerf * 0.8 // -Reduction in core capability, +Core Improvements
BDPerf = PIIPerf * 0.8 * 1.2 // + Higher clock (4.5 GHz), - cost of high freq. design
BDPerf = PIIPerf * 0.8 * 1.2 * 1.8 // CMT
results in:
BDPerf = PIIPerf * 1.728
means a Bulldozer is 1.7 to 1.8 times faster than a Phenom II
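
The chain of multipliers quoted above can be checked with a few lines of Python. This is only a sketch of the quoted arithmetic: the factor names are made up for illustration, and the values are the poster's own guesses, not measurements.

```python
# Multiplicative factors from the quoted estimate (guesses, not benchmark data).
# The dictionary keys are illustrative labels, not terms from any AMD document.
FACTORS = {
    "core_change": 0.8,      # reduced core capability, net of core improvements
    "clock_net_gain": 1.2,   # higher clock (4.5 GHz) minus cost of high-freq design
    "cmt_scaling": 1.8,      # second core in the module via CMT
}


def relative_performance(factors):
    """Multiply all factors together to get BD performance relative to Phenom II."""
    result = 1.0
    for value in factors.values():
        result *= value
    return result


bd_vs_pii = relative_performance(FACTORS)
print(f"BDPerf = PIIPerf * {bd_vs_pii:.3f}")
```

Changing any single factor (for example 0.8 to 0.9 or 0.7) shows how sensitive the final ratio is to each guess.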


I will not comment on your gibberish data, but basically you say:

BD performance is 80% due to the reduction in core capabilities, and this already includes possible improvements.

Then you say BD = that performance * frequency advantage - the cost of a high-frequency design. So basically you nerf the design choices two times? Once for the changed core, where you use a gibberish number, and later again because ... ?
 
Last edited:

podspi

Golden Member
Jan 11, 2011
1,982
102
106
The problem is die space consumption vs. the chosen design. Bulldozer could be great if it used 200 mm². It is also a TDP question. This is still open, though, as we really know nothing about TDP yet. The Bulldozer design could be a TDP issue as well. It could also turn out to be a stroke of genius regarding TDP; we will have to see.


Bulldozer is coming in a little bit larger than I would have expected. I find it hard to believe they would have worked on the Bulldozer design this long just to end up with a product that fails on some of the most important metrics, size (cost of manufacture) and IPC. Bulldozer is supposedly a "speedracer" design, but we all know what happens if you just go for "Speedracer" while ignoring everything else :D . Given Intel's ample headroom (and the expectation that they would have such headroom) it is hard to believe AMD would paint themselves into such a corner (unless BD ends up being able to clock stupid high).

Second, if they decided to throw singlethread performance to the wolves and focus on throughput, you could make the argument that slower (but smaller) cores made sense, but Bulldozer's cores aren't small enough (imho) to indicate that is the direction AMD decided to go.

I think a more likely scenario is that IPC has increased (No idea how much, hence the lower-bound to Stars), and AMD has used the module technique to keep die-sizes under control. My intuition (and nothing else, so don't put too much stock into it) is that Bulldozer will achieve Nehalem-level IPC. 8 Nehalem cores would be much larger than Zambezi @ 32nm, and so Bulldozer should be a success.


Of course, if HW2050Plus is correct now we know why Dirk got the boot D: But I don't think so...
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
And despite JFAMD's insistence that IPC increases, I doubt that. Given the latest information from AMD, from the developer manual and other sources before it, I do not see any possibility for an IPC increase. However, they did a lot of good things to keep the IPC decrease to a minimum, and I already included that in the calculations.

Presumably AMD is making their statements from a position of information and data given that they actually have all the requisite information at their disposal when making their statements.

However you are making your statements from a position of known ignorance as surely you aren't about to claim to know as much about Bulldozer's microarchitecture and performance as AMD does.

I don't get it. You are clearly an industrious individual and adept at logical reasoning. If you have a boundary condition (BD IPC > PhII IPC), then you should endeavor to drill down through the logic tree of everything that must be true about the microarchitecture (while accounting for the public information) for the boundary condition to hold in a self-consistent manner.

You have gone to great lengths to identify and characterize all the "1 step back" attributes of the architecture, now endeavor to identify what that must imply about all the unpublished "2 steps forward" attributes of the same architecture if, in the end, they ended up being a net "1 step ahead" when it was all said and done.

Inductive reasoning has its merits when one is operating in a data deficient environment where the outcome is a known (BD IPC > PhII IPC) but the supporting cause-and-effect information is lacking (how is it possible!?).
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
On one side, you have people believing benchmarks that show 2x gains at the same core count, then you have this (referring to the comparison made with Llano), which is almost as ridiculous.

(No real point to this post, just saying. :hmm:)
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
So, again, not so surprisingly, we will have to wait till there are some reasonably well documented benchmarks leaked before we will have any real clue as to the true potential performance of BD :|.

Merry-go-rounds are fun - unless you stay on too long and toss your cookies.
 

Janooo

Golden Member
Aug 22, 2005
1,067
13
81
...

4 BD module = 120 mm²
8 MB L3 cache = 60 mm²
Uncore = 100 mm²
~280 mm² in total

So let's see what Llano could do with that die space:
8 Llano cores = 80 mm²
12 MB L3 cache = 90 mm²
Uncore = 100 mm²
~270 mm²

...
So one Llano core is 10 mm² and one BD core is 15 mm²?
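
The per-core figures asked about here follow from simple division, assuming 4 BD modules means 8 cores (2 per module) as discussed in the thread. A rough sketch: the variable names are illustrative, and the mm² values are the estimates quoted above, not official die measurements.

```python
# Die area estimates in mm², copied from the quoted breakdown (not official figures).
bd_areas = {"cores": 120, "l3_cache": 60, "uncore": 100}
llano_areas = {"cores": 80, "l3_cache": 90, "uncore": 100}

BD_MODULES = 4
CORES_PER_MODULE = 2   # assumption: each Bulldozer module counts as two cores
LLANO_CORES = 8

bd_total = sum(bd_areas.values())
llano_total = sum(llano_areas.values())

# Area per core under the assumptions above.
bd_per_core = bd_areas["cores"] / (BD_MODULES * CORES_PER_MODULE)
llano_per_core = llano_areas["cores"] / LLANO_CORES

print(f"BD total: {bd_total} mm², per core: {bd_per_core} mm²")
print(f"Llano total: {llano_total} mm², per core: {llano_per_core} mm²")
```

Under these numbers the hypothetical Llano-based die is slightly smaller overall even though it carries 4 MB more L3, which is the comparison the question is probing.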
 

HW2050Plus

Member
Jan 12, 2011
168
0
0
I will not comment on your gibberish data, but basically you say:

BD performance is 80% due to the reduction in core capabilities, and this already includes possible improvements.

Then you say BD = that performance * frequency advantage - the cost of a high-frequency design. So basically you nerf the design choices two times? Once for the changed core, where you use a gibberish number, and later again because ... ?
Yes that is because:
1.) They reduced the integer core from a 3-pipeline design (PII) to a 2-pipeline design (BD). This is somewhat new information, which I assembled from pieces of information about the decoder and from the new AMD optimization manual. They have 4 pipelines of (now) µops, which are able to process 2 MacroOPs. In PII we have 3 pipelines of MacroOPs, which are able to process 3 MacroOPs.
2.) They increased instruction latencies to enable a high-frequency design.

These are two different issues, and both reduce IPC per core.

Of course I cannot say whether 80% is correct or whether it is e.g. 90% or even 70%. The point is the two issues above plus what we know about the die size. That is less throughput, and the reduced throughput is also slower! I cannot imagine any trick by which they could compensate for this regarding IPC. So they did it with what is left -> high frequency.

And besides that, I don't care what AMD marketing says about AMD processors. I know those statements and I am writing this with them in mind. All of this is based on information from AMD engineers: either the info from the optimization manual or the official statement of an AMD engineer at ISSCC stating that IPC goes down. Several pages back, around ISSCC, you can see a post where I confronted JFAMD with the statement of one of their engineers. And that was the engineer presenting the BD module at ISSCC.

As long as the official point is IPC BD > Deneb core, your assumptions, calculations and whatever are worthless.
As you see, the OFFICIAL point is that IPC Deneb > BD!
As JFAMD posts all the time, his statements are his private view!
 
Last edited:

HW2050Plus

Member
Jan 12, 2011
168
0
0
If you don't see anything wrong with that then OK.
If you keep in mind that such a core from a half module cannot work without the other half of the module (second core) then it is okay. Where e.g. with Llano you could build a single (3, 5, 7) core system.
 

Janooo

Golden Member
Aug 22, 2005
1,067
13
81
If you keep in mind that such a core from a half module cannot work without the other half of the module (second core) then it is okay. Where e.g. with Llano you could build a single (3, 5, 7) core system.
Why would AMD bother with BD module/core stuff if one core was bigger than the old one?
 

Abwx

Lifer
Apr 2, 2011
11,885
4,873
136
According to some posters,

1. AMD is stupid, of course.
2. AMD is replacing K10 with a lower-performing CPU.
3. Of course, AMD thinks that no one will notice it.
4. AMD's true goal is to be bankrupt as soon as possible.
 

formulav8

Diamond Member
Sep 18, 2000
7,004
523
126
According to some posters,

1. AMD is stupid, of course.
2. AMD is replacing K10 with a lower-performing CPU.
3. Of course, AMD thinks that no one will notice it.
4. AMD's true goal is to be bankrupt as soon as possible.


Yuk Yuk Yuk :D
 