Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

drizek · May 3, 2011

krumme said:
One module is 31 mm², what is the rest?
As the L2 is quite large, will we see quad cores sans L3?
Can there be different L3 types?

Those questions have already been asked and... we pretty much don't know.

I'm wondering how the 10-core version will work. Where will they put the extra module? Will they put it in the middle and then have L3 at its cardinal points?

fire400 · May 3, 2011

cooling these things in workstation laptops with OC function.. hehe

RobertPters77 · May 3, 2011

I'm wondering if the BD module is actually a single super core that can partition itself into two 'lesser' cores. And likewise if the Modules on the die can merge into one hyper core.

I'd be surprised if it could do that. But I'm not expecting it to do so.

JFAMD · May 4, 2011

RobertPters77 said:
I'm wondering if the BD module is actually a single super core that can partition itself into two 'lesser' cores. And likewise if the Modules on the die can merge into one hyper core.

I'd be surprised if it could do that. But I'm not expecting it to do so.

In a word, "no"

RobertPters77 · May 4, 2011

"No" in what sense?

itsmydamnation · May 4, 2011

RobertPters77 said:
"No" in what sense?

In the sense that each core has its own L1D, its own scheduler, and its own set of int pipelines.

Cerb · May 4, 2011

RobertPters77 said:
"No" in what sense?

A 4-way superscalar OoO ALU with deep pipelines and multithreading? Talk about complexity. Even worse, the vast majority of code tends towards 1 IPC (COTS software, straight from a compiler, not an assembly line or intrinsic in sight), and more threads sharing execution resources means needing more register and cache resources, making it a simpler trade-off to have more, simpler cores, even at the expense of the occasional high-IPC code (MLP is the Way of the Future(tm), and we are going to get there by pounding square ISAs into round holes 🙂). High-IPC code, OTOH, can always benefit from having the extra instructions issued the next cycle, which BD appears to be betting on. I wouldn't be surprised, as well, if keeping enough instructions issuing at any given time has to do with the fast caches, too (not "Intel" fast, but fast instead of dense).

Dresdenboy · May 4, 2011

krumme said:
One module is 31 mm², what is the rest?
As the L2 is quite large, will we see quad cores sans L3?
Can there be different L3 types?

Did you try my link? There are: L3 caches, DDR3 interface pads, 4 HT links, a northbridge, some PLLs, and a lot of other area, possibly not used or only used for wiring (not seen in the shown layers).

krumme · May 4, 2011

Dresdenboy said:
Did you try my link? There are: L3 caches, DDR3 interface pads, 4 HT links, a northbridge, some PLLs, and a lot of other area, possibly not used or only used for wiring (not seen in the shown layers).

Yes. I was wondering how small BD could become on 32nm, as that would say something about what market it could address.

Dresdenboy · May 4, 2011

krumme said:
Yes. I was wondering how small BD could become on 32nm, as that would say something about what market it could address.

A 2-module quad core with fewer HT links and 4MB L3 might land well below 140mm^2.

Ajay · May 4, 2011

More from Donanim Haber:

http://www.donanimhaber.com/islemci/haberleri/iste-AMDnin-8-cekirdekli-Bulldozer-FX-islemcisi-icin-test-sonuclari.htm

Martimus · May 4, 2011

That says that the ambiguous 8 Core FX processor will have an approximately equivalent PCMark Vantage score to the i7-2600K processor.

I really have no idea what that means for other applications (I don't use PCMark Vantage). Does anyone else have an idea what the Vantage score equates to in other, actually functional applications?

nonameo · May 4, 2011

Wow... and using a discrete graphics card with it to make the bar longer. That's shady. I don't like that 🙁

edit: however, note that Denebs have trouble even keeping up with the dual-core Sandy Bridges in PCMark, so if this is true, it probably bodes well for other benchmarks. Also, considering that Thuban does not do any better than Deneb (well, Deneb has a 200MHz advantage though...), I think it is safe to say that PCMark is not benefiting much from those extra 2 cores on the Thuban. So, perhaps Bulldozer is keeping up with Sandy on the IPC level...

And yeah, I know it's just a synthetic benchmark but I wanna dream so XD entertain me 😛

Also, 4 core bulldozer should be really interesting if it can deliver most of the performance of an 8 core bulldozer in apps that only use 4 or less threads.

I know it's toms but I couldn't find any vantage benchmarks on AT

http://www.tomshardware.com/reviews/core-i7-990x-extreme-edition-gulftown,2874-4.html

http://www.tomshardware.com/reviews/sandy-bridge-core-i7-2600k-core-i5-2500k,2833-12.html

Mugenx · May 4, 2011

nonameo said:
Wow... and using a discrete graphics card with it to make the bar longer. That's shady. I don't like that 🙁

I noticed it too.

@that test. They might have used a discrete card with the Intel procs too but AMD forgot to list it, no?

Is AMD saying on that slide that AMD proc + discrete >>>> Intel proc + IGP?

(sic)Klown12 · May 4, 2011

They had to use a discrete with the Bulldozer chips since they're the only ones listed that don't have an integrated GPU. If they were trying to completely mislead people, they would have also used a discrete GPU with the Llano chips and used a high-end GPU with the FX.

dma0991 · May 4, 2011

An 8-core Bulldozer just to match the score of a 4-core Sandy? I was expecting better results, and adding a discrete GPU to the test and making the bar longer is just misleading. :\

podspi · May 4, 2011

dma0991 said:
An 8-core Bulldozer just to match the score of a 4-core Sandy? I was expecting better results, and adding a discrete GPU to the test and making the bar longer is just misleading. :\

I don't think the benchmark scales above four threads, so the number of threads/cores doesn't really come into this. This is excellent news if true, since BD would be faster than the 2600K (in throughput) even if single-thread performance didn't change from Stars.

I think AMD should have used a 4290 in those benchmarks but... It isn't supposed to be public marketing slides anyway, so I won't complain too loudly yet.

(sic)Klown12 · May 4, 2011

podspi said:
I don't think the benchmark scales above four threads, so the number of threads/cores doesn't really come into this. This is excellent news if true, since BD would be faster than the 2600K even if single-thread performance didn't change from Stars.

I think AMD should have used a 4290 in those benchmarks but... It isn't supposed to be public marketing slides anyway, so I won't complain too loudly yet.

That's what I've gathered too. I don't know why they didn't use the new 3D Mark 11 which has much better CPU scaling when running the CPU physics test.

smartpatrol · May 4, 2011

BREAKING NEWS:

AMD Bulldozer CPU + discrete Radeon 6670 beats Sandy Bridge with integrated graphics. You heard it here first.

Okay, the PCMark score is impressive, but that graph is extremely misleading.

Martimus · May 4, 2011

The other thing that shows is that the A8 (Llano) is only ~60% as fast as an i7-2600K or 8 core FX CPU. It also shows that an A8 is only about 70% as fast as a 1100T Thuban CPU. That is somewhat disappointing for the fastest Llano chip.
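For what it's worth, those two percentages also pin down where the slide puts Thuban relative to the 2600K. A quick sketch of the arithmetic, assuming the ~60% and ~70% figures are read off the bars correctly (they are eyeballed, not published scores):

```python
# Approximate relative PCMark Vantage scores as read off the slide (eyeballed).
a8_vs_2600k = 0.60  # A8 is ~60% as fast as an i7-2600K / 8 core FX
a8_vs_1100t = 0.70  # A8 is ~70% as fast as a 1100T Thuban

# If both ratios hold, Thuban's score relative to the 2600K follows directly.
thuban_vs_2600k = a8_vs_2600k / a8_vs_1100t
print(f"1100T relative to i7-2600K: {thuban_vs_2600k:.0%}")
```

So the slide would imply the 1100T lands at roughly 86% of the 2600K in this benchmark, which is the kind of number worth checking against independent Vantage results.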

Idontcare · May 4, 2011

Martimus said:
The other thing that shows is that the A8 (Llano) is only ~60% as fast as an i7-2600K or 8 core FX CPU. It also shows that an A8 is only about 70% as fast as a 1100T Thuban CPU. That is somewhat disappointing for the fastest Llano chip.

I don't believe the graph. Doesn't pass the sniff test which includes the observations you are making here.

dma0991 · May 4, 2011

podspi said:
I don't think the benchmark scales above four threads, so the number of threads/cores doesn't really come into this. This is excellent news if true, since BD would be faster than the 2600K (in throughput) even if single-thread performance didn't change from Stars

What you said makes sense as the i3 2100 with 2C/4T is not that far behind compared to the Core i7 2600K which means that the test is somewhat limited to 4 threads/cores.

HW2050Plus · May 4, 2011

Martimus said:
Well, it is still smaller than the Thuban, which is said to be 346 mm²: http://en.wikipedia.org/wiki/Phenom_II

Sure it is smaller, it is on a new 32 nm SOI process. The absolute die size itself is not the problem (my estimate for the 8-core Zambezi is 280 mm²; others get 292/294 from the same die shot). The problem is die size relative to performance. The key point is that Intel's Sandy Bridge is extremely small, which means they can just add more cores if needed. In that respect AMD has more trouble with Bulldozer. What's more, Bulldozer's die appears unnecessarily large, likely because they use standard cells for the uncore or something like that; it is not optimized. And the smaller the die, the better the yield. Last but not least, die size influences customer prices.
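To put a rough shape on the die-size/yield point, here is a minimal sketch using the textbook Poisson yield model, Y = exp(-A * D0). The defect density `D0` is a made-up placeholder, not a real 32 nm figure, and the die areas are just the numbers quoted in this thread (140 mm² for the 2-module guess, 280 mm² for the Zambezi estimate, 346 mm² for Thuban), so only the trend means anything:

```python
import math

def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of defect-free dies under a simple Poisson model: exp(-A * D0)."""
    area_cm2 = die_area_mm2 / 100.0  # 1 cm^2 = 100 mm^2
    return math.exp(-area_cm2 * defects_per_cm2)

# Hypothetical defect density, chosen only for illustration.
D0 = 0.5  # defects per cm^2

# Die areas as quoted in the thread (estimates, not official figures).
dies = {
    "2-module quad core (guess)": 140,
    "8-core Zambezi (estimate)": 280,
    "Thuban": 346,
}

yields = {name: poisson_yield(area, D0) for name, area in dies.items()}
for name, y in yields.items():
    print(f"{name}: {dies[name]} mm^2 -> {y:.1%} defect-free")
```

Whatever value is plugged in for `D0`, the ordering stays the same: the smaller die always yields better, and fewer candidate dies fit on a wafer as area grows, which is why area relative to performance matters more than absolute area.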

Martimus said:
I have to say that the clock speeds are disappointing if it really is that small and 3.1GHz is really the top speed bin, since it is smaller than Thuban and designed to clock higher. I am sure it is just the immaturity of the process though, and we will likely see the speed ramp up relatively high over the next few years because of it.

Where did you get this 3.1 GHz? I can hardly imagine that BD will clock that low! AMD already talked quite early about 3.5+ GHz. I am still predicting 4.5 GHz (including Turbo, of course).

Martimus said:
I still take it all with a grain of salt though. I'll see how fast it really is when it is released. I am still excited to see it perform, and hope it really is released at Computex in June.

Yes, it will get interesting then, because it will likely shine in memory-intensive and FP-SSE workloads, whereas it will not look that good in pure integer or branch-intensive code.

Vesku · May 4, 2011

HW2050Plus said:
Yes, it will get interesting then, because it will likely shine in memory-intensive and FP-SSE workloads, whereas it will not look that good in pure integer or branch-intensive code.

I thought they put more focus into Bulldozer's integer performance since that's the majority of the server market. On the fpu front they are pushing heterogeneous computing, leveraging their GPU product line, for customers that are mainly concerned with fpu performance.

formulav8 · May 4, 2011

HW2050Plus said:
it will not look that good in pure integer or branch-intensive code.

But integer is one of the reasons the Core series does so well in a lot of apps compared to Phenom. Why would AMD not truly beef those units up? What is it they're banking on?
