Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

ATI has contributed less than $100M to AMD's bottom line.

AMD paid $5.4B for ATI.

Yes, it was a bad investment.

That doesn't count the chipsets, which are arguably the best on the market; they basically pushed out all the other chipset makers (nV, VIA, SiS, etc.), not through dirty tactics, they just had better hardware.

Still, this is the first time I've heard AMD accused of too much marketing.
 
The real question is, for $5 billion, could AMD have created Bobcat or Llano without ATI? (Or become a viable GPU competitor...) That I don't know...

I doubt it, seeing how many other GPU companies fell by the wayside, and Intel, who has a lot more money, still has crappy IGPs.
 
The real question is, for $5 billion, could AMD have created Bobcat or Llano without ATI? (Or become a viable GPU competitor...) That I don't know...
Considering the amount of money Intel is pouring into graphics and is still lagging behind... doubtful. If there was one forward-looking stroke of genius from AMD in the last few years, it was buying ATI, since right now absolutely everyone (Intel, Nvidia) is working on combining CPUs with GPUs, and that arguably helps if you've got some people who know what they're doing.
 
I don't think they ever expected the ATI investment to be profitable from a strictly financial (if they kept the entities separate) standpoint. ATI + AMD were (are) supposed to be worth more than the sum of their parts because of Fusion.


The real question is, for $5 billion, could AMD have created Bobcat or Llano without ATI? (Or become a viable GPU competitor...) That I don't know...


Was using $8 billion on McAfee a better investment?
Using $1 billion on AMD and $1.5 billion on Nvidia in settlements?
Using $1 billion+ because of a SATA bug on motherboards?
Hiring that will.i.am dude from the Black Eyed Peas as a marketing guy? (Soon the processors will come in a purple, stylised finish?)

Intel has odd ways of spending tons of cash, but since they own the OEMs... they make it all back.


By comparison I think AMD buying out ATI was a smart move; they knew the future of computing might lie in GPGPU and wanted to have it. That's logic I can understand. I still don't understand why Intel would buy McAfee, CPU-wise; maybe they just want to own a software firm that makes profits?

Considering the amount of money Intel is pouring into graphics and is still lagging behind... doubtful.
^This... Intel won't have its own version of, say, a 6990 for... a long time, if ever.
 
From Xbitlabs: "new dual-channel DDR3 memory controller". Do we know at this point what the max supported speed will be for the new controller?

While we don't know the OC'able max allowed by the memory controller, I think we do know it officially supports up to DDR3-1866.
 
Hey guys, I'm not the most knowledgeable, so go easy on me. 🙂 And I haven't read all 1000+ posts, so I'm not sure if this has already been talked about; my apologies if it has.

But from looking at how the cores are situated in each module, is it possible that single threaded use will get all of the resources of a module? And if that is possible, would it be possible for some kind of on-chip scheduler to assign threads 1-4 to each module as opposed to cores, that way the performance on those threads would be quite good (I presume)? Then to combat Intel's hyperthreading (AMD said they don't like the way HT is implemented and feel they can do better at the silicon level - I've read something along those lines in the past) threads 5-8 would be assigned to the idle cores in the modules, this would lower the IPC somewhat, but Bulldozer could then handle eight threads fairly elegantly.

This would presumably give Bulldozer pretty stout performance in today's games/software that are often threaded for 1-4 cores while still being able to handle up to eight threads.

Just thinking out loud.
 
Hey guys, I'm not the most knowledgeable, so go easy on me. 🙂 And I haven't read all 1000+ posts, so I'm not sure if this has already been talked about; my apologies if it has.

But from looking at how the cores are situated in each module, is it possible that single threaded use will get all of the resources of a module? And if that is possible, would it be possible for some kind of on-chip scheduler to assign threads 1-4 to each module as opposed to cores, that way the performance on those threads would be quite good (I presume)? Then to combat Intel's hyperthreading (AMD said they don't like the way HT is implemented and feel they can do better at the silicon level - I've read something along those lines in the past) threads 5-8 would be assigned to the idle cores in the modules, this would lower the IPC somewhat, but Bulldozer could then handle eight threads fairly elegantly.

This would presumably give Bulldozer pretty stout performance in today's games/software that are often threaded for 1-4 cores while still being able to handle up to eight threads.

Just thinking out loud.

Yes, that is one of the strengths of CMT when the thread-count falls below the core count.

As to the question of whether or not the thread scheduler is module-aware... there is no reason for it not to be.

Thread schedulers are already SMT-aware; they know logical from physical cores... but as we've seen in benchmarks and in-depth performance analyses, the schedulers in today's OSes do not appear to leverage this information intelligently.

For example, with an SMT system (Hyper-Threading), Windows does not load physical cores before loading logical cores; it treats them all the same, and so you get funky performance results depending on whether or not you disable HT.

Notice the yellow error bars; these represent the range of performance numbers when running the bench with that many threads. The variance comes from Windows sometimes loading logical cores on already-loaded physical cores versus randomly getting it right and loading all the physical cores and none of the logical cores:

[Image: Core i7 920 @ 4 GHz with HT, threaded benchmark results]


To be sure, the opportunity exists for thread schedulers to be more intelligent about core loading, for both SMT and CMT CPUs, but it remains to be seen whether this ever happens in practice.
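In the meantime you can enforce the "right" placement yourself with affinity masks. Here's a minimal sketch for Linux (Windows has the analogous SetThreadAffinityMask); it assumes the standard sysfs topology files, so treat it as an illustration rather than a recipe:

```python
# Minimal Linux sketch: pin the current process to one logical CPU per
# physical core, so the scheduler can't stack two busy threads onto the
# HT siblings of a single core. Assumes the standard sysfs topology
# layout; os.sched_setaffinity() is Linux-only (Python 3.3+).
import glob
import os
import re

def physical_core_representatives():
    """Return one logical CPU id per physical core, using sysfs topology."""
    seen, reps = set(), []
    for path in glob.glob(
            "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"):
        with open(path) as f:
            siblings = f.read().strip()  # e.g. "0,4"; both siblings report the same list
        if siblings not in seen:
            seen.add(siblings)
            reps.append(int(re.search(r"cpu(\d+)/", path).group(1)))
    return sorted(reps)

reps = physical_core_representatives()
os.sched_setaffinity(0, set(reps))  # 0 = this process
print("Pinned to one logical CPU per physical core:", reps)
```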
 
Got it. I thought that the threads were, or could be, scheduled on the chip somehow. I didn't realize that was an OS function. So really both AMD (at least with Bulldozer) and Intel could gain from better scheduling.
 
Got it. I thought that the threads were, or could be, scheduled on the chip somehow. I didn't realize that was an OS function. So really both AMD (at least with Bulldozer) and Intel could gain from better scheduling.

Yeah, the CPU guys could hardcode their own thread scheduler transparently to the OS, similar to how wear-leveling algorithms in today's SSDs leave the OS completely unaware of where the data physically resides on the NAND chips.

But to my knowledge the CPU guys have intentionally left thread-scheduling and thread-management up to the OS.

However, in a world of turbo-clocking and advanced power-gating, it makes sense for the hardware guys to leave less of their fate in the hands of Microsoft's update timetable and start managing resources directly.

We see it with thermal management as well as power-consumption management; I suppose we will eventually come to see it with thread management as well.
 
Was using $8 billion on McAfee a better investment?
Using $1 billion on AMD and $1.5 billion on Nvidia in settlements?
Using $1 billion+ because of a SATA bug on motherboards?
Hiring that will.i.am dude from the Black Eyed Peas as a marketing guy? (Soon the processors will come in a purple, stylised finish?)

Intel has odd ways of spending tons of cash, but since they own the OEMs... they make it all back.


By comparison I think AMD buying out ATI was a smart move; they knew the future of computing might lie in GPGPU and wanted to have it. That's logic I can understand. I still don't understand why Intel would buy McAfee, CPU-wise; maybe they just want to own a software firm that makes profits?

^This... Intel won't have its own version of, say, a 6990 for... a long time, if ever.

The difference is:
Intel has the money to burn... AMD doesn't.
 
But from looking at how the cores are situated in each module, is it possible that single threaded use will get all of the resources of a module?

Just to be clear, I want to make sure you aren't asking if it will be able to run a single thread through both INT cores simultaneously?
Because if that's what you are asking, then NO... this is not possible, and JF actually said this would slow throughput down quite a bit.

OTOH, if you are just referring to the fact that, when there is a single thread running through a module, all of its shared resources will be dedicated to that thread? Then YES... this is exactly what happens.



Also, yeah, scheduling threads is pretty much up to your OS and how efficient it is at that. I don't know if OSes will be module-aware or will simply see all the INT cores as just cores.
I believe JF has stressed a couple of times that OSes will NOT be module-aware.
But you never know; the Linux kernel could be changed so that when it sees a BD CPU it groups cores together into their respective modules and schedules tasks according to that... I am not sure, but I'm guessing it's a possibility.
MS would have to be willing to release an update for their OS to do the same.

If the above changes are made to the kernels, what's interesting is that it could go both ways. The power-management setting could be set to "max performance", in which case the module-aware OS would strive to schedule threads on separate modules. OR... there could be a "power saving" mode where the OS groups threads together on modules, so the unused modules can be power-gated. ~ This would be a very AWESOME implementation.
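To make that concrete, here is a toy Python sketch of the two policies. It assumes an 8-core/4-module part where cores (0,1) share module 0, (2,3) share module 1, and so on; that numbering is my assumption for illustration, not anything AMD has confirmed:

```python
# Toy module-aware placement: "max_performance" spreads threads one per
# module before doubling up; "power_saving" packs a module full so idle
# modules can be power-gated. Core numbering (2i, 2i+1) per module is an
# assumed layout.
CORES_PER_MODULE = 2
NUM_MODULES = 4

def assign(threads, mode):
    """Map thread index -> core id under a module-aware policy."""
    if mode == "max_performance":
        # Spread: touch every module once before using any second core.
        order = [m * CORES_PER_MODULE + c
                 for c in range(CORES_PER_MODULE)
                 for m in range(NUM_MODULES)]
    else:  # "power_saving"
        # Pack: fill a module completely before waking the next one.
        order = list(range(NUM_MODULES * CORES_PER_MODULE))
    return {t: order[t % len(order)] for t in range(threads)}

print(assign(4, "max_performance"))  # {0: 0, 1: 2, 2: 4, 3: 6} - one per module
print(assign(4, "power_saving"))     # {0: 0, 1: 1, 2: 2, 3: 3} - two modules gated
```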


Also, while thread scheduling is mainly up to the OS, AMD has designed the front end of the processor to hopefully be very smart when it comes to dealing with those threads.
The FE is shared, but every major area of the FE is decoupled from the next. Meaning, from what I have been reading, different parts of the FE can work on different threads, regardless of what any other part of the FE is working on.
 
Yeah, I meant the shared resources, not the cores. With my PhII, if I run something that is single-threaded, the L3 cache that is shared by all cores can be used by that single-threaded app. I didn't know if AMD had a clever way of forcing multi-threaded, up to quad-threaded (not sure if that's the correct terminology), apps to use the unused modules before using the second core of a module. I also do remember JFAMD mentioning that the OS will not be module-aware, but I wasn't sure if somehow the chip itself would be able to put a bias on every other core. IDC explained pretty well why the OS will not be able to do this, and from the sounds of it AMD hasn't hinted at the chip being able to work any such magic either. But I was curious if that was possible or likely; at least I learned something today. 🙂
 
For example, with an SMT system (Hyper-Threading), Windows does not load physical cores before loading logical cores; it treats them all the same, and so you get funky performance results depending on whether or not you disable HT.
Strange, was that test run with Vista or 7? Because 7 should handle physical and logical cores differently and can distinguish between them. You can use that information yourself if you dig a bit into WMI, so I find the result surprising if it was run on Win7.
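For reference, the physical-vs-logical distinction meant here is exposed through WMI's Win32_Processor class. A quick sketch, shelling out to the stock wmic tool from Python (just one way to read it; any WMI client works):

```python
# Read physical vs. logical core counts from WMI (Win32_Processor) by
# shelling out to the stock wmic tool on Windows.
import subprocess

out = subprocess.check_output(
    ["wmic", "cpu", "get",
     "NumberOfCores,NumberOfLogicalProcessors", "/format:list"],
    text=True)
info = dict(tok.split("=") for tok in out.split() if "=" in tok)
print("Physical cores:    ", info["NumberOfCores"])
print("Logical processors:", info["NumberOfLogicalProcessors"])
# On an HT-enabled i7 this shows e.g. 4 physical / 8 logical -- the same
# distinction Win7's scheduler can see.
```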
 
Yeah, I meant the shared resources, not the cores. With my PhII, if I run something that is single-threaded, the L3 cache that is shared by all cores can be used by that single-threaded app. I didn't know if AMD had a clever way of forcing multi-threaded, up to quad-threaded (not sure if that's the correct terminology), apps to use the unused modules before using the second core of a module. I also do remember JFAMD mentioning that the OS will not be module-aware, but I wasn't sure if somehow the chip itself would be able to put a bias on every other core. IDC explained pretty well why the OS will not be able to do this, and from the sounds of it AMD hasn't hinted at the chip being able to work any such magic either. But I was curious if that was possible or likely; at least I learned something today. 🙂
That is not necessarily the best solution for a turbo-active CPU.

Even though you get all the resources of the module for one thread, you can gain more throughput by using one module for two threads and clocking higher.
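A rough worked example, with made-up numbers: call one core at base clock 1.0 unit of work, and take AMD's claim that the second core in a module adds roughly 80%, so a fully loaded module is worth ~1.8 units. Spreading two threads across two modules at base clock gives 2.0 units; but if packing them onto one module lets the chip gate the idle modules and turbo, say, 20% higher, the packed module delivers 1.8 × 1.2 ≈ 2.16 units. Whether the real turbo bins are big enough to win that trade is the open question.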
 
To be sure, the opportunity exists for thread schedulers to be more intelligent about core loading, for both SMT and CMT CPUs, but it remains to be seen whether this ever happens in practice.
It's a question of what you want. If the scheduler loads all modules first, that is best for performance. If it fills modules core by core, that is best for power consumption (because whole modules can be switched off). It would be nice if the user could configure this.
 
Dresdenboy has found the following... VERY interesting:

http://www.planet3dnow.de/vbulletin/showpost.php?p=4378267&postcount=365

Significant Upgrade in June 2011
– 232 AMD 2.3 GHz 16-core Opteron Interlagos processors
– 3,712 compute cores, 116 32-core nodes
– 7.4 TB DDR3 memory, 64 GB/node, 2.0 GB/core

In June 2011, a 720-teraflop Cray XE6 system will be added to Gaea. It will employ the next-generation AMD Interlagos 16-core processor. After the installation of that second system, the original 260-teraflop system will be upgraded with the same AMD Interlagos processor to achieve 386 teraflops.
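(Those figures hang together, for what it's worth: 232 CPUs × 16 cores = 3,712 cores, which matches 116 nodes × 32 cores, and 3,712 cores × 2.0 GB/core ≈ 7.4 TB.)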

So it seems desktop Bulldozer will be released before June 😉

[Image: 13063llano3.jpg, AMD production roadmap slide]
 
Last edited:
WTF? Can someone tell me why, instead of seeing images, I keep getting a stupid yellow frog in an ice cube with "Domain Unregistered. To view, register at: bit.ly/imageshack-domain"?

Because you are in Namibia... 😛

The image says BD's initial production is going to begin in Apr'11 and Llano's in Jun'11.

BD's release date has at least been a little consistent... when compared to Llano's.
 
I read that some guys on SA looked at the die shots and, by using the 30-something mm² statement AMD made, guessed the 8-core Bulldozer to be around 294 mm².

So the 8-core will be rather big compared to, say, Intel's Sandy Bridge, which is only 216 mm² or so.


AMD Thuban's die size is ~346 mm², and that is for the six-core Phenom II X6 CPU.

Going from a 346 mm² 6-core to Bulldozer's ~294 mm² 8-core doesn't sound so bad, but it's still huge compared to the 216 mm² Sandy Bridge, so I'm hoping they deliver some nice performance.
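Quick sanity check on those numbers: if a module really is ~31 mm² (the "30-something" figure), four modules come to ~124 mm², leaving roughly 170 mm² of the estimated 294 mm² for the L3, memory controller, and the rest of the uncore. And 294/216 ≈ 1.36, so about a third more area than the quad-core Sandy Bridge, with both chips on 32nm.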
 
With the big difference being an extra 8 MB of cache, I would expect AMD to release a future version with either reduced L2/L3 (8 MB total) or no L3 at all; call it an Athlon III? That should be much closer in size to a 4-core 32nm SB.
 