Some Bulldozer and Bobcat articles have sprung up

Scali · Aug 27, 2010

bryanW1995 said:
if jvroig and idontcare are so knowledgeable, then why aren't they castigating amd the same way you are?

Because we have different personalities.
They understand where I'm coming from and respect my choices and actions though.

bryanW1995 said:
Please quit derailing the thread and stay on topic.

You know, if you and the rest of the AMD camp wouldn't be so touchy all the time, and wouldn't have to reply (en masse) to everything I post, we wouldn't be having this situation in the first place.

I don't approve of any kind of company employees taking part in forum discussions in the first place. If they want to relay information, they'll just have to use the proper channels: put up some press releases on your own websites, send some presskits around, perhaps have some conference calls etc...
But if you come here as some kind of PR spindoctor and try to have a discussion with engineers such as myself, you're in trouble. We don't just buy what you say, we can make our own minds up with the information presented, combined with our own knowledge and experience. And when you try to argue, you're at a disadvantage, because you're the PR guy, we're the techies. You need to go back and consult with your engineers first, and try to figure out what we're saying, and what your answer should be. Not a situation you'd want to be in.

Idontcare · Aug 27, 2010

heyheybooboo said:
My understanding is the 'Ontario' platform is in the 18w range with a BGA only socket. There have been some articles surrounding this in the last few months.

That said, it would not prevent someone like nV from packaging a chipset and 'rolling their own' with a Bobcat APU.

BUT the issue is the integrated- integrated GPU (did that make sense? - LOL) on Bobcat. What more could nV bring to the table by providing their own graphics power to the platform if AMD uses a cut-down, low-power Radeon 5xxx on the die (with DX11, xxx shaders, combined with TrueHD and DTS-HD, UVD, etc. ...) ?

The additional graphics from nV would drive up the wattage and that defeats the purpose of the low-power platform, right?

I think you just cut right to the quick, sir.

With Intel's approach to market release of Atom they left the door open for someone with a better performance/watt solution to come in and take care of the rest of the stuff in the system that wasn't atom. The chipset being built on ancient 130nm node tech, the GPU being an Intel GPU, etc.

So Intel, by virtue of their own plans and strategy, created the opportunity for Nvidia to have a compelling value-add story for the market.

AMD has set themselves up to basically gobble up all the value-add opportunities for themselves by virtue of Bobcat really being SoC. What's left? Maybe a supporting chipset to handle south-bridge type things like USB and so on? Unless they build that thing on 1um process tech it isn't going to represent a huge power-consumer so no real opportunity for second-party solution to make a value-add story for the market.

Also - can anyone clarify the deal with Bobcat versus Ontario? Bobcat is just the CPU portion of the Ontario chip, right? So we wouldn't call Bobcat the APU because Bobcat does not include the integrated GPU aspects of the SoC...so we would talk about the compute performance of Bobcat if we are talking x86 stuff but if we want to talk about the product, all that the monolithic chip itself brings to the table, are we supposed to refer to that as Ontario? I'm quite confused at this stage.

Scali · Aug 27, 2010

Cogman said:
I think most here really want to see AMD come out with some spectacular processor that completely thrashes all of intel's offerings. We realize that strong competition is good for the consumer. Alas, we also realize that the likelihood of that is slim to none.

I think the problem is that some people here want it a bit TOO badly, and get carried away.

Martimus · Aug 27, 2010

Idontcare said:
Also - can anyone clarify the deal with Bobcat versus Ontario? Bobcat is just the CPU portion of the Ontario chip, right? So we wouldn't call Bobcat the APU because Bobcat does not include the integrated GPU aspects of the SoC...so we would talk about the compute performance of Bobcat if we are talking x86 stuff but if we want to talk about the product, all that the monolithic chip itself brings to the table, are we supposed to refer to that as Ontario? I'm quite confused at this stage.

That is the way I understand it. Ontario is the platform, while Bobcat is the CPU core architecture.

I wonder if AMD will come out with a second generation Bobcat that is optimized by hand, instead of being designed completely by a program as Bobcat is said to be? It is that second part that keeps me from being too excited about this release, as I wonder how truly efficient the design can be.

Idontcare · Aug 27, 2010

[Guys, I'm not posting as a mod, I feel the need to say this because I don't want my comments blown out of proportion or taken as having any more weight of opinion than those of my fellow posters here.]

Can we just take a bit of a breather here? We are an enthusiastic bunch, all of us, we clearly prioritize spending a non-zero percentage of our daily lives reading and writing to each other in these forums.

Being here clearly ranks as a priority for each of us, we are doing this at the expense of not doing other things in our lives...presumably because being here provides us with a level of enjoyment and a degree of satisfaction that other hobbies simply fail to provide.

If that is not the case then one should definitely re-evaluate why they are bothering to come here, we don't want people feeling miserable about themselves or about others, misery != fun.

So let's not overlook our commonalities, the fact that the very reason each of us finds ourselves here in this thread posting to one another is because we all share something in common. That deserves respect in its own right.

Scali said:
There's plenty of other companies you know.
I'm a bit of a hard-core guy myself, so the DEC Alpha has always been high on my list.

OMG you had to bring up DEC didn't you

I had the closest thing to what one could call an engineer-crush on DEC Alphas for the better part of a decade. I was literally in love with that technology.

Other kids pinched pennies and saved for beer and pizzas and girls and I pinched pennies and saved to buy a 21164PC.

Idontcare · Aug 27, 2010

Martimus said:
That is the way I understand it. Ontario is the platform, while Bobcat is the CPU core architecture.

I wonder if AMD will come out with a second generation Bobcat that is optimized by hand, instead of being designed completely by a program as Bobcat is said to be? It is that second part that keeps me from being too excited about this release, as I wonder how truly efficient the design can be.

OK, I'm just going to throw this one out there as a completely wild-ass, no valid reasoning whatsoever, theory/forecast: what if bobcat, the architecture, turns out to be AMD's Banias?

Not trying to make the parallels that Bulldozer becomes AMD's netburst so they need bobcat to save them come 16nm...but look at the big big big picture and scale out the timeline to 3 nodes from now (well 3 nodes from when BD is introduced, so 32nm -> 22nm -> 16nm)...

Unlike Atom, Bobcat has all the fancy shmancy (yes that is a technical engineering term) microarchitectural features of a very modern very advanced processor. OOO, register rename, yaddi yadda.

So go out a few nodes and ask yourself, when power-consumption becomes all the more problematic for AMD (just as 90nm and 65nm did for Intel) would you really be at all surprised if someone in the executive team reaches down into the low-power design group and says "hey, you think you could take that already super-low power processor and maybe scale it up just a tad?".

To me it is handwriting on the wall. Could be I just have too much hope and I want to believe just a little too much (just kidding with you Scali, you made a good point up there, but don't forget that not everyone wants to live in a fantasy-free world, sometimes it can be fun to dream, don't feel like it's your job to bring every dreamer here back to reality in the forum, please let a few of us be blissfully ignorant every now and then)

Scali · Aug 27, 2010

Idontcare said:
OMG you had to bring up DEC didn't you

See... now there's someone who understands where I'm coming from.
All you people who claimed I didn't know what I'm talking about, and are now searching Wikipedia on what a DEC Alpha is: shame on you!

Idontcare said:
To me it is handwriting on the wall. Could be I just have too much hope and I want to believe just a little too much (just kidding with you Scali

Well, let the record show that I didn't say anything about Bobcat at all.
That's because I pretty much agree with what you and various others have said: Atom isn't all that great, and it leaves some 'low-hanging fruit' like ooo.
Atom also needs nVidia's Ion to really be ready for things like HD playback... so if AMD can fit a nice GPU on the die as well, there's some great opportunities for multimedia netbooks and things like that.
I know I looked at Atom for my home server, and decided against it. I chose a regular Pentium DualCore instead, simply because the power/performance ratio was much better. The performance of Atom just isn't very good... and those crappy powerhungry 945 chipsets that you find on most boards/barebones don't exactly help in the power consumption department.

jvroig · Aug 27, 2010

With Idontcare back in the thread injecting a fair bit of logic, common sense and goodwill, the signal-to-noise ratio seems to go back up, and perhaps there is hope for this thread yet. After all, it is only 200+ posts so far. Last time we talked about Bulldozer, the thread ended at nearly 500 posts. It was a damn good informative thread. Maybe this one will be yet.

Scali said:
Well, let the record show that I didn't say anything about Bobcat at all.

Bobcat / Ontario is probably the one thing here everybody can agree on. Bobcat is very promising, and there seems to be little need for much marketing brouhaha over it. I consider it a good sign. 5W OoO CPU, and not saddled with a chipset whose graphics capabilities are overtaken by a laser printer? (a joke I picked up here, don't remember where) Please, sign me up. I'll gladly throw away my current 10" netbook and get a slightly bigger one (11.6"), powered by Ontario.

extra · Aug 27, 2010

Scali said:
and then I demand that he quits his job, or that AMD fires him, for deliberate false advertisement of AMD's products (which is against the law in most countries).

Can you please take a breather from this thread? It was very enjoyable before your rants. Thanks.

Yeah, Idontcare...I had a 21164 back in the 90's as well...that thing was great. Was 533mhz, back when the top x86 was what, 266 or 300 or something. Ran Windows NT and Linux...man, Linux really hauled on that thing. And the emulator that let you run x86 apps on it, and then every time you ran them they got faster--I was actually able to play some games on it (although they were a bit choppy). I seem to remember playing half life on it actually, but maybe I just did too many drugs. I miss that thing! I always wanted a 21264 but never saw one. I'd love to get my hands on an old Alpha chip just to have as a talking piece, even if it didn't work.

Anyway...

DrMrLordX · Aug 27, 2010

Idontcare said:
OMG you had to bring up DEC didn't you

I had the closest thing to what one could call an engineer-crush on DEC Alphas for the better part of a decade. I was literally in love with that technology.

Other kids pinched pennies and saved for beer and pizzas and girls and I pinched pennies and saved to buy a 21164PC.

I almost had an Alpha. Then I took a look at the platform cost and the NT4 license and . . . nyarr. Got a k6-233 instead and wasted that money on the storage subsystem (UW SCSI, aw yeah) and RAM (128 megs SDRAM, aw yeah) instead.

In retrospect, maybe I should have gone with the Alpha, though I did really dig that k6.

Idontcare said:
OK, I'm just going to throw this one out there as a completely wild-ass, no valid reasoning whatsoever, theory/forecast: what if bobcat, the architecture, turns out to be AMD's Banias?

An interesting thought, but from what I've read, Bobcat is looking an awful lot like a cut-down Stars processor. If AMD thought they could compete with Sandy Bridge et al by taking K10.5, moving it to 32nm, widening the execution unit and tacking on the prefetcher they are using for bulldozer, I suspect they'd be doing it already. If you think about it, that's essentially what would happen if AMD goes "full circle", realizes that bulldozer is too exotic/too buggy to really challenge Intel in real-world scenarios, and then looks to Bobcat to save the day; some AMD guy would say to himself, "Gee, if only we could retire more instructions per clock with this low-power k10 chip we're pimping, maybe we could do something with it if we can get the clockspeed up high enough and cram enough cores onto one die/package like we did with Magny-Cours".

Who knows, maybe they wouldn't even have to do that. If Bobcat has a small enough die size, maybe they could go mega-multicore with it and introduce their own speculative threading technology to better leverage all those cores. But that's just out in left field, that is.

Martimus · Aug 27, 2010

Idontcare said:
OK, I'm just going to throw this one out there as a completely wild-ass, no valid reasoning whatsoever, theory/forecast: what if bobcat, the architecture, turns out to be AMD's Banias?

Not trying to make the parallels that Bulldozer becomes AMD's netburst so they need bobcat to save them come 16nm...but look at the big big big picture and scale out the timeline to 3 nodes from now (well 3 nodes from when BD is introduced, so 32nm -> 22nm -> 16nm)...

Unlike Atom, Bobcat has all the fancy shmancy (yes that is a technical engineering term) microarchitectural features of a very modern very advanced processor. OOO, register rename, yaddi yadda.

So go out a few nodes and ask yourself, when power-consumption becomes all the more problematic for AMD (just as 90nm and 65nm did for Intel) would you really be at all surprised if someone in the executive team reaches down into the low-power design group and says "hey, you think you could take that already super-low power processor and maybe scale it up just a tad?".

To me it is handwriting on the wall. Could be I just have too much hope and I want to believe just a little too much (just kidding with you Scali, you made a good point up there, but don't forget that not everyone wants to live in a fantasy-free world, sometimes it can be fun to dream, don't feel like it's your job to bring every dreamer here back to reality in the forum, please let a few of us be blissfully ignorant every now and then)

I can see the parallels here, and not just on the Banias side. I also see the parallels between Bulldozer and Netburst. Bulldozer has a longer pipeline, with a more aggressive prefetcher and branch prediction logic, things that Intel really pushed with Netburst. Even if the architecture fails, it will give AMD the experience needed to improve their branch prediction in future architectures, so it won't be a worthless experience for them.

One thing I don't understand is why Intel seems to have both a faster cache system and a more compact cache system than AMD. (I am bringing this up, as this is the other major advantage Intel has, which I had attributed to the fact that they needed better cache to compete with AMD's IMC, but thinking about it now, AMD should have been able to catch up by now.) Nehalem cache is faster (Nehalem at 2.66GHz: L1 Cache 4 cycles, L2 Cache 11 cycles, L3 Cache 39 cycles; AMD Phenom II X4 920 at 2.80GHz: L1 Cache 3 cycles, L2 Cache 15 cycles, L3 Cache unknown), and yet 24% smaller than Shanghai cache. (I found here that L3 Cache measured to 38 cycles on Deneb, which is nearly equal to Nehalem, yet it is far less dense.) I have no experience in how cache works, but I wonder why Intel seems to be able to pack so much more cache into the same area, even at approximately the same speed. There has to be something I am missing here.
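
Since the two chips run at different clocks, a rough way to put those cycle counts on the same footing is to convert them to nanoseconds. This is only a minimal sketch, assuming the quoted figures are core clock cycles at the listed frequencies; the function and variable names are made up for illustration, and the unknown Phenom II L3 figure is left as unknown rather than guessed.

```python
# Convert the cache latencies quoted above from clock cycles to nanoseconds.
# Assumption: the cycle counts are core clock cycles at the listed frequency.

def cycles_to_ns(cycles, clock_ghz):
    """One cycle at N GHz lasts 1/N nanoseconds."""
    return cycles / clock_ghz

# Figures copied from the post; None marks the latency listed as unknown.
latencies = {
    "Nehalem (2.66GHz)": (2.66, {"L1": 4, "L2": 11, "L3": 39}),
    "Phenom II X4 920 (2.80GHz)": (2.80, {"L1": 3, "L2": 15, "L3": None}),
}

def latency_table(data):
    """Return {chip: {level: latency in ns, or None if unknown}}."""
    table = {}
    for chip, (ghz, levels) in data.items():
        table[chip] = {
            level: None if cycles is None else round(cycles_to_ns(cycles, ghz), 2)
            for level, cycles in levels.items()
        }
    return table

for chip, row in latency_table(latencies).items():
    print(chip, row)
```

Nothing here explains the density gap, it only makes the speed comparison a little less dependent on clock speed.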

Back to Bobcat, I can easily see how AMD's development on that core will affect development on their other architectures, if for no other reason than it will force them to find solutions to problems they previously could put off, since the problem wasn't as important on the Server/desktop market. They can then use these solutions in future server/desktop architectures, even if they don't derive the overall architecture from Bobcat.

heyheybooboo · Aug 27, 2010

Idontcare said:
....
Also - can anyone clarify the deal with Bobcat versus Ontario? Bobcat is just the CPU portion of the Ontario chip, right? So we wouldn't call Bobcat the APU because Bobcat does not include the integrated GPU aspects of the SoC...so we would talk about the compute performance of Bobcat if we are talking x86 stuff but if we want to talk about the product, all that the monolithic chip itself brings to the table, are we supposed to refer to that as Ontario? I'm quite confused at this stage.

Shoot, I got no clue ... we need a flash card with all the players.

I think it would be correct as Bobcat cores (as the overall arch being Fusion APUs ??).

The 'Llano' platform is yer basic notebooks/entry-level desktops --- the 'Ontario' platform is the low-power ultra-portable/netbook action.

BUT (you knew that was coming, right?) ...

The individual platforms are constantly referenced as the Llano APUs or Ontario APUs. And wouldn't you know it? Bobcat APUs

Idontcare said:
OK, I'm just going to throw this one out there as a completely wild-ass, no valid reasoning whatsoever, theory/forecast: what if bobcat, the architecture, turns out to be AMD's Banias?
...

Gawd forbid it happens but it looks doubtful from what has been reported so far (and discounting the rantings and ravings herein).

AMD for the most part has delivered 'on-time and spec' since B2 Barcelona and the TLB.

The sb8xx may have been bumped a bit but I suspect that was as much market driven as any thing else. We are all still waiting for 'real' USB3/SATA 6Gb/s product to hit the shelves.

Where AMD might suck (salmonella?) eggs is Zambezi ...

If posting this 'roadmap' the end of last year, and then 'back-tracking' by saying, "Uhhh, well, Zambezi will only work in AM3+, not AM3 as was posted on our Roadmap ... " nine months later, they gots some 'splaining to do.

And they should come right out and say it. Now. Publicly. In a major way. You hear us, AMD?

Where's that !*&%$?! sum beach JF? Let's fire his arse! :biggrin:

--

Idontcare · Aug 27, 2010

Martimus said:
One thing I don't understand is why Intel seems to have both a faster cache system and a more compact cache system than AMD. (I am bringing this up, as this is the other major advantage Intel has, which I had attributed to the fact that they needed better cache to compete with AMD's IMC, but thinking about it now, AMD should have been able to catch up by now.) Nehalem cache is faster (...snip...), and yet 24% smaller than Shanghai cache. (I found here that L3 Cache measured to 38 cycles on Deneb, which is nearly equal to Nehalem, yet it is far less dense.) I have no experience in how cache works, but I wonder why Intel seems to be able to pack so much more cache into the same area, even at approximately the same speed. There has to be something I am missing here.

two things - (1) process technology, and (2) sram cell design (including DFM - design for manufacturability).

For Intel their process tech simply enables a level of performance that AMD can't beat because AMD's sram design team doesn't have access to Intel's process tech.

The second part is more tricky - designing sram for manufacturability. It's not simply about being the smallest or the fastest, you have additional design constraints relating to tolerance for voltage shifts, yield entitlement, etc.

Sram design is also an iterative optimization process. That means resources determine how far down the optimization path you can go. Give me $100m to design an sram cell for 22nm node and you'll get a far higher performing sram design than if you give me $25m and tell me to attempt to accomplish the same.

Having been intimately involved in this exact aspect of the industry I can tell you from a professional perspective I do not marvel at what Intel achieves, I marvel at what the others manage to accomplish with their vastly reduced budgets and resources.

And I have worked side-by-side with a lot of ex-Intel engineers and the functional reality of this resource gap really can't be overstated.

Give me $500B and set the clock back to 1960 and you shouldn't be surprised that I put a man on the moon for you within a decade. Give me $50B and ask the same, and you should be pleasantly surprised if I manage to pull it off without racking up a sizable body-count.

Intel gets my respect for the degree of organization and sophistication with which they apply the scientific method as a business method when it comes to pathfinding and research allocation. It can't be touched.

Idontcare · Aug 27, 2010

heyheybooboo said:
Where's that !*&%$?! sum beach JF? Let's fire his arse! :biggrin:
--

LOL! you just made my day!

I'm logging out and calling it a day, time to destroy some margaritas, friday was just /'ed by HHBB and it's only 1:30.

bryanW1995 · Aug 27, 2010

scali said:
There's plenty of other companies you know.
I'm a bit of a hard-core guy myself, so the DEC Alpha has always been high on my list.

a buddy of mine had an alpha processor back in the day. he was in high school, this cpu represented a significant portion of his yearly income. when he was building his system, he dropped it on the floor and was never able to get it to work. too bad he didn't drink back in those days...

jvroig · Aug 27, 2010

AtenRa said:
In a Phenom II design (Deneb) you need the front end, execution unit, L2 etc. In Bulldozer you only need the Integer Core (12% more in the Module).

First mistake, If you add 4 Deneb cores in a Phenom II Quad CPU you don’t get an 8 Core Bulldozer CPU but an 8 Core Deneb CPU. (8 Integer Execution Units + 8 FP Execution Units)

If you add 4 Deneb cores to a Quad Core Phenom II then you add 50% more die because you take 4 times the Front End and the L2 Cache. In Bulldozer you don’t add the front end, nor the L2, so you don’t add 50%. Got it ???

Hope I was able to help

Alrighty, let's give this another shot. This isn't just for you, in case you start feeling I'm singling you out, but I'm quoting you since you've quoted me

[Warning for all - long wall of text incoming. SKIP if no interest in the Bulldozer 5% issue]

To set the context, let's have an example everybody here understands - broadband internet.

Say you've got a 2mbps line, and a carrier is trying to sell you a 3mbps line. Sounds good. They give you a presentation, and their pamphlet or marketing slides or website tell you "2.5mbps download speed, .5mbps upload". Sounds good still. That totally kicks the ass of your current 2mbps line, right?

So, are you gonna bite? Of course not yet. They are quoting a useless metric, most likely, because all they advertise is the burst speed, and in real life you will rarely get that much. What you need is the committed information rate.

So what you do is compare the CIR of your current 2mbps line against the CIR of the new offering, and if you discover your current line has the much better CIR, then "upgrading" is obviously uncalled for.

Of course, consumers who are unaware of such details will be puzzled when you tell them you chose a 2mbps line from provider A instead of a 3mbps line from provider B. They have no idea what burst speed and CIR are. All they see are what's in the marketing materials - 2mbps VS 3mbps. To be clear, nobody is lying to anybody - it's just that the new provider (offering 3mbps) isn't really telling you the whole picture, and instead quoting big numbers (that are factual, but pretty much meaningless) to sell you something. It just so happened that maybe, the numbers not quoted are actually not in your favor.
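
To make the burst-vs-CIR point concrete, here is a small sketch. Only the advertised 2mbps and 3mbps figures come from the example above; the CIR values are invented purely for illustration, as are the provider labels and the `best_plan` helper.

```python
# Two plans compared on two different metrics.
# Assumption: the CIR numbers are hypothetical, the post does not give any.
plans = {
    "provider A": {"burst_mbps": 2.0, "cir_mbps": 1.5},  # CIR is made up
    "provider B": {"burst_mbps": 3.0, "cir_mbps": 1.0},  # CIR is made up
}

def best_plan(plans, metric):
    """Return the name of the plan with the highest value for the given metric."""
    return max(plans, key=lambda name: plans[name][metric])

# The marketing slide and the informed buyer reach different conclusions.
print("Best by advertised burst speed:", best_plan(plans, "burst_mbps"))
print("Best by committed information rate:", best_plan(plans, "cir_mbps"))
```

Same facts, different winner, depending on which number you were shown.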

That is what's happening here when the 5% - 12% - 50% figure issue exploded. I don't see it as malice on anyone's part that they insist on 5, 12, or 50. Rather, it's a difference in the understanding of the topic based on someone's depth of knowledge on the issue - much like those aware of CIR and burst speed reaching a different conclusion (or, even if the conclusion was the same, the thought process would have been different) from those consumers who just don't really know the intricacies of broadband connections.

Now that the context is settled, let's get this out of the way:

khon said:
An 8-core BD could well be smaller than a 4-core SB, since adding the four secondary cores apparently only increases size by 5%

This is pretty much what started it as far as I was involved with in the discussion, and there is more here than meets the eye, none of which is the fault of khon (I wanted to make this clear, so that no ill-feelings will be harbored by khon).

Khon and, I am sure, a lot of other people saw the presentation, saw the quoted figures, and now have the idea planted in their mind that goes something like this: "Wow! More cores for just 5% more space!". Let's study the quote from Khon again (I can quote others as well, from here and other forums, but let's stick to what we've already got):

khon said:
An 8-core BD could well be smaller than a 4-core SB, since adding the four secondary cores apparently only increases size by 5%

Emphasis mine, of course. So AMD made a presentation, without lying (facts presented are what we call "true facts"), but in such a way that would make a totally different impression. AMD makes a statement: int cores are just 5% of the die size (true), module wise the int cores themselves take about 12%, no biggie. And now, after that, people are: Wow! What a breakthrough in design and engineering! Extra cores only take 5%! Damn, this is gonna be so much smaller than what they would have accomplished in a Deneb-shrink, or what Intel is going to get with SB!

See, AMD actually never mentioned any of that, but it was sort of implied, right? They didn't claim anything of the sort (only more perf/watt or perf/size or some other metric that has ALWAYS been touted - happened in Shanghai, happened in Istanbul, etc), but most people reading it just interpret it a wee bit differently, in a more favorable light.

Was AMD lying to us? Not really. Because 5% IS the size of an int core, relative to the whole die. Of course, this is just an int core, not really "one whole CPU core" the way we understand CPU cores today (which is not how it has always been, and probably not how it will always be - for the moment, if you do not understand that or why, let's leave it be and get back to it later). So, is this "5% int core" really amazing?

Let's look at Gulftown:

This is Gulftown's die map, naturally. Now, let's pretend I am a janitor for Intel, and on the day of Hot Chips, the guy supposed to make the Intel Gulftown presentation got a case of the runs. Hey, it happens. So, being the only good-looking janitor he's ever seen in Santa Clara, he asks me to make a presentation for him, to highlight how awesome our design, engineering, and industry-leading process tech is. Naturally, since I would rather not be cleaning up the clogged lavatories again, I accept the challenge.

So in the presentation, I establish how awesome our design is (note: from here on, I will mimic the AMD "style", but I am in no way mocking AMD - it is simply a necessity for educational purposes). And I start with the die map above, and then follow it up with the next 2 pictures - the same thing but with overlays:

"Hi guys, welcome to Hot Chips, and here I'd like to show you something that's really gonna be a hot chip! This is gulftown, and this will boggle your mind! From the die map earlier, here I've traced out one core, so we can see just how awesome our design is. The red rectangle is exactly the size of a single core. Let's see what happens if we overlay more of those rectangles across most of the die - next slide:"

"I've alternated the colors so the overlay blocks are easier to distinguish from each other. As you can see, we already have 10 of these blocks, and it is not quite the entire die yet! We can estimate then that the cores only take up about 6-7% of the die size! Once again, Intel's industry-leading process tech has made magic! This is totally why we are completely innocent of that whole Dell brouhaha, I mean come on, if we are this awesome, why would we bother paying off the bastards at Dell, amirite?"

Ok, end the conference, back to the real world.
Let's review the "AMD computation":

JFAMD said:
Here is the math. Start with a full die. Remove 1 integer core from each module and 1 integer schedule (everything else stays where it is). Measure the die size. Your new number is 95% of the total die.

Ok, let's do that for Gulftown, based on the pretend presentation I made earlier:
Here is the math. Start with a full Gulftown die. Remove 1 whole core, like how I shaded it in the presentation (everything else stays where it is). Measure the die size. Your new number is 93-94% of the total die.

See what I did there? (And note that a Gulftown core even has its own FPU, it's not just an "int core") Like AMD several months ago, AMD at the conference, and AMD now through JFAMD, I am also not lying. But it just doesn't mean anything, and it certainly doesn't mean that Intel only needs 7% die area to keep on adding more cores. What it does do, however, is imply awesomeness, design ingenuity, breath-taking innovation, and makes you think something like this (obviously, since we are only concerned about % of die sizes, let's ignore one is 32nm while one is 45nm):

adapting a quote from khon said:
An 8-core Gulftown could well be smaller than a 6-core Thuban, since adding the cores apparently only increases size by 6-7%

But it is certainly not true, because you just don't add cores to gulftown, right? There are other parts that are not the core (which is where "uncore" came from), and increasing cores but not the uncore means performance (per core) will not remain the same (which is a bad thing), even though it may increase multi-threaded throughput as more cores are involved.
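
The Gulftown version of the "remove a core" arithmetic can be sketched in a few lines. The 6-7% share is the eyeballed estimate from the pretend presentation, not a measured figure, and both function names are hypothetical. The second function is deliberately the misleading extrapolation: it assumes cores can be added without touching the uncore, which is exactly the assumption that does not hold.

```python
# "Start with a full die, remove one core, measure what is left."
# Assumption: one Gulftown core is roughly 6-7% of the die, per the overlay estimate.

def die_left_after_removing(core_share_pct):
    """Percent of the original die remaining once one core is removed."""
    return 100.0 - core_share_pct

def naive_growth_pct(extra_cores, core_share_pct):
    """Misleading extrapolation: die growth if cores could be added with no extra uncore."""
    return extra_cores * core_share_pct

for share in (6.0, 7.0):
    print(f"core share {share}% -> die left after removal: {die_left_after_removing(share)}%")

# Two extra cores "only" cost this much on paper, which ignores the uncore entirely.
print("naive growth for 2 extra cores at 6.5% each:", naive_growth_pct(2, 6.5), "%")
```

The numbers are all true and the arithmetic is all correct, and it still tells you nothing about what an 8-core Gulftown would actually cost in area.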

I suppose by now we should get back to Bulldozer, to end it all. And let's get back to the quote from Aten Ra:

AtenRa said:
In a Phenom II design (Deneb) you need the front end, execution unit, L2 etc. In Bulldozer you only need the Integer Core (12% more in the Module).

See, this is where it gets muddy. You only need "12%" more, because AMD did the fuzzy math thing that I emulated a while ago (start with a full die, remove a core, etc)

The purpose of "pop quiz #1" was to establish that FROM A SINGLE CORE, making it a dual core means doubling everything, because it isn't just an int core you need. But when a "Bulldozer" is presented, and then the "lesser silicon" idea is hammered, it is easy to forget this, especially if you aren't really a CPU guy.

Let's look at it this way: if there was only a single core, would AMD have "invented" the module? Obviously not, right? In fact, the very purpose of the module is so that a monolithic dual core comes into existence. So "removing one int core" is actually pretty senseless - because if you remove the int core, then you also remove a lot of circuitry that is otherwise wasted. Why would you need a complex FPU that can serve two cores? Nope, don't need it, stick a normal FPU on it. Same with everything else - all are big enough and designed to be complex such as to allow servicing two int cores, so they all gotta go too.

So you can't really say: "Bulldozer, you only need to add an int core", because that's actually not true. They started with two full cores (this is probably the 6th time I have to repeat it, but right now I actually don't mind at all), die size is 100% for this normal dual core. They got the idea for "multi-threaded" efficiency, so from a normal dual core, they tweak the design until they fused the two cores together (and in the process, there was more or less a radical change in the design, such that now there are naked int cores), sharing resources such that they ended up reducing the size of the dual core. If the dual core was 100%, now they have ~75%, which means it only took them 50% of what would have been needed to go from single core to dual core. To illustrate:

1 core = 100% size
2 cores = 200% size (normal dual core)
2 BD-style cores (i.e., 1 "module") = 150%
Total space savings thanks to going the "module" design route: 50% (of a single core's size)
Therefore, cost of an additional BD core: only 50%, compared to 100% for the normal way of adding cores.
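
For anyone who prefers to see the arithmetic spelled out, here is a minimal Python sketch of the same numbers. The names (SINGLE_CORE, NORMAL_DUAL, MODULE) are just labels made up for illustration, and the 150% module figure is the rough one used in this post, not a measured die area.

```python
# Sizes are in percent of one normal, complete core (the reference unit).
SINGLE_CORE = 100
NORMAL_DUAL = 2 * SINGLE_CORE   # old way: duplicate everything
MODULE = 150                    # rough figure for one 2-core BD module (assumption from this post)

savings = NORMAL_DUAL - MODULE           # space saved by sharing resources
extra_core_cost = MODULE - SINGLE_CORE   # what the second core actually cost

print(f"savings vs a normal dual core: {savings}% of a single core")
print(f"cost of the additional BD core: {extra_core_cost}% of a single core")
```

Both results are measured against the single-core reference, which is why the savings and the cost of the extra core come out as the same number.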

Of course, that's when you look at it from scratch, which is the right perspective. You see, if you start with the module being there already and you just want to add a core, you will think that it only took 12% more from the core you already had. But, if you had a module in the first place, then you already consumed more space than a normal single core. And if you already had a module but only had a single core, that means you wasted precious space (and design and engineering time) to design something for multithreaded efficiency that only had one core. It just doesn't make sense, exactly because it is the wrong perspective.

I don't know how to make it any clearer, so let's recap the "old way" of making more cores:

Old Way said:
Start with 1 core.
Core size is 100%, for reference purposes only.
Add another core to make it a dual core
Core size is now 200%.
Add another two cores, to make it a quad core
Core size is now 400%

And here is AMD's "new way", using their architected modules:

AMD "Bulldozer Module" Way said:
Start with 1 core.
Core size is 100%, for reference purposes only.
Add another core to make it a dual core, but "modularize" it to become a Bulldozer
Core size is now 150%
Add another two cores to make it a quad core Bulldozer
Core size is now 300%

Clearly, there is significant die space savings. This time, core size grew only up to 300%, whereas for the old way it would have gone to 400%.
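
The two recaps above can be sketched as a pair of small functions. This is only an illustration under the rough assumptions of this post (100% per normal core, 150% per two-core module); the function names are hypothetical, and real die sizes depend on much more than this.

```python
def old_way_size(cores: int) -> int:
    """Every core is a complete copy, so size scales linearly."""
    return 100 * cores


def module_way_size(cores: int) -> int:
    """Cores come in pairs; each pair (module) is assumed to cost 150."""
    if cores == 1:
        return 100  # a lone core is just a normal core, no module needed
    if cores % 2 != 0:
        raise ValueError("this sketch only handles 1 core or an even count")
    return 150 * (cores // 2)


for n in (1, 2, 4, 8):
    print(f"{n} cores: old way {old_way_size(n)}%, module way {module_way_size(n)}%")
```

The 8-core row simply extends the same pattern. The point is that the gap between the two columns grows by 50% of a single core for every module added.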

Let's get back to pop quiz #2 as answered by AtenRa:

AtenRa said:
First mistake, If you add 4 Deneb cores in a Phenom II Quad CPU you don’t get an 8 Core Bulldozer CPU but an 8 Core Deneb CPU. (8 Integer Execution Units + 8 FP Execution Units)

If you add 4 Deneb cores to a Quad Core Phenom II then you add 50% more die because you take 4 times the Front End and the L2 Cache. In Bulldozer you don’t add the front end, nor the L2, so you don’t add 50%. Got it ???

See, it's no mistake at all. That pop quiz was to gauge the understanding of how CPUs are designed and made. If you have 4 cores, and you want to make it an octo-core, you will either have to increase the core size by 100% (old way), or 50% (Bulldozer style).

So it doesn't take AMD "5%" or "12%" to add cores. The number is more like 50%, accounting for everything else needed to make a module, relative to what you would have needed if you only had a single core to start with (think about it: would a module exist at all if there was only a single core?). So from 100% size of a single core, the dual core would then be 150%.

Where did we get this 50%? From AMD themselves, when they corrected Anand, because Anand thought from 1 core, you need 5% more and you get a dual core. Which is not true, and if you've gotten this far into my post, you probably understand now that from 1 core, you need 50% more to make a dual core - exactly what AMD told Anand as a correction.

And, of course, after consuming that 50%, you now have a module, a "Bulldozer", in which an individual int core takes up 12% of its size (but quite naturally doesn't mean it only took 12% to make a dual core).
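
To see how the "12%" and the "50%" can both be true at once, here is a small Python sketch. It assumes the rough figures from this post (module = 150% of a single core, one int core = 12% of the module); the variable names are made up, and the split of the remainder is just what falls out of those assumptions, not an AMD number.

```python
# All areas are in percent of one normal, complete single core.
SINGLE_CORE = 100
MODULE = 150            # rough module size used in this post (assumption)
INT_CORE_SHARE_PCT = 12 # share of the module taken by one int core

int_core_area = MODULE * INT_CORE_SHARE_PCT / 100  # the bare int core by itself
marginal_cost = MODULE - SINGLE_CORE               # what going dual core really cost
everything_else = marginal_cost - int_core_area    # beefed-up shared parts

print(f"one int core:            {int_core_area:.0f}% of a single core")
print(f"cost of going dual core: {marginal_cost}% of a single core")
print(f"of which shared parts:   {everything_else:.0f}% of a single core")
```

Under these assumptions the int core itself is the smaller part of the bill. Most of the 50% goes to the shared hardware that had to be made big enough to feed two int cores.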

There we go.

In the closing words of AtenRa:

Hope I was able to help

Martimus · Aug 27, 2010

heyheybooboo said:
Where AMD might suck (salmonella?) eggs is Zambezi ...

If posting this 'roadmap' the end of last year, and then 'back-tracking' by saying, "Uhhh, well, Zambezi will only work in AM3+, not AM3 as was posted on our Roadmap ... " nine months later, they gots some 'splaining to do.

And they should come right out and say it. Now. Publicly. In a major way. You hear us, AMD?

Where's that !*&%$?! sum beach JF? Let's fire his arse! :biggrin:

--

Here is a quote from AMD from this site I haven't heard of:

"The existing G34 and C32 server infrastructure will support the new Bulldozer-based server products. In order for AMD's desktop offering to fully leverage the capabilities of Bulldozer, an enhanced AM3+ socket will be introduced that supports Bulldozer and is backward-compatible with our existing AM3 CPU offerings."

"When we initially set out on the path to Bulldozer we were hoping for AM3 compatibility, but further along the process we realized that we had a choice to make based on some of the features that we wanted to bring with Bulldozer. We could either provide AM3 support and lose some of the capabilities of the new Bulldozer architecture or, we could choose the AM3+ socket which would allow the Bulldozer-based Zambezi to have greater performance and capability.

The majority of the computer buying public will not upgrade their processors, but enthusiasts do. When we did the analysis it was clear that the customers who were most likely to upgrade an AM3 motherboard to a Bulldozer would want the features and capability that would only be delivered in the new AM3+ sockets. A classic Catch-22.

Why not do both you ask? Just make a second model that only works in AM3? First, because that would greatly increase the cost and infrastructure of bringing the product to market, which would drive up the cost of the product (for both AMD and its partners). Secondly, adding an additional product would double the time involved in many of the development steps.

So in the end, delivering an AM3 capability would bring you a less featured product that was more expensive and later to market. Instead we chose the path of the AM3+ socket, which is a path that we hope will bring you a better priced product, with greater performance and more features - on time.

When we looked at the market for AM3 upgrades, it was clear that the folks most interested in an AM3-based product were the enthusiasts. This is one set of customers that we know are not willing to settle for second best when it comes to performance, so we definitely needed to ensure that our new architecture would meet their demanding needs, for both high performance and overclockability. We believe they will see that in AM3+."

Here, they seem to be saying exactly what you are asking them to tell you. Although, I cannot read German (at least not fluently enough to understand most of what they are saying), and I cannot tell who the quote is from.

Martimus · Aug 27, 2010

jvroig said:
Where did we get this 50%? From AMD themselves, when they corrected Anand, because Anand thought from 1 core, you need 5% more and you get a dual core. Which is not true, and if you've gotten this far into my post, you probably understand now that from 1 core, you need 50% more to make a dual core - exactly what AMD told Anand as a correction.

Can you link this change, please (or tell me what page the change is on)? I guess I only read the first iteration of the article, and don't know what you are talking about.

jvroig · Aug 27, 2010

Martimus said:
Can you link this change, please (or tell me what page the change is on)? I guess I only read the first iteration of the article, and don't know what you are talking about.

Of course. I've linked to it already on this thread, but the way the thread exploded, I'm sure nobody read every single post. Here it is: http://www.anandtech.com/show/2881

Scroll down, bottom part.

heyheybooboo · Aug 27, 2010

Martimus said:
Here is a quote from AMD from this site I haven't heard of:

Here, they seem to be saying exactly what you are asking them to tell you. Although, I cannot read German (at least not fluently enough to understand most of what they are saying), and I cannot tell who the quote is from.

Was that so hard?

Now ... it's my turn:

To those to whom I suggested that Zambezi would "drop in" AM3 and would be compatible with a BIOS update, I was wrong.

Furthermore, as an offer of good faith as compensation for my error, upon notice of time and place, I will fervently make love to your wife/girlfriend to her satisfaction.

(No fat chicks.)

:biggrin:

NOW .... where are those AM3+ motherboards!

--

Idontcare · Aug 27, 2010

jvroig said:
Once again, Intel's industry-leading process tech has made magic! This is totally why we are completely innocent of that whole Dell brouhaha, I mean come on, if we are this awesome, why would we bother paying off the bastards at Dell, amirite?"

OK, that was just friggen awesome laugh-out-loud hilarity in writing right there.

There is no way I'm going to believe English is not your first language, you've totally mastered the art of cultural nuance.

jvroig said:
What it does do, however, is imply awesomeness

that is pure sig-worthy goodness right there. You are on fire today

Thanks for the detailed post, presentation style aside, the information quite clearly lays out why we should care and why we should not care about the numbers being bandied about.

It also quite clearly details where they come from and how they are computed.

That post really should be a sticky because we are going to see this same topic recycled and debated again and again for the next 12 months.

richierich1212 · Aug 27, 2010

thanks for the laughs and great info, jv

Vesku · Aug 28, 2010

Definitely a funny post Jvroig. I would have just said that with Bulldozer AMD has decided to divide the integer part of a core into 2 slightly smaller cores, making the integer part larger overall, because they think the extra die space needed for managing the extra core is worth it. But because they put out some size percentages before everyone was familiar with the new design there was some confusion.

Not very funny though.

jvroig · Aug 28, 2010

Vesku said:
[snip]
...that with Bulldozer AMD has decided to divide the integer part of a core into 2 slightly smaller cores, making the integer part larger overall...
[snip]
Not very funny though.

Yes, not very funny, but also not very right

That's not what they did at all, especially because they started with two (2, as in not 1) full cores (full, meaning complete, regular, non-mini cores) and then tightly coupled them so as to become monolithic. I would repeat myself for the 6th or 7th time if I explain it here again, and I'm not sure it will actually do any good, but here goes:

jvroig said:
So you can't really say: "Bulldozer, you only need to add an int core", because that's actually not true. They started with two full cores (this is probably the 6th time I have to repeat it, but right now I actually don't mind at all), die size is 100% for this normal dual core. They got the idea for "multi-threaded" efficiency, so from a normal dual core, they tweak the design until they fused the two cores together (and in the process, there was more or less a radical change in the design, such that now there are naked int cores), sharing resources such that they ended up reducing the size of the dual core. If the dual core was 100%, now they have ~75%, which means it only took them 50% of what would have been needed to go from single core to dual core.

It is really not that far from how Bulldozer is commonly described by AMD (source: from AMD slides, some paraphrasing / combination for clarity) :
- "Two (2) tightly linked cores"
- "Start with 2 cores (fully capable core performance level), share hardware when needed, and invest in shared bandwidth/capacity, aggressive data prefetch, etc, to make the fusion a reality - without sacrificing single-threaded performance while increasing multi-threaded throughput"

---
If somebody else reaches this post without reading what I wrote earlier, then you should not bother with my post here, and instead just read my post right before this one.

jvroig · Aug 28, 2010

Idontcare said:
There is no way I'm going to believe English is not your first language, you've totally mastered the art of cultural nuance.

I grew up as a child thinking I was just as American as the people in the shows I watched (cartoons and real live people)... but when I grew up, I realized I am not American at all
