Some Bulldozer and Bobcat articles have sprung up


Sp12

Senior member
Jun 12, 2010
799
0
76
JFAmd, I know you're a server-side guy, but if the server bulldozers are drop-in upgrades, are the desktop CPUs going to be AM3 compatible?
 

Schmide

Diamond Member
Mar 7, 2002
5,745
1,036
126
I think AMD will need software to be compiled to use its FMAC units, but since AMD's FMAC won't be compatible with Intel's future FMAC, I don't hold much hope for support outside of HPC until both AMD and Intel support the same standard.

Why would AMD need software to be recompiled?

The way I understand it, FMAC4 (Bulldozer) is FMAC3 (Haswell) with the ability to assign the accumulation to a fourth register. It seems to me that FMAC4 is a superset of the proposed operations, and all FMAC3 operations could decode into FMAC4 operations.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Why would AMD need software to be recompiled?

The way I understand it, FMAC4 (Bulldozer) is FMAC3 (Haswell) with the ability to assign the accumulation to a fourth register. It seems to me that FMAC4 is a superset of the proposed operations, and all FMAC3 operations could decode into FMAC4 operations.

Doesn't the software need to be compiled with an FMAC3- or FMAC4-aware compiler before an executable can be created that would utilize such instructions?
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
From what I understand, AMD's four-operand FMA is not compatible with Intel's three-operand FMA. They will be different instructions and will require different code paths.

I haven't really gone and looked at IBM's FMA implementation, but I think they may be three operand also.

Edit: IBM uses four operand FMA. I wonder why Intel is using FMA3?

Edit Again:
"Since we don’t control the definition of AVX, all we can say for sure is that we expect our initial products to be compatible with version 5 of the specification (the most recent one, as of this writing, published in January of 2009), except for the FMA instructions"

" At some future point, we will likely adopt Intel’s newer FMA definition as well"

http://blogs.amd.com/developer/2009/05/06/striking-a-balance/
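
To make the "different code paths" point concrete, here is a minimal sketch of how the two forms look at the source level (my illustration, not from AMD's blog; the intrinsic names are the FMA4/FMA3 intrinsics as published in compiler headers such as GCC's). Both compute a*b + c on packed floats; the practical difference is the instruction encoding, so a binary has to be compiled, or dispatch at runtime, for whichever form the CPU supports.

```cpp
// Minimal FMA4-vs-FMA3 sketch (illustrative). Build with e.g.
//   g++ -mavx -mfma4 ...   for the Bulldozer (FMA4) path, or
//   g++ -mavx -mfma  ...   for the Intel (FMA3) path.
#include <immintrin.h>   // AVX and FMA3 intrinsics
#ifdef __FMA4__
#include <x86intrin.h>   // FMA4 intrinsics (_mm256_macc_ps)
#endif

__m256 fused_multiply_add(__m256 a, __m256 b, __m256 c) {
#if defined(__FMA4__)
    // 4-operand form: the result goes to a separate destination register.
    return _mm256_macc_ps(a, b, c);
#elif defined(__FMA__)
    // 3-operand form: the destination doubles as one of the sources
    // (compiles to one of the VFMADD{132,213,231}PS encodings).
    return _mm256_fmadd_ps(a, b, c);
#else
    // Plain AVX fallback: separate multiply and add, no fused operation.
    return _mm256_add_ps(_mm256_mul_ps(a, b), c);
#endif
}
```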
 
Last edited:

bryanW1995

Lifer
May 22, 2007
11,144
32
91
I agree with you. Since most of the calculations are done on the integer units, it makes sense to double them. We don't know how much AMD beefed up each unit to increase IPC, but AMD's attack against Intel's HT is throwing more hardware at it. Hyper-Threading benefits from the bubbles in the execution pipeline, but what if a thread is able to fill the pipeline on its own? Or what if both threads stall within the pipeline? Hyper-Threading may actually cause a performance drop, and HT is also known to have a cache-pollution issue. And AMD already has pretty strong FPU performance, so it makes sense to beef up the integer unit.

A pretty basic example of the hardware approach in a heavy multi-threading scenario: the AnandTech moderator Mark stated that in F@H, the X6 1090T was actually faster than his heavily overclocked i7 930 or 940 (don't remember which one), because you have 4 physical cores with 4 logical cores sharing execution resources, compared to 6 cores each with their own execution resources. If only AMD had better IPC, the performance advantage would be even bigger.

Mark had to turn off HT on his i7. IIRC the i7 @ 3.7 was slightly better than the Thuban @ 3.6 until HT was turned off; then the Thuban opened up a can of whoop-ass. Hopefully Mark will pitch in here presently.
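
To make the "what if a thread can already fill the pipeline" point concrete, here is a minimal sketch (my own illustration, not Mark's F@H setup) of the kind of scaling test being described: the same compute-bound loop runs on an increasing number of threads, and if throughput stops improving once the extra threads are SMT siblings rather than full cores, HT is contributing little or nothing for that workload.

```cpp
// Minimal scaling sketch (illustrative; iteration counts are arbitrary).
// A tight, dependent FP loop keeps the pipeline busy with few bubbles,
// which is exactly the case where SMT has little slack to exploit.
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

static double spin(long iters) {
    double x = 1.0000001;
    for (long i = 0; i < iters; ++i)
        x = x * 1.0000001 + 1e-12;   // each iteration depends on the last
    return x;
}

int main() {
    const long iters = 200000000L;
    unsigned max_threads = std::thread::hardware_concurrency();
    if (max_threads == 0) max_threads = 4;  // hardware_concurrency() may be unknown

    for (unsigned n = 1; n <= max_threads; ++n) {
        auto t0 = std::chrono::steady_clock::now();
        std::vector<std::thread> pool;
        for (unsigned i = 0; i < n; ++i)
            pool.emplace_back([iters] { volatile double sink = spin(iters); (void)sink; });
        for (auto& t : pool) t.join();
        auto t1 = std::chrono::steady_clock::now();

        double secs = std::chrono::duration<double>(t1 - t0).count();
        // Total work grows with n, so throughput is n work units per run time.
        std::printf("%2u threads: %.2f work units/s\n", n, n / secs);
    }
    return 0;
}
```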
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
P.S. I don't think many people would be concerned about CPU media encoding provided these tasks were better handled on a Discrete Video card or fusion iGPU.
Except that such implementations suck. One of the things that low-latency communication between the GPGPU and CPU should allow for is for software like that to take advantage of the GPU bits, but be able to use the CPU for branchy goodness. Currently, latency is so high as to be prohibitive for any code that is not exceptionally loosely coupled.

AMD hyped it up first, but Intel will actually get there first, by the looks of things. AMD is currently lacking in software more than hardware, though, and hopefully they will set aside some funds to fix that. nVidia went from wishy-washy on anything not CUDA to, "here's nearly complete support." Intel just hasn't done much publicly since the Larrabee failure, but make no mistake: when they feel ready, it'll be like nVidia did, with some polish. AMD needs to put a few good people full-time on nothing but their GPGPU software support, both the SDK and drivers. If they can do a good enough job, big companies may end up supporting all three companies within their products, and small companies, along with free projects, could move to generic OpenCL and DirectCompute, with vendor-optimized versions of a single code base (going for the most performance for the dev/support cost, rather than simply max performance, while still getting benefits from everyone's extra hardware).

Personally, I want a >13" very light Bobcat notebook with a real keyboard, without paying a mint for it, without a crappy high-power chipset (i.e., like the larger Atom machines, which fall flat compared to CULV and netbooks), and I don't want Intel's IGP. If a quality maker like Asus, Samsung, Lenovo, or HP (a business model) comes out with such an animal, I'd buy one and finally retire my PIII ThinkPad (Atom+ION is fast enough, but I want a real keyboard, dammit!). I was going to hold out for Cortex-A9 machines, but I think there's a better chance of a good Bobcat.
 

brybir

Senior member
Jun 18, 2009
241
0
0
I think this is AMD's strategy. Its mid and entry level are becoming AMD Fusion with an IGP already included. I guess they're assuming that if you're looking at Bulldozer, you'll have a discrete card capable of offloading most FP tasks to a GPU. This is their idea of the future.


My understanding of the market direction is that with Fusion-type products and MS's development of DirectCompute GPGPU programming, the trend will be that certain FPU-heavy workloads will be handed off to the integrated GPU. Last I heard, we can expect a 400 to 800 SP integrated graphics core that, in certain workloads, makes the Bulldozer FPU look like a Pentium III in relative performance.

So the Fusion product will have its bases covered: strong integer performance, good FP performance, and exceedingly good GPGPU performance where it makes sense to offload the workload. Eventually, it seems that when we talk about CPUs in maybe 5 years, we will talk about integer cores, FPU cores, and GPGPU cores.

I think the ultimate goal is this: within the physical die space, there is room for a certain number of "modules". These modules can be Bulldozer CPU cores, they can be ATI GPGPU cores, or they can be specialized cores. Ideally, AMD could create products that have, say, a dedicated cryptographic "module" sitting alongside the Bulldozer core and GPGPU core for clients that can benefit from that, assuming a market exists for such a product. Really, if they can achieve that level of flexibility, it will drastically change the CPU market and allow for much greater vendor differentiation. Instead of "Intel is at 3.6 on its Westmere core, up from 3.4 GHz", we will see "Intel has released its new processor with dedicated cores for geological exploration data analysis" while AMD will release a "new processor with dedicated cores for running visualization workloads".

At least I hope it ends up working like this!
 

brybir

Senior member
Jun 18, 2009
241
0
0
Except that such implementations suck. One of the things that low-latency communication between the GPGPU and CPU should allow for is for software like that to take advantage of the GPU bits, but be able to use the CPU for branchy goodness. Currently, latency is so high as to be prohibitive for any code that is not exceptionally loosely coupled.

AMD hyped it up first, but Intel will actually get there first, by the looks of things. AMD is currently lacking in software more than hardware, though, and hopefully they will set aside some funds to fix that. nVidia went from wishy-washy on anything not CUDA to, "here's nearly complete support." Intel just hasn't done much publicly since the Larrabee failure, but make no mistake: when they feel ready, it'll be like nVidia did, with some polish. AMD needs to put a few good people full-time on nothing but their GPGPU software support, both the SDK and drivers. If they can do a good enough job, big companies may end up supporting all three companies within their products, and small companies, along with free projects, could move to generic OpenCL and DirectCompute, with vendor-optimized versions of a single code base (going for the most performance for the dev/support cost, rather than simply max performance, while still getting benefits from everyone's extra hardware).

Personally, I want a >13" very light Bobcat notebook with a real keyboard, without paying a mint for it, without a crappy high-power chipset (i.e., like the larger Atom machines, which fall flat compared to CULV and netbooks), and I don't want Intel's IGP. If a quality maker like Asus, Samsung, Lenovo, or HP (a business model) comes out with such an animal, I'd buy one and finally retire my PIII ThinkPad (Atom+ION is fast enough, but I want a real keyboard, dammit!). I was going to hold out for Cortex-A9 machines, but I think there's a better chance of a good Bobcat.


The big player that we don't hear much about is MS, which is constantly developing its DirectCompute API, released along with DirectX 11.

One example of it is: http://www.microsoft.com/showcase/en/us/details/6ef116dc-b1d9-41db-8a7b-db1932ff72a5

You can note from this demo that the frame rate sucks, but the water is extremely realistic looking. Cutting to the chase, GPGPU is still in its infancy and serving a niche market. It is gaining traction, but it is still very challenging to program for, and there are many types of data that are not easily computed on a GPU. As it gains traction, I suspect the general standards options like DirectCompute, which works on all vendors' products that can use DirectX 10 and 11, will take over, rather than one company's implementation.

If I were AMD, I would probably be expecting that when GPGPU really catches on in mainstream programming, it will be in the form of whatever the big players like MS and Intel get behind, so why throw lots of resources into something like AMD's Stream SDK when it is going to be obsolete in a few years?

Just my guess anyways.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
And, that could be. While the water may or may not look cool, the ultimate goal outside of gaming will be to offer better performance within a given power and memory budget (browsers using GPGPU features, FI), along with server and workstation apps giving more total performance per dollar. In games, it will be to start moving away from the polygon and texture-map system we use today, which, while it scales fairly well, is finally starting to hit the wall with respect to power. Fixed-function raster units ('cause they're just so fast at what they do, for the die space and power), mixed with different rendering techniques used by the rest of it. Die shrinks won't do enough good for more than one more generation of consoles, FI, and I think nVidia's GTX 480 was very much a Prescott moment.

If MS can give DirectCompute a flexible enough view of memory, as time goes on, it may be all we need, outside of the FOSS world. With FOSS, a great deal depends on how much work is done on front-end support, v. back-end support, and that's still hazy (IE, more support for OpenCL and the like, or more support for LLVM w/ GPGPUs, to allow different front-ends to work well on their own?).
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
I guess I see it a different way. Each core or half-module is smaller now, since they cut down the number of IEUs per core and maybe managed to save some area with the quasi-merged FP unit/scheduler across cores. So it could be that in terms of die area, 1 module is roughly the same area as 1 core, and assuming AMD is pricing based on area, then for the same money you can get 4 modules of Bulldozer against a 4-core Phenom II.

I could be completely wrong since I haven't looked at any of the die area numbers yet.
They've already said their first Zambezi chip will be a quad-core. And they've already mentioned before that the performance of one int core of Bulldozer is greater than the int performance of Deneb.

Putting it all together, we get the picture that a quad Bulldozer > quad Deneb.

So, what I was pointing out in the post you quoted was that counting ALUs/AGUs is only part of the equation (not the entire picture), and that it was wrong in the first place because the poster made it look like the ALU/AGU count increased, when in fact it decreased per core. "Decreased, yet performance went up" is the picture, instead of "increased and performance went up". The overhaul in architecture is supposedly what makes this possible. From there, we can also reasonably infer that perhaps K10's 3 ALU / 3 AGU is actually overkill and does not get utilized anywhere close to its max potential, hence AMD did not go around adding more ALUs and AGUs; instead, they removed one ALU and one AGU, and then optimized the relevant parts of the architecture to make sure the two are fed much better, hence a performance increase was achieved.

You are the Intel engineer/designer, right? In your case, Intel vs. AMD, then yes, perhaps AMD is going to sell their octo-cores in a similar price range to your quad; that's plausible, who knows.




Is Hot Chips finally over? Are there new "updated" articles floating around now?
 

386DX

Member
Feb 11, 2010
197
0
0
JFAmd, I know you're a server-side guy, but if the server bulldozers are drop-in upgrades, are the desktop CPUs going to be AM3 compatible?

I have a bad feeling that BD is gonna disappoint a lot of people. Firstly, while it's nice to have a bunch of slides about BD, why isn't there any working demo? If the chip is supposed to be out in less than a year, there should be working samples out.

Secondly, regarding upgradeability for the AM3 socket, there haven't been any official statements saying it will be possible. Actually, all the info suggests otherwise. From http://www.tomshardware.com/reviews/bulldozer-bobcat-hot-chips,2724-4.html:

Bulldozer: One of two new x86 architectures, Bulldozer will be used in performance desktops and servers. Bulldozer-based modules will serve as the basis for AMD’s next generation of processors. The company has already confirmed that it’ll maintain socket compatibility with existing Magny-Cours-based Opteron processors. Thus, you can expect to see Bulldozer-based CPUs dropping into existing server boards and, likely, Socket AM3 desktop platforms as well. AMD’s target power use for Bulldozer-based chips is between 10 and 100 W.
The key word there is "likely"; that's not very reassuring. And the Gizmodo article http://gizmodo.com/5620423/amd-announces-8+core-bulldozer-cpu makes it sound like it won't be upgradeable.

AMD officials say Bulldozer is being targeted at servers and performance desktop machines. The good news is that Bulldozer will be drop-in compatible with most current high-end servers. The bad news is that it won't be compatible with existing AM3 boards. Instead, AMD says it will introduce a new AM3+ socket. These sockets will be backward compatible with older chips so you could drop a Phenom II X6 in it.
It sounds like you can put Thuban CPUs in the new BD platform but can't put BD in the current Thuban platform. Same pin layout but possibly different power requirements may explain the vagueness about AM3 upgradeability.

Finally, performance-wise, I think it's gonna be a disappointment for those who think BD is gonna magically catch or exceed Intel's Core i7 clock for clock. If the Register article (http://www.theregister.co.uk/2010/08/24/amd_hot_chips/) is accurate, it seems like BD is gonna be even slower than current AMD chips clock for clock.

Fruehe has also said in interviews with El Reg that Bulldozer's shared component approach results in a Bulldozer module with two quasi-cores, and yields about 1.8 times the performance as two current Magny-Cours cores. That's a 10 per cent performance hit, clock for clock, for every pair of cores, but much lower power consumption because of the shared nature of the Bulldozer modules.
That statement fits in with early reports of "systems with 33 per cent more cores and 50 per cent more performance", i.e. a 16-core BD is 50% faster than a 12-core Magny-Cours. The article states that the high-end 16-core BD will be around 2.75 GHz.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
nVidia went from wishy-washy on anything not CUDA to, "here's nearly complete support."

What do you mean by that?
nVidia has always supported all GPGPU standards.
Cuda was just the only option at first. But when Apple proposed OpenCL, nVidia gave it full support, and was actually the first to deliver decent OpenCL drivers on the Apple platform.
nVidia has also bundled OpenCL with their public end-user drivers since November. AMD still doesn't offer an OpenCL runtime, so to me that means they don't support it.
Likewise, although nVidia didn't have DirectX 11 hardware out yet at the time, they supported DirectCompute on their DX10 hardware before AMD did, and all the way back to the venerable G80, the thing that kickstarted Cuda, and thereby OpenCL and DirectCompute.
AMD only supports 4000-series (poorly) and 5000-series.

As far as I see it, nVidia was never the one that was 'wishy-washy'.
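
For what it's worth, whether a vendor actually ships an OpenCL runtime with its drivers is something anyone can check: the runtime either registers a platform with the ICD loader or it doesn't. A minimal sketch (my illustration, assuming the OpenCL headers and an ICD loader are installed; link against the OpenCL library):

```cpp
// Minimal sketch: enumerate the OpenCL platforms exposed by the installed
// drivers/runtimes. If a vendor's runtime isn't installed, it simply won't
// show up here. Link with -lOpenCL (or OpenCL.lib on Windows).
#include <CL/cl.h>
#include <cstdio>

int main() {
    cl_uint count = 0;
    if (clGetPlatformIDs(0, nullptr, &count) != CL_SUCCESS || count == 0) {
        std::printf("No OpenCL runtime found.\n");
        return 1;
    }
    cl_platform_id platforms[16];
    if (count > 16) count = 16;
    clGetPlatformIDs(count, platforms, nullptr);
    for (cl_uint i = 0; i < count; ++i) {
        char name[256] = {0}, version[256] = {0};
        clGetPlatformInfo(platforms[i], CL_PLATFORM_NAME, sizeof(name), name, nullptr);
        clGetPlatformInfo(platforms[i], CL_PLATFORM_VERSION, sizeof(version), version, nullptr);
        std::printf("Platform %u: %s (%s)\n", i, name, version);
    }
    return 0;
}
```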
 
Last edited:

Scali

Banned
Dec 3, 2004
2,495
0
0
As it gains traction I suspect the general standards options like DirectCompute, which works on all vendors products that can use DirectX 10 and 11, will take over, rather than one companies implementation.

It *should* work on all DirectX 10 hardware, as DirectCompute supports shadermodel 4.0 (DX10.0) and 4.x (DX10.1).
However, you need support in the driver. On vanilla DX10 drivers it will not work. I think Intel is currently the only one that doesn't have DirectCompute support in their drivers, though (yes, nVidia, AMD, and even S3 support it).
At first I had to enable DirectCompute in the registry on my 8800GTS card. I think it's now enabled by default though.
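
To expand on the "you need support in the driver" point: on a DX11 runtime, an application can ask at startup whether compute shader 4.x is actually exposed on DX10-class hardware. A minimal sketch, assuming the Windows SDK's D3D11 headers (link with d3d11.lib):

```cpp
// Minimal sketch: create a D3D11 device and ask whether the driver exposes
// DirectCompute (compute shader 4.x) on DX10-class hardware. On feature level
// 11_0 compute shaders are mandatory; on 10_0/10_1 they are optional and
// depend entirely on the driver, which is the point being made above.
#include <windows.h>
#include <d3d11.h>
#include <cstdio>

int main() {
    ID3D11Device* dev = nullptr;
    D3D_FEATURE_LEVEL level;
    HRESULT hr = D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
                                   nullptr, 0, D3D11_SDK_VERSION,
                                   &dev, &level, nullptr);
    if (FAILED(hr)) {
        std::printf("No D3D11 hardware device available.\n");
        return 1;
    }
    if (level >= D3D_FEATURE_LEVEL_11_0) {
        std::printf("Feature level 11_0+: compute shader 5.0 is guaranteed.\n");
    } else {
        D3D11_FEATURE_DATA_D3D10_X_HARDWARE_OPTIONS opts = {};
        dev->CheckFeatureSupport(D3D11_FEATURE_D3D10_X_HARDWARE_OPTIONS,
                                 &opts, sizeof(opts));
        std::printf("DX10-class hardware: compute shader 4.x %s by the driver.\n",
                    opts.ComputeShaders_Plus_RawAndStructuredBuffers_Via_Shader_4_x
                        ? "is exposed" : "is NOT exposed");
    }
    dev->Release();
    return 0;
}
```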
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
jvroig said:
A 4 module Bulldozer does not go against a 4 core Phenom II. Rather, it would be a 2 module Bulldozer (thus, quad-core vs quad-core), which would then make it: 8 vs 12(Deneb), instead of 16 vs 12 as you stated

Bulldozer will replace the Deneb (quad-core) and Thuban (hexa-core) Phenoms in AMD's high-end desktop, so it's only natural to compare a 4 Module (8 Logical Cores) Bulldozer against a Quad (4 Physical Cores) and Hexa (6 Physical Cores) Phenom.

The same happened when Core i7 9xx (8 Logical Cores) was introduced and everyone compared them against Core 2 Quad (4 Physical Cores).
 

Scali

Banned
Dec 3, 2004
2,495
0
0
The same happened when Core i7 9xx (8 Logical Cores) was introduced and everyone compared them against Core 2 Quad (4 Physical Cores).

I think the market positioning (price) had more to do with that than the number of cores.
Aside from that, Core i7 9xx was 4 physical cores as well. HT adds negligible extra logic/cost to the CPU. Pentium 4 HT was also compared to regular single-core Pentium 4s and Athlons. That's where it fit in the market.
Besides, what else were you going to compare Core i7 to? There was nothing with more than 4 cores anyway.

If Bulldozer has significantly better performance, then I can see AMD raising the prices, and then a 2 module Bulldozer will compete with current 4-core CPUs...
If not, then AMD may have to continue the Thuban trick: more cores at a lower price. In which case we might see 4 module Bulldozers competing with 4-core CPUs.
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
The same happened when Core i7 9xx (8 Logical Cores) was introduced and everyone compared them against Core 2 Quad (4 Physical Cores).
The difference being the i7 9xx was marketed as a quad.

Zambezi will be marketed in terms of cores, not modules. When the first Zambezi arrives, it will be a quad-core. Quad-core = 2 modules.

People should just forget modules, because they won't matter anywhere: marketing will label Bulldozer SKUs according to cores (not modules), the OS will see cores (not modules), so apps will naturally make use of cores (not modules). Outside of architecture PowerPoint slides, modules don't exist anywhere.
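
To underline that point: from the software side, the only thing a program ever queries is the logical processor count; whether those processors come from Bulldozer modules, HT siblings, or fully independent cores is invisible at this level. A minimal sketch (my illustration):

```cpp
// Minimal sketch: all a typical application ever asks the OS for is the
// logical processor count; whether those came from Bulldozer modules, HT
// siblings, or fully independent cores is invisible at this level.
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    unsigned n = std::thread::hardware_concurrency();  // logical processors
    if (n == 0) n = 1;  // hardware_concurrency() may return 0 if unknown
    std::printf("OS reports %u logical processors; spawning one worker per processor.\n", n);

    std::vector<std::thread> workers;
    for (unsigned i = 0; i < n; ++i)
        workers.emplace_back([i] { std::printf("worker %u running\n", i); });
    for (auto& w : workers) w.join();
    return 0;
}
```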

4 Module (8 Logical Cores) Bulldozer
That's 8 physical cores. You are letting the whole "half-cores / mini-cores / modules" language distort your idea of what a logical and a physical core are, especially when in the same context you use the same term for Hyper-Threading.



At the risk of repeating what I've already said a few posts up:
1.) ALUs and AGUs decreased per core, not increased.
2.) Int performance per core increased, not decreased, despite kicking out an ALU and an AGU - what we can infer is that there were too many ALUs and AGUs before.
3.) Counting ALUs and AGUs is meaningless if you take them in isolation from the rest of the architecture.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
That's 8 physical cores. You are letting the whole "half-cores / mini-cores / modules" language distort your idea of what a logical and a physical core are, especially when in the same context you use the same term for Hyper-Threading.

Because of the shared resources in a module (eg decoder, FPU), I'm not sure if you can speak of 'physical cores' with Bulldozer, to be honest.
I think we can say this:
A Bulldozer module is similar to one physical core on a HT processor: It contains two logical cores.
Logical cores on Bulldozer and HT processors can be considered equivalent.

But I'm not sure what a 'physical core' would be for Bulldozer. I think perhaps we should not even try to define it, as it isn't very relevant.

But yes, I think AMD will be marketing it on their logical core count.
 

jones377

Senior member
May 2, 2004
462
64
91
Even AMD calls the cores in Bulldozer logical, just look at the presentations :) I remember arguing that with JF some months back and he flat out denied they would use the logical core model for software. Thank god he's not the chief engineer on the Bulldozer project :)
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
JFAMD has said that the OS will see a Bulldozer Module as 2 LOGICAL cores, so a 4 Module will be seen in the Task Manager as 8 Logical Cores.

But don't get hung up on core counts; the 6-Core Thuban is the current high-end CPU from AMD, and it will be replaced by a 4 Module Bulldozer.

If you want to be fair in the comparison on a Technical level (Cores) you have to compare a 2 Core SB (4 Logical Cores) to a 2 Module Bulldozer (4 Logical Cores) and the same with Quad Core SB (8 Logical Cores) to a 4 Module Bulldozer (8 Logical Cores).
 

jvroig

Platinum Member
Nov 4, 2009
2,394
1
81
Even AMD calls the cores in Bulldozer logical, just look at the presentations
I did, and I just reviewed them now through the AT Gallery. I can't see where they called them logical cores. The only mention of 'logical' I saw was in slide 13, where they mentioned "each core is a logical processor from the viewpoint of software". That is far from an admission that each core is simply a logical core. That statement just means "despite all the shared resources in a Bulldozer (module), the OS will see them as two cores and make use of them as such".

That makes sense, especially when in slide 6 they describe a Bulldozer (one module) as a monolithic dual core. Not a "fat, powerful core that appears as two logical cores, like HT made better". As far as AMD is concerned, they started with two full cores, then started tweaking until they had a mix of independent and shared resources such that there would be throughput advantages in multi-core scenarios and far less performance penalty in single-threaded scenarios ("penalty" being in relation to those cores being totally isolated, not in relation to a Deneb core; it has been stated that performance over Deneb has, in fact, increased).

If you want to be fair in the comparison on a Technical level (Cores) you have to compare a 2 Core SB (4 Logical Cores) to a 2 Module Bulldozer (4 Logical Cores) and the same with Quad Core SB (8 Logical Cores) to a 4 Module Bulldozer (8 Logical Cores).
If you did that on a "technical level", it still wouldn't exactly be fair, right? Because when you take away the semantics ("logical" vs "physical"), each Bulldozer core will have more hardware dedicated to it than the corresponding SB logical core gets through Hyper-Threading (assuming SB works the same as Nehalem, which is not too unreasonable an assumption).

We will be stuck arguing semantics, and HT benefits / tradeoffs versus using "real cores" (AMD tagline), so I will let this go as I do not want to keep on repeating myself.

Before I end, let's be clear about why I posted in the first place: when I answered the post you made, it was not because something was "fair" or "unfair".

Rather, it was due to the point your post seemed to convey: that the ALU + AGU count increased, and that further architectural changes are responsible for being able to feed the increased number of ALUs + AGUs. In fact, they did not increase. The ALU + AGU count per core decreased. It only seemed otherwise because you compared apples to oranges. ALU + AGU counts only matter core to core, because any excess ALU in another core will not help an active core that could have used another ALU (but this is a contrived scenario; in fact, we know the ALUs are hard to feed, and since AMD dropped one, we know the number of ALUs is not the bottleneck to performance). This is why that post of mine ended with this:

... which makes me wonder why you bothered comparing an octo-core Bulldozer against a quad-core Deneb to count ALUs and AGUs in the first place.
It's about the ALU+AGU count, not about what is fair or not. If you want to compare based on performance level, or dollar value, or market segment, then go ahead and do so. I was not concerned with that, and if you review that post of mine, you will see I did not bother with anything but correcting the context for ALU+AGU concerns.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
What do you mean by that?
That nVidia got on board pretty much because of Apple, like everyone did, and had very little outside of Apple's stuff to show for it, and tended to downplay it. Then AMD, who had been talking it all up like the second coming, didn't have anything, and still didn't have anything, repeat over and over again. Many people wanted something they could port later on, or port between OS X and whatever, or for whatever other reason, so nVidia took the opportunity to hurry up and get it out there, and into their drivers.

I.e., I saw nVidia pushing good OpenCL support as soon as they did as, in large part, an opportunistic move against AMD/ATi, since AMD had been talking it up without having the software to back it up, more so than as a response to market demand, which was and is not as great as for their own CUDA or MS's DirectCompute.

AtenRa said:
If you want to be fair in the comparison on a Technical level (Cores) you have to compare a 2 Core SB (4 Logical Cores) to a 2 Module Bulldozer (4 Logical Cores) and the same with Quad Core SB (8 Logical Cores) to a 4 Module Bulldozer (8 Logical Cores).
There isn't a more fair comparison. Intel's CPUs with HT are actually four cores, but with extra sets of registers and other related goodies, so that when a core stalls, it doesn't just sit idle. An optimized application, though, can get little benefit from it, or will even run slower, and sometimes cause increased power consumption in the process (less so with the i7, but I'm sure that problem with the P4 played a part in ditching it for the Core 2s). What AMD is basically doing is improving their perf/watt in ways that don't need die shrinks, but also going, "hey, these units are always on, and are very high bandwidth, so if we beefed them up some and shared them between cores, we could save plenty of space and power, yet only have a very small performance loss in some rare corner cases." Two modules are still really four cores, just not four completely separate cores.

The strict separation of logical and physical threading resources will be going away. Intel and AMD have both had this on their long-term road maps for some time (2020 and beyond, FI). AMD is just making the move first (well, I think Sun made the first big move, but that wasn't x86 :)), because they can't die-shrink their way into better power consumption, nor can they be reckless with R&D money.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
I compared Bulldozer's ALUs and AGUs with both the quad-core and hexa-core Phenoms and said that if you take the whole CPU (4 modules), you will see an increase in ALUs and AGUs in comparison to a quad-core Phenom.

Yes, if you take the Bulldozer integer execution unit, it has one ALU and one AGU less than what Deneb or Thuban has, but a CPU consists of more than one int core.

Anyway, AMD says that Bulldozer's 4-way execution pipe will be faster than Deneb/Thuban's 6-way, so there's no point counting ALUs/AGUs.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
That nVidia got on board pretty much because of Apple, like everyone did, and had very little outside of Apple's stuff to show for it, and tended to downplay it.

Then I disagree.
nVidia had more to show for it than AMD did (ever looked at their GPGPU SDK? It contains quite a few OpenCL samples, and more and better documentation than AMD has... even a nice Cuda-to-OpenCL migration guide). The thing is just that AMD marketed the heck out of OpenCL (which was still vapourware on their side) because they didn't have a working product on the market, like nVidia did.

I.e., I saw nVidia pushing good OpenCL support as soon as they did as, in large part, an opportunistic move against AMD/ATi

Not really, since nVidia was the first to have working OpenCL-compliant GPU drivers.
How can it be a move against your competitor when you're the one who's ahead?
I guess non-developers just have a different view on the situation, because they go by what marketing says, rather than by what is actually available.
nVidia never really marketed their OpenCL, and still don't. But they've been ahead of AMD pretty much the whole time.
For nVidia it's just not very interesting to be marketing OpenCL, just like it's not interesting for them to market OpenGL (even though they're currently the only ones supporting OpenGL 4.1 I believe).
Likewise, neither nVidia nor AMD market DirectCompute, because that's not interesting either. AMD put its eggs in the OpenCL basket... perhaps because it's a multiplatform solution and open standard.