AMD summit today; Kaveri cuts out the middle man in Trinity.

Idontcare · Jun 19, 2012

CPUarchitect said:
Because AMD is offering them free beers?

Khato said:
And PR statements are cheaper than said beers last I checked, right?

ShintaiDK said:
It cost nothing to endorse a project. Look back in history on how many of such projects there has been and how few materialized into something. Alot of companies uses it for free PR, since their logo gets shown alot of places.

When I look at the players who are throwing their names into the HSA hat I don't worry about the possibility that it is all fluff and no substance.

Look at those names, they all have one thing in common - they all need something like HSA to take off so they have something to sell chips to.

Who doesn't need HSA? Intel. Intel is in a position where they could take it or leave it, not sure the other folks can really have HSA be hype and nothing more.

Necessity is the mother of invention. Companies like AMD, ARM and TI would be crazy to not take HSA seriously. It may still fail, but I can only think of one company that can afford HSA failing and that is Intel, everyone else needs that tide to lift all the boats in the harbor. That's not "sign me up for free beer" stuff.

CPUarchitect · Jun 19, 2012

piesquared said:
lmao of at the level of denial.

Denial of what exactly? The HSA Foundation web site is practically empty. There's some marketing talk, but nothing of real substance.

So why would AMD succeed where NVIDIA failed (i.e. creating added value for GPUs beyond graphics, in the consumer market), while CPUs are catching up in throughput computing? What's the technological breakthrough that makes a multi-core wide-vector GPU better than a multi-core wide-vector CPU?

If HSA is the long missing "standard" for heterogeneous computing, then why isn't NVIDIA in on it? I'm sure they would love to win back some low-end GPU market share from Intel. Unless they think the whole thing doesn't stand much of a chance, and they believe they shouldn't sacrifice any graphics peformance for GPGPU?

Homeles · Jun 19, 2012

CPUarchitect said:
Denial of what exactly? The HSA Foundation web site is practically empty. There's some marketing talk, but nothing of real substance.

So why would AMD succeed where NVIDIA failed (i.e. creating added value for GPUs beyond graphics, in the consumer market), while CPUs are catching up in throughput computing? What's the technological breakthrough that makes a multi-core wide-vector GPU better than a multi-core wide-vector CPU?

If HSA is the long missing "standard" for heterogeneous computing, then why isn't NVIDIA in on it? I'm sure they would love to win back some low-end GPU market share from Intel. Unless they think the whole thing doesn't stand much of a chance, and they believe they shouldn't sacrifice any graphics peformance for GPGPU?

Why would Nvidia want to have anything to do with AMD? Is it not a bit early to be saying, "well there's no information about it, so it's going to be a flop!" when HSA isn't even close to being implemented?

Besides, HSA is an open standard. Nvidia and open standards are two nouns that should not be in the same sentence.

piesquared · Jun 19, 2012

CPUarchitect said:
Denial of what exactly? The HSA Foundation web site is practically empty. There's some marketing talk, but nothing of real substance.

Yeah. This is the kind of meaningless point I would expect the detractors to zero in on. Where's the AVX2 hardware? And when it does finally trickle out onto the market, how does it benefit non x86 IHV's? HSA benefits the entire industry, not a select few.

So why would AMD succeed where NVIDIA failed (i.e. creating added value for GPUs beyond graphics, in the consumer market), while CPUs are catching up in throughput computing? What's the technological breakthrough that makes a multi-core wide-vector GPU better than a multi-core wide-vector CPU?

User base volume. Vendor agnostic. ISA agnostic. Power efficiency. Performance. For starters.

If HSA is the long missing "standard" for heterogeneous computing, then why isn't NVIDIA in on it? I'm sure they would love to win back some low-end GPU market share from Intel. Unless they think the whole thing doesn't stand much of a chance, and they believe they shouldn't sacrifice any graphics peformance for GPGPU?

NV is clinging to CUDA and other proprietary standars as you know. They aren't interested in open standards at this point.

CPUarchitect · Jun 20, 2012

Homeles said:
Why would Nvidia want to have anything to do with AMD?

What would HSA have to do with AMD? Isn't it supposed to be an open standard, with all adopters being treated equally?

Besides, HSA is an open standard.

Nvidia and open standards are two nouns that should not be in the same sentence.

And yet NVIDIA supports OpenCL.

Clearly HSA isn't as "open" as AMD would like you to believe. Furthermore, it actually requires considerable hardware compromises to support HSA efficiently. Basically they would be asking NVIDIA to sacrifice graphics performance to give AMD's heterogeneous computing technology a remote chance of succeeding.

It's really AMD that should be asking "why would NVIDIA want to have anything to do with AMD?" Unless they can convince them, how can there ever be a widely adopted standard?

Homeles · Jun 20, 2012

I'm really disappointed that you've resorted to attacking a straw man.

CPUarchitect said:
What would HSA have to do with AMD? Isn't it supposed to be an open standard, with all adopters being treated equally?

I'm not sure if you're aware of this, but AMD and Nvidia have kind of been rivals since AMD's acquisition of ATI in 2006. Nvidia's not going to simply jump on board -- they've got a certain ace up their sleeve that they've been working on. Why would they dump Project Denver for HSA?

And yet NVIDIA supports OpenCL.

Obviously they support it because it would be moronic not to. They'd lose millions. If they had their way though, everything would use CUDA and OpenCL wouldn't exist.

Clearly HSA isn't as "open" as AMD would like you to believe.

You have nothing to back up that notion. What can be asserted without evidence can be dismissed without evidence. Mind you, there's strong evidence supporting the contrary -- ARM, TI and others are onboard, which would suggest that it's not AMD exclusive.

Furthermore, it actually requires considerable hardware compromises to support HSA efficiently. Basically they would be asking NVIDIA to sacrifice graphics performance to give AMD's heterogeneous computing technology a remote chance of succeeding.

Here we go... now that you've ended the fallacy, we can have a decent conversation.

Nvidia wouldn't necessarily be giving up graphics performance. They're already headed down the heterogeneous computing road, however every road that Nvidia has ever taken has been clearly staked out as their road. JHH is extremely prideful of his company -- he's not going to travel down the street holding hands with Rory Reed. If Nvidia's going to team up with AMD, it'll be Nvidia in charge of things.

It's really AMD that should be asking "why would NVIDIA want to have anything to do with AMD?" Unless they can convince them, how can there ever be a widely adopted standard?

JHH has always been stubborn. Who knows what it would take to get them to work together? At the very least, AMD is not the shining star of the industry. If they're going to prove their worth to Nvidia, they're going to have to show they're capable and not simply say they are.

CPUarchitect · Jun 20, 2012

piesquared said:
Yeah. This is the kind of meaningless point I would expect the detractors to zero in on. Where's the AVX2 hardware?

I wasn't talking about hardware, I was talking about specifications! We already know AVX2 and TSX will be successful well before Haswell arrives, because Intel has released the specification a year ago, and it includes an incredible set of powerful features. So all I'm asking is how will AMD trump each of those features? No need to name specific hardware parts or when they become available. Just please sum up the technology they're going to use, because the HSA website is empty.

Let's not forget that the battle of homogeneous versus heterogeneous computing will last for several years (unless AMD gives up sooner), so it's insignificant that we'll have to wait one more year for Haswell. Also note that the HSA hardware won't be ready before 2014.

"HSA isn't even close to being implemented" - Homeles

So that's not what's bothering me. They simply haven't specified how they'll achieve their claims. I'm sure it will be an improvement over their own previous technology (although Bulldozer wasn't very comforting), but how will it exceed the competition's technological leaps?

And when it does finally trickle out onto the market, how does it benefit non x86 IHV's? HSA benefits the entire industry, not a select few.

Nothing prevents other CPU manufacturer's from creating an instruction set extension equivalent to AVX2. For instance ARM's NEON already has FMA, so the main missing feature is gather support. And since ILP and TLP are becoming exhausted the only parallelism left to exploit is DLP. So they should already be looking into widening the vectors as well.

Clearly HSA doesn't benefit the entire industry. NVIDIA and Intel are not adopting it.

User base volume. Vendor agnostic. ISA agnostic. Power efficiency. Performance. For starters.

AVX2 will have a much bigger install base due to Intel's market dominance. AMD will eventually have to support it too (note that they already have AVX plus FMA), so the install base will grow even faster and it will be largely vendor agnostic. And as noted above other CPU manufacturers are very likely working on their own homogeneous throughput computing technology. It's also no less ISA agnostic than HSA, since the compatibility between adopters happens at the intermediate representation or high-level language level. Homogeneous computing even supports far more languages. The power efficiency of Intel's CPUs is also incredibly good, and will likely improve further with AVX-1024. And performance-wise it looks like the computing density of the CPU is getting really close to that of the GPU, while it has out-of-order execution and high cache hit rates to ensure high effective performance, without suffering from heterogeneous communication overhead.

AtenRa · Jun 20, 2012

CPUarchitect said:
If HSA is the long missing "standard" for heterogeneous computing, then why isn't NVIDIA in on it?

They already are on it (not in x86) with ARM Microprocessors + NVIDIA iGPUs, NVIDIA's Project Echelon.

Implementation of such a design can be put to Cell Phones, pads/Laptops and all the way up to servers.

CPUarchitect · Jun 20, 2012

Homeles said:
I'm really disappointed that you've resorted to attacking a straw man.

What straw man? You're accusing me of an argumentation fallacy without specifying it.

I'm not sure if you're aware of this, but AMD and Nvidia have kind of been rivals since AMD's acquisition of ATI in 2006. Nvidia's not going to simply jump on board -- they've got a certain ace up their sleeve that they've been working on. Why would they dump Project Denver for HSA?

Again, that's a question for AMD to answer, not me. I fully agree that NVIDIA won't just adopt HSA. But that means the HSA install base will remain small, which is a huge problem for AMD.

Obviously they support it because it would be moronic not to. They'd lose millions. If they had their way though, everything would use CUDA and OpenCL wouldn't exist.

Sure, but shouldn't it be AMD's goal to make sure that NVIDIA would be "moronic" not to adopt HSA? Isn't that what the "open standard" is supposed to achieve? Clearly they're not succeeding.

You have nothing to back up that notion.

Sure I do. If it was open instead of really being controlled by AMD, NVIDIA wouldn't have had any reservations for adopting HSA.

That, or NVIDIA simply doesn't fully believe in GPGPU for the consumer market any more. Your choice. Either way HSA loses.

Mind you, there's strong evidence supporting the contrary -- ARM, TI and others are onboard, which would suggest that it's not AMD exclusive.

Remind me why ARM or TI would have a problem with AMD sharing heterogeneous computing specifications? It's not like they were working on their own software specification, or they even compete in the same market space.

Also note that 3DNow! was adopted by both Cyrix and IDT, before it got obliterated by SSE. So I'm not impressed by this HSA Foundation. We'll have to see some long-term real commitment from the other members before it can be considered not AMD exclusive. Else they could just be in it for the "free beer", i.e. any incentive other than being convinced of the technology's superiority.

JHH is extremely prideful of his company -- he's not going to travel down the street holding hands with Rory Reed. If Nvidia's going to team up with AMD, it'll be Nvidia in charge of things.

Again you're only confirming that it's going to be really hard to ever reach consensus on heterogeneous computing technology. HSA will only cause further fragmentation. Meanwhile Intel doesn't need any API or language standard to make homogeneous throughput computing successful. Heck, AVX2 is already supported by major compilers and will soon auto-vectorize loops in legacy code.

CPUarchitect · Jun 20, 2012

AtenRa said:
They already are on it (not in x86) with ARM Microprocessors + NVIDIA iGPUs, NVIDIA's Project Echelon.

Implementation of such a design can be put to Cell Phones, pads/Laptops and all the way up to servers.

Not even close. Echelon is a research project for "extreme-scale" computing. A single node is specified to have 20 TFLOPS of computing power, and 256 GB of RAM. That's not going to be put into cell phones any time soon. It's clearly aimed only at the supercomputer market. I was talking about the consumer market, and NVIDIA has quite clearly taken a step back there to focus on graphics performance.

piesquared · Jun 20, 2012

CPUarchitect said:
I wasn't talking about hardware, I was talking about specifications! We already know AVX2 and TSX will be successful well before Haswell arrives, because Intel has released the specification a year ago, and it includes an incredible set of powerful features. So all I'm asking is how will AMD trump each of those features? No need to name specific hardware parts or when they become available. Just please sum up the technology they're going to use, because the HSA website is empty.

Specification have been finalized and handed to participating parties. Clearly you/intel are not one of those participating parties so i'm not sure why you would be so concerned. Basing your conclusions off a website seems rather frivolous.

Refer to Phil Hughes keynote if you have any real interest: https://vts.inxpo.com/scripts/Server.nxp

Let's not forget that the battle of homogeneous versus heterogeneous computing will last for several years (unless AMD gives up sooner), so it's insignificant that we'll have to wait one more year for Haswell. Also note that the HSA hardware won't be ready before 2014.

I see. It's insignificant that haswell won't be ready for another year, but not insignificant that full HSA hardware won't be ready for a year after that. Kaveri will be here next year with HSA compliant features, so no need to wait.

So that's not what's bothering me. They simply haven't specified how they'll achieve their claims. I'm sure it will be an improvement over their own previous technology (although Bulldozer wasn't very comforting), but how will it exceed the competition's technological leaps?

Haven't specified to who? intel?

Nothing prevents other CPU manufacturer's from creating an instruction set extension equivalent to AVX2. For instance ARM's NEON already has FMA, so the main missing feature is gather support. And since ILP and TLP are becoming exhausted the only parallelism left to exploit is DLP. So they should already be looking into widening the vectors as well.

Nothing except intel's market dominance. I can see that working out well.

HSA is open to the entire industry, and intel won't be the overlord they're accustomed to being. No wonder they have no interest in joining.

Clearly HSA doesn't benefit the entire industry. NVIDIA and Intel are not adopting it.

It's open to anyone who wants to adopt it. ARM for example, is. And they are the heavyweights in mobile and handheld. So it is by far more meaningful that ARM is aboard, than NV and intel not.

AVX2 will have a much bigger install base due to Intel's market dominance. AMD will eventually have to support it too (note that they already have AVX plus FMA), so the install base will grow even faster and it will be largely vendor agnostic.

Ahh, the good old x86 is your daddy paradigm. So what does 'largely' mean in that context? ARM doesn't have a large enough user base to be relevant? HSA supports both..
And, yes AMD will support AVX2 and FMA3 (FMA4 already supported, another example of how beneficial an intel monopoly is for the industry), and will be part of the HSA platform.

And as noted above other CPU manufacturers are very likely working on their own homogeneous throughput computing technology.

Or very likely not. Which CPU manufacturer for example might be working on their own homogeneous throughput technology?

It's also no less ISA agnostic than HSA, since the compatibility between adopters happens at the intermediate representation or high-level language level. Homogeneous computing even supports far more languages.

AVX2 is bound to x86, heterogeneous computing through HSA and HSAIL isn't.

The power efficiency of Intel's CPUs is also incredibly good, and will likely improve further with AVX-1024. And performance-wise it looks like the computing density of the CPU is getting really close to that of the GPU, while it has out-of-order execution and high cache hit rates to ensure high effective performance, without suffering from heterogeneous communication overhead.

Actually the computing density of the GPU is far outpacing that of the CPU, what are you talking about?

It's probably a good time to bring up Qualcomm where Eric Demers now works. Rumor has it they are set to release something special.

AtenRa · Jun 20, 2012

CPUarchitect said:
Not even close. Echelon is a research project for "extreme-scale" computing. A single node is specified to have 20 TFLOPS of computing power, and 256 GB of RAM. That's not going to be put into cell phones any time soon. It's clearly aimed only at the supercomputer market. I was talking about the consumer market, and NVIDIA has quite clearly taken a step back there to focus on graphics performance.

The bi-products or derivatives of this technology can be put to consumer devices such as Cell phones, Pads/Laptops (Win 8 or later on ARM) and all the way up to Super Computers and Servers.

It is of no coincidence that Jensen have announced Project Denver in CES (Consumer Electronics show)

Also, next year (maybe Q1 2013) Tegra 4 will be introduced with quad Cortex-A15 ARM cores and perhaps Kepler based iGPU.

Just because GK104 is not focused on GPGPU doesn't mean that NVIDIA is not pursuing more GPGPU oriented devices for the general consumer market.

bronxzv · Jun 20, 2012

piesquared said:
And, yes AMD will support AVX2 and FMA3

note that industry standard FMA (aka "FMA3") is already supported in Trinity

piesquared said:
Or very likely not. Which CPU manufacturer for example might be working on their own homogeneous throughput technology?

as mentioned already in this thread: clearly ARM with the NEON series of vector ISAs, another example will be the chinese ICube chips
http://www.extremetech.com/computin...on-the-upu-a-next-generation-cpu-architecture

piesquared said:
AVX2 is bound to x86, heterogeneous computing through HSA and HSAIL isn't.

HSA so far is more or less bound to Open CL, have a look here:

http://forums.anandtech.com/showpost.php?p=33572843&postcount=14

to see how some real users react

also note that with Open CL you can't count on all the hardware profiles in the user base to support basic things like doubles (64-bit FP) which are a given in all CPU ISAs since ages, talk me about a "compute language" again, even if your code use only 5% double / 95% float it's a major source of potential problems

Dresdenboy · Jun 20, 2012

bronxzv said:
HSA so far is more or less bound to Open CL, have a look here:

http://forums.anandtech.com/showpost.php?p=33572843&postcount=14

to see how some real users react

also note that with Open CL you can't count on all the hardware profiles in the user base to support basic things like doubles (64-bit FP) which are a given in all CPU ISAs since ages, talk me about a "compute language" again, even if your code use only 5% double / 95% float it's a major source of potential problems

The support of DP is kind of trickling down into lower end architectures. The more there is a need for it, the more there will also be support.

I see OpenCL as kind of lower level language put between really low level (shader language, CTM, x86 assembler etc.) and high level (C++11, Java etc.). AMP or Cilk bring such programming paradigms closer to the high level languages. So far it looks like the programmer should get away from direct OpenCL coding with HSA while being able to use the tools and libs. This would move OpenCL to the back end. At least this is the impression I've got.

Some articles and an interview might give a better overview than just the slides:
http://css.dzone.com/articles/heterogeneous-computing
http://css.dzone.com/articles/how-amds-heterogeneous-systems
http://www.anandtech.com/show/5847/...neous-and-gpu-compute-with-amds-manju-hegde/1

bronxzv · Jun 20, 2012

Dresdenboy said:
This would move OpenCL to the back end. At least this is the impression I've got.

me too, in other words it's pointless to port big legacy projects to Open CL, still the "L" in OpenCL stands for "language", if Open CL futures is no more than a VM/IL I'm not sure what it will have more to offer than LLVM for ex.

Dresdenboy said:
Some articles and an interview might give a better overview than just the slides:
http://css.dzone.com/articles/heterogeneous-computing
http://css.dzone.com/articles/how-amds-heterogeneous-systems
http://www.anandtech.com/show/5847/...neous-and-gpu-compute-with-amds-manju-hegde/1

thanks for the links, I'll have a look

CPUarchitect · Jun 20, 2012

piesquared said:
Specification have been finalized and handed to participating parties. Clearly you/intel are not one of those participating parties so i'm not sure why you would be so concerned.

You don't see why I'm concerned? Let's try this one more time. Where's the breakthrough in technology that can keep heterogeneous computing superior to homogeneous throughput computing?

It's swell that the specification has been finalized, but why the secrecy? If HSA is supposed to be an open standard and AMD wants to convince the majority of the industry to adopt it (including developers), they'd better reveal the magic ingredient.

Refer to Phil Hughes keynote if you have any real interest: https://vts.inxpo.com/scripts/Server.nxp

That's giving me a "PAGE ERROR c1-5".

I see. It's insignificant that haswell won't be ready for another year, but not insignificant that full HSA hardware won't be ready for a year after that. Kaveri will be here next year with HSA compliant features, so no need to wait.

I never said one was more significant than the other. I was just pointing out the facts. It's you who were arguing about timelines. And no matter how you spin it, you lost that argument. Even Kaveri's half-baked HSA implementation doesn't beat Haswell's availability. And I'm not even talking about how volume will affect the install base, or how AMD's eventual adoption of AVX2 affects it.

Nothing except intel's market dominance. I can see that working out well.

Would you care to elaborate? How is Intel's market dominance going to prevent others from implementing homogeneous throughput computing technology? There's no doubt AVX2 is coming to Atom, possibly as soon as Airmont. So why wouldn't ARM want equivalent technology to keep up?

HSA is open to the entire industry...

Except that "open" apparently means you have to join AMD's club before you get to see how they're planning on competing with homogeneous throughput computing. I don't see how this is any more open than others just licensing the technology and AMD giving them a discount in an attempt to create a bigger install base. That's not the same thing as an open standard.

It's open to anyone who wants to adopt it. ARM for example, is. And they are the heavyweights in mobile and handheld. So it is by far more meaningful that ARM is aboard, than NV and intel not.

Yes ARM is a heavyweight in mobile and handheld, but no it's not more meaningful. The mobile market is several years behind on the desktop market, both in terms of hardware performance and software complexity. So if HSA can't prove itself in the desktop market, it's extremely doubtful that it will gain a foothold in the mobile market. So actually you should turn that around: if homogeneous throughput computing prevails in the desktop market, that's a very big motivation for the mobile chip manufacturers to come up with an equivalent.

And, yes AMD will support AVX2 and FMA3 (FMA4 already supported, another example of how beneficial an intel monopoly is for the industry), and will be part of the HSA platform.

Which makes it an argument in favor of homogeneous throughput computing. Intel doesn't care how AVX2 will be adopted. Developers can write assembly, C++, OpenCL, HSA, etc. it's all a win. AMD madly depends on HSA to create added value for its hardware over that of the competition.

And I'm not sure what you're trying to get at with FMA3 versus FMA4. Intel recognized that AMD's FMA3 specification in SSE5 was superior to AVX's initial FMA4 specification. So Intel switched to FMA3. At the same time AMD apparently lost faith in its own specification and dropped FMA3. So what are you blaming Intel of? Going with the best technology and not warning AMD that they're making a mistake?

Or very likely not. Which CPU manufacturer for example might be working on their own homogeneous throughput technology?

Wrong question. Who wouldn't? Think about the ROI.

AVX2 is bound to x86, heterogeneous computing through HSA and HSAIL isn't.

AVX2 is bound to x86, but homogeneous computing is not. Heterogeneous computing through HSA on the other hand is bound to HSA adopters.

Actually the computing density of the GPU is far outpacing that of the CPU, what are you talking about?

NetBurst could only 4 floating-point operations per cycle, per core. Haswell will be able to do 32. Even though the transistor count increased, it's still a huge increase in computing density. Meanwhile the last few AMD architectures have lost computing density. Implementing the remaining HSA features will cost even more computing density.

It's probably a good time to bring up Qualcomm where Eric Demers now works. Rumor has it they are set to release something special.

Perhaps he left AMD to work on homogeneous throughput computing at Qualcomm?

CPUarchitect · Jun 20, 2012

AtenRa said:
The bi-products or derivatives of this technology can be put to consumer devices such as Cell phones, Pads/Laptops (Win 8 or later on ARM) and all the way up to Super Computers and Servers.

No. You can't apply supercomputing technology to a cell phone. The issue that supercomputers are facing is that the computing power per node is increasing faster than they can exchange data between them. So Echelon is all about low-latency high-bandwidth technology (caches and interconnections) at each step of the massive hierarchy. Cell phones will never have to deal with the same kind of physical distances.

Just because GK104 is not focused on GPGPU doesn't mean that NVIDIA is not pursuing more GPGPU oriented devices for the general consumer market.

That doesn't make any sense. How can they leave out their flagship consumer product?

Olikan · Jun 20, 2012

CPUarchitect said:
There's no doubt AVX2 is coming to Atom

what?

AtenRa · Jun 20, 2012

CPUarchitect said:
That doesn't make any sense. How can they leave out their flagship consumer product?

GK104 is not their Flagship Kepler chip, GK100/110 is.

CPUarchitect · Jun 20, 2012

Dresdenboy said:
The support of DP is kind of trickling down into lower end architectures. The more there is a need for it, the more there will also be support.

Absolutely, but the problem is that improving double-precision support inherently sacrifices graphics performance for a given die size or power consumption. So it's no wonder that all APUs to date have no double-precision support. Kaveri isn't likely to change that if they want to hit the claimed (single-precision) performance numbers and keep it affordable and reasonably power efficient. So even if double-precision does get adopted in the low-end markets after that, between that and the other HSA features there will be highly fragmented support. That's not going to help the adoption.

Some articles and an interview might give a better overview than just the slides:
http://css.dzone.com/articles/heterogeneous-computing
http://css.dzone.com/articles/how-amds-heterogeneous-systems
http://www.anandtech.com/show/5847/...neous-and-gpu-compute-with-amds-manju-hegde/1

Thanks, that second link provides a nice overview of the specific HSA "improvements":

C++ support for GPU computing
This is not correct. HSA will only support C++ AMP, which adds restrictions for GPU computing. Meanwhile homogeneous throughput computing retains the full C++ feature set, or any other language for that matter.
All system memory accessible by both CPU and GPU
This is also the case for homogeneous throughput computing technology, plus you get consistent latencies.
Unified address space (hence no separate CPU/GPU memory pointers)
This is also the case for homogeneous throughput computing technology, and you can have function pointers too.
GPU uses pageable system memory (hence accesses data directly in CPU domain)
This is also the case for homogeneous throughput computing technology, and it doesn't require any duplicate hardware.
GPU and CPU can reference caches of both
Homogeneous throughput computing simply avoids any unnecessary data movement between caches.
GPU tasks are context-switchable (esp. important to avoid touch interface lag -- contexts switch rapidly in heterogeneous environments)
Homogeneous throughput computing can switch much faster due to having much less context.

Dresdenboy · Jun 20, 2012

@AtenRa:
GK110 is HPC first, consumer graphics later.

piesquared · Jun 20, 2012

CPUarchitect said:
You don't see why I'm concerned? Let's try this one more time. Where's the breakthrough in technology that can keep heterogeneous computing superior to homogeneous throughput computing?

It's swell that the specification has been finalized, but why the secrecy? If HSA is supposed to be an open standard and AMD wants to convince the majority of the industry to adopt it (including developers), they'd better reveal the magic ingredient.

Exactly right, I do not see why YOU'RE concerned about heterogeneous computing. If intel wants the specs it can join HSA. Pretty simple. Of course, they would never do that unless for nefarious reasons like joining the OLPC board just to sabatoge the initiative..

That's giving me a "PAGE ERROR c1-5".

http://www.inxpo.com/events/amd/afds-d

I never said one was more significant than the other. I was just pointing out the facts. It's you who were arguing about timelines. And no matter how you spin it, you lost that argument. Even Kaveri's half-baked HSA implementation doesn't beat Haswell's availability. And I'm not even talking about how volume will affect the install base, or how AMD's eventual adoption of AVX2 affects it.

That is exactly what you were implying. That it doesn't matter what happens in the rest of the industry, because haswell will be out sometime in 2013.

Would you care to elaborate? How is Intel's market dominance going to prevent others from implementing homogeneous throughput computing technology? There's no doubt AVX2 is coming to Atom, possibly as soon as Airmont. So why wouldn't ARM want equivalent technology to keep up?

You must be joking. How many times have we seen intel block competition from bringing anything to market that might threaten their monopoly?

Except that "open" apparently means you have to join AMD's club before you get to see how they're planning on competing with homogeneous throughput computing. I don't see how this is any more open than others just licensing the technology and AMD giving them a discount in an attempt to create a bigger install base. That's not the same thing as an open standard.

I think that is the crux of the matter. Specs are finalized and handed to those that have an interest in advancing heterogenous computing not sabatoging it, as noted above.

Yes ARM is a heavyweight in mobile and handheld, but no it's not more meaningful. The mobile market is several years behind on the desktop market, both in terms of hardware performance and software complexity. So if HSA can't prove itself in the desktop market, it's extremely doubtful that it will gain a foothold in the mobile market. So actually you should turn that around: if homogeneous throughput computing prevails in the desktop market, that's a very big motivation for the mobile chip manufacturers to come up with an equivalent.

Wrong answer, predictable spin. The mobile market may be behind in performance, but heterogeneous computing changes that. It's recognize by all but those that have much to lose, that heterogeneous computing is the future and homogeneous computing is the past.

As to software complexity, I find that a bizare statement to make. So one of the advantages of desktop computing is complex software? How is that and advantage?? HSA and Bolt REDUCES software complexity.

Which makes it an argument in favor of homogeneous throughput computing. Intel doesn't care how AVX2 will be adopted. Developers can write assembly, C++, OpenCL, HSA, etc. it's all a win. AMD madly depends on HSA to create added value for its hardware over that of the competition.

How can developers write HSA, HSA is a platform that accomodates AMP, OpenCL, AVX2, ARM, etc.

And I'm not sure what you're trying to get at with FMA3 versus FMA4. Intel recognized that AMD's FMA3 specification in SSE5 was superior to AVX's initial FMA4 specification. So Intel switched to FMA3. At the same time AMD apparently lost faith in its own specification and dropped FMA3. So what are you blaming Intel of? Going with the best technology and not warning AMD that they're making a mistake?

Why the FMA difference? This was not something we did lightly. In December of 2008, Intel made significant changes to the FMA definition, which we found we could not accommodate without unacceptable risk to our product schedules. Yet we did not want to deprive customers of the significant performance benefits of FMA. So we decided to stick with the earlier definition, renaming it FMA4 (for four-operand FMA Intels newer definition uses what we believe to be a less capable three-operand, destructive-destination format). It will have a different CPUID feature flag from Intels FMA extension. At some future point, we will likely adopt Intels newer FMA definition as well, coexisting with FMA4. But as you might imagine, we may wait until were sure the specification is stable.

http://blogs.amd.com/developer/2009/05/06/striking-a-balance

Click to expand...

SSE5 supported 4 operand FMA. intel changed specifications of AVX to exclude the superior FMA4 because they're hardware never supported it.

Perhaps he left AMD to work on homogeneous throughput computing at Qualcomm?

Thinks so? Maybe it has something to do with GPGPU...

CPUarchitect · Jun 20, 2012

AtenRa said:
GK104 is not their Flagship Kepler chip, GK100/110 is.

I didn't say flagship Kepler chip, I said flagship consumer product. The point is that it makes no sense whatsoever to say that NVIDIA pursues "more GPGPU oriented devices" when leaving out the GTX 680 and GTX 690. And there is no indication that the lower models will have any better GPGPU efficiency.

Abwx · Jun 20, 2012

CPUarchitect said:
Also note that 3DNow! was adopted by both Cyrix and IDT, before it got obliterated by SSE.

Thanks to intel s market dominance , otherwise 3Dnow was ahead
of SSE on a technical basis.

One advantage of 3DNow! is that it is possible to add or multiply the two numbers that are stored in the same register. With SSE, each number can only be combined with a number in the same position in another register. This capability, known as horizontal in Intel terminology, was the major addition to the SSE3 instruction set.

A disadvantage with 3DNow! is that 3DNow instructions and MMX instructions share the same register-file, whereas SSE adds 8 new independent registers (XMM0 - XMM7.)
Because MMX/3DNow! registers are shared by the standard x87 FPU, 3DNow! instructions and x87 instructions cannot be executed simultaneously. However, because it is aliased to the x87 FPU, the 3DNow! & MMX register states can be saved and restored by the traditional x87 F(N)SAVE and F(N)RSTOR instructions. This arrangement allowed operating systems to support 3DNow! with no explicit modifications, whereas SSE registers required explicit operating system support to properly save and restore the new XMM registers (via the added FXSAVE and FXRSTOR instructions.)
The FX* instructions are an upgrade to the older x87 save and restore instructions because these could save not only SSE register states but also those x87 register states (hence which meant that it could save MMX and 3DNow! registers too).
On AMD Athlon XP and K8-based cores (i.e. Athlon 64), assembly programmers have noted that it is possible to combine 3DNow! and SSE instructions to reduce register pressure, but in practice it is difficult to improve performance due to the instructions executing on shared functional units.[9]

http://en.wikipedia.org/wiki/3DNow!

bronxzv · Jun 20, 2012

Abwx said:
Thanks to intel s market dominance , otherwise 3Dnow was ahead
of SSE on a technical basis.

ROTFL, like being only 64-bit (2 packed floats) vs 128-bit (4 packed floats) with SSE

AMD summit today; Kaveri cuts out the middle man in Trinity.

Elite Member

Senior member

Platinum Member

Golden Member

Senior member

Platinum Member

Senior member

Lifer

Senior member

Senior member

Golden Member

Lifer

Senior member

Golden Member

Senior member

Senior member

Senior member

Platinum Member

Lifer

Senior member

Golden Member

Golden Member

Senior member

Lifer

Senior member