AMD: 4XXX not great in OpenCL

Thread starter thilanliyan
Start date Dec 24, 2009

thilanliyan

Lifer

Dec 24, 2009

#1

http://www.xtremesystems.org/forums/showthread.php?t=241759

http://forum.donanimhaber.com/m_36606267/tm.htm

Soulkeeper

Diamond Member

Dec 24, 2009

#2

I didn't see any benchmarks in there.
Sounds like they are just going off a quote.

Voo

Golden Member

Dec 24, 2009

#3

Wasn't that the same site that had several "fermi cards" lying around?

Lonyo

Lifer

Dec 24, 2009

#4

Soulkeeper said:
I didn't see any benchmarks in there.
Sounds like they are just going off a quote.

A quote made by ATI.
Seems a safe bet that ATI aren't going to trash talk their own parts for fun.

VirtualLarry

No Lifer

Dec 24, 2009

#5

It's known that ATI's 48xx series lacks the type of local memory cache that NV cards have, which is (partially) why their F@H performance is less. From what I've read, it's actually faster on ATI cards to compute some things twice, rather than compute them once and store the value to be retrieved later. Hence ATI cards score half as much as NV cards in F@H.
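The "compute twice rather than store" point can be put as a toy cost model. This is only a sketch: the function name and all cycle counts are hypothetical, picked to show where the crossover sits, and are not measurements from any ATI or NV card.

```python
def recompute_is_cheaper(compute_cycles, store_cycles, load_cycles, uses):
    """Toy model: compare recomputing a value at every use against
    computing it once, storing it, and loading it back for later uses."""
    recompute_cost = compute_cycles * uses
    # One compute, one store, then one load for each use after the first.
    store_cost = compute_cycles + store_cycles + load_cycles * (uses - 1)
    return recompute_cost < store_cost


# Hypothetical cycle counts, chosen only to show the crossover.
slow_store = recompute_is_cheaper(compute_cycles=20, store_cycles=200,
                                  load_cycles=200, uses=2)
fast_store = recompute_is_cheaper(compute_cycles=20, store_cycles=4,
                                  load_cycles=4, uses=2)
print(slow_store, fast_store)  # prints "True False"
```

With slow storage the recompute path wins; with a fast on-chip scratchpad the store path wins. That is the shape of the argument, nothing more.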

GodisanAtheist

Diamond Member

Dec 24, 2009

#6

ATI was probably hurting for money so badly after the HD 2xxx & 3xxx series (and AMD's Phenom flop) that they just needed a competitive part out, without much R&D or future-proofing, that could make them a quick buck. Hence the HD 4xxx series was born.

Fox5

Diamond Member

Dec 24, 2009

#7

GodisanAtheist said:
ATI was probably hurting for money so badly after the HD 2xxx & 3xxx series (and AMD's Phenom flop) that they just needed a competitive part out, without much R&D or future-proofing, that could make them a quick buck. Hence the HD 4xxx series was born.

These parts are designed 2 to 4 years before they ever hit market, it's unlikely the 4xxx series could have reacted very much at all to the 3xxx and Phenom series, and since the 4xxx series isn't very different from the 3xxx series, it's unlikely to be a reaction to the 2xxx series as well.

Besides that, DirectX 10.1 is future proofing.
As for OpenCL, well, AMD had their own idea of what a compute language would look like, and Nvidia's won out. The 4xxx cards can still offer competitive performance to Nvidia's 8xxx/9xxx cards once double precision computations are used anyway. And the 5xxx series blows away Nvidia's cards in single and double precision.

But yes, the 4xxx series' OpenCL implementation is basically emulating local storage, so algorithms that rely on it will run like crap. Still, it's not like the 4xxx is worthless for OpenCL, and at this point it's much better to have the widest install base for OpenCL possible, rather than limiting it only to cards that will run it most optimally.
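To make the "emulating local storage" point concrete, here is a rough counting sketch for a 1D stencil (each work-item sums a window of `2 * radius + 1` inputs). It only counts global memory accesses under three assumptions: real on-chip local memory, no local memory at all, and local memory emulated in global memory. The function names and problem sizes are hypothetical, and the model ignores caches, coalescing, and latency hiding, so treat it as an illustration rather than a performance prediction.

```python
def staged_elements(n_items, group_size, radius):
    """Elements each work-group stages: its tile plus a halo on both sides."""
    groups = -(-n_items // group_size)  # ceiling division
    return groups * (group_size + 2 * radius)


def accesses_with_real_local(n_items, group_size, radius):
    # Global memory is read once per staged element; the window reads
    # then hit on-chip local memory and are not counted here.
    return staged_elements(n_items, group_size, radius)


def accesses_without_local(n_items, radius):
    # Every work-item reads its whole window straight from global memory.
    return n_items * (2 * radius + 1)


def accesses_with_emulated_local(n_items, group_size, radius):
    # Staging still reads global memory, the "local" writes land in global
    # memory too, and the window reads then come from that global scratch.
    staged = staged_elements(n_items, group_size, radius)
    return staged + staged + n_items * (2 * radius + 1)


# Hypothetical problem size.
n, group, r = 1024, 64, 4
print(accesses_with_real_local(n, group, r))      # 1152
print(accesses_without_local(n, r))               # 9216
print(accesses_with_emulated_local(n, group, r))  # 11520
```

Under these assumptions a kernel written around local memory does more global traffic on emulated local storage than a version that never stages data at all, which is one way to read why a direct port would disappoint.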

Nemesis 1

Lifer

Dec 24, 2009

#8

Fox5, could you expand on that statement right here?

As for OpenCL, well, AMD had their own idea of what a compute language would look like, and Nvidia's won out. The 4xxx cards can still offer competitive performance

You're not suggesting OpenCL is an Nvidia thing, are you? It's Apple, not NV. C is for CUDA, and the game is just starting. No winners yet. Just whiners.

Fox5

Diamond Member

Dec 25, 2009

#9

Nemesis 1 said:
Fox5, could you expand on that statement right here?

As for OpenCL, well, AMD had their own idea of what a compute language would look like, and Nvidia's won out. The 4xxx cards can still offer competitive performance

You're not suggesting OpenCL is an Nvidia thing, are you? It's Apple, not NV. C is for CUDA, and the game is just starting. No winners yet. Just whiners.

OpenCL isn't made by nvidia, but it's not very different in design from CUDA. Nvidia certainly had a large influence on it, even if just leading by example.

evolucion8

Platinum Member

Dec 25, 2009

#10

GodisanAtheist said:
ATI was probably hurting for money so badly after the HD 2xxx & 3xxx series (and AMD's Phenom flop) that they just needed a competitive part out, without much R&D or future-proofing, that could make them a quick buck. Hence the HD 4xxx series was born.

Totally wrong. The RV770 took 3 years to design; it isn't something pulled out of a sleeve.

Fox5 said:
These parts are designed 2 to 4 years before they ever hit market, it's unlikely the 4xxx series could have reacted very much at all to the 3xxx and Phenom series, and since the 4xxx series isn't very different from the 3xxx series, it's unlikely to be a reaction to the 2xxx series as well.

I agree with you.

VirtualLarry said:
It's known that ATI's 48xx series lacks the type of local memory cache that NV cards have, which is (partially) why their F@H performance is less. From what I've read, it's actually faster on ATI cards to compute some things twice, rather than compute them once and store the value to be retrieved later. Hence ATI cards score half as much as NV cards in F@H.

I also heard that the proteins used on ATi cards are much more complex than the ones used on nVidia hardware, which loves tons of simple threads instead of the very large instructions that the ATi architecture loves, hence the double calculation.

I also think that ATi's OpenCL implementation is in the beta stage. I'm using the latest beta drivers, dated this month, and I'm still not able to run any OpenCL benchmark. So I don't know how the speculation is being made if there aren't even benchmarks available, and in the NGOHQ DirectCompute benchmark, the HD 4890 scores higher than the HD 5770. But we know that, because of the global data share limitation in the HD 4800 series, it's quite hard to keep all its execution resources busy, especially with such a wide architecture as the RV7x0 series. I don't know much about hardware limitations, but I love to read and learn.

Soleron

Senior member

Dec 25, 2009

#11

http://www.techreport.com/discussions.x/18201

That is a better article on the situation. Has a second AMD quote.

Villmow later qualified that response by saying, "[the Radeon HD 4870] just has to be programmed differently than the 5XXX series to get performance because of the lack of proper hardware local support. It is possible to get good performance, just not with a direct port from Cuda [Nvidia's GPU compute architecture]." He also stressed that AMD's compiler stack will include more device-specific optimizations as it matures.

GodisanAtheist

Diamond Member

Dec 25, 2009

#12

Fox5 said:
These parts are designed 2 to 4 years before they ever hit market, it's unlikely the 4xxx series could have reacted very much at all to the 3xxx and Phenom series, and since the 4xxx series isn't very different from the 3xxx series, it's unlikely to be a reaction to the 2xxx series as well.

Besides that, DirectX 10.1 is future proofing.
As for OpenCL, well, AMD had their own idea of what a compute language would look like, and Nvidia's won out. The 4xxx cards can still offer competitive performance to Nvidia's 8xxx/9xxx cards once double precision computations are used anyway. And the 5xxx series blows away Nvidia's cards in single and double precision.

But yes, the 4xxx series' OpenCL implementation is basically emulating local storage, so algorithms that rely on it will run like crap. Still, it's not like the 4xxx is worthless for OpenCL, and at this point it's much better to have the widest install base for OpenCL possible, rather than limiting it only to cards that will run it most optimally.

-I don't follow:

The 3xxx series was itself a card released to make up for the 2xxx series shortfalls (namely heat and price). You're effectively saying ATI was planning on including a ring-bus on its 2xxx series parts, only to ditch it one generation later? I don't think ATI dumped that kind of R&D into the project for a one off.

Yeah these cards are PLANNED 2-3 years in advance, but that doesn't mean the situation on the ground cannot or does not radically affect future product launches.

Take the 4xxx series' original rumored specs of 480 SPs. That could very well have been ATI's original roadmap back before they released the 2xxx and saw it get clobbered. Something like SPs scales easily and very well; there is no reason ATI couldn't have done some shotgun cores for testing purposes and upped the SP count to be competitive with Nvidia.

evolucion8

Platinum Member

Dec 25, 2009

#13

GodisanAtheist said:
-I don't follow:

The 3xxx series was itself a card released to make up for the 2xxx series shortfalls (namely heat and price). You're effectively saying ATI was planning on including a ring-bus on its 2xxx series parts, only to ditch it one generation later? I don't think ATI dumped that kind of R&D into the project for a one off.

Well, you seem to forget that the ringbus was initially used on the Radeon X1800 series and its derivatives. It worked fine and was later improved with fully distributable reads and writes, giving the Radeon HD 2900XT/HD 3800 series 85% bandwidth efficiency. The main issue was that it consumed a lot of power; using a hub-based memory controller like the one in the HD 4800 series gave almost 100% bandwidth efficiency with much better power consumption and granularity.

Yeah these cards are PLANNED 2-3 years in advance, but that doesn't mean the situation on the ground cannot or does not radically affect future product launches.

You are correct, like the last-minute change made to the HD 4850 before launch, when it got its VRAM amount doubled, plus higher core and RAM speeds.

Take the 4xxx series' original rumored specs of 480 SPs. That could very well have been ATI's original roadmap back before they released the 2xxx and saw it get clobbered. Something like SPs scales easily and very well; there is no reason ATI couldn't have done some shotgun cores for testing purposes and upped the SP count to be competitive with Nvidia.

I heard that the HD 4800 series initially had the 480 SP design as its target, but the architecture was done so nicely, with die-space-saving techniques and such, that the chip couldn't use a 256-bit bus due to padding issues, so they stuffed the GPU with more SPs so they could use a wider bus width. It's quite impossible to create a SKU to test against a rival's cards, especially if the rival is still designing the GPU and it isn't even out yet.

Soulkeeper

Diamond Member

Dec 25, 2009

#14

Lonyo said:
A quote made by ATI.
Seems a safe bet that ATI aren't going to trash talk their own parts for fun.

I guess my opinion is wrong again ....
Thanks for replacing it with your opinion.

We wouldn't want me to say anything on a forum without being quoted.

GodisanAtheist

Diamond Member

Dec 26, 2009

#15

evolucion8 said:
Well, you seem to forget that the ringbus was initially used on the Radeon X1800 series and its derivatives. It worked fine and was later improved with fully distributable reads and writes, giving the Radeon HD 2900XT/HD 3800 series 85% bandwidth efficiency. The main issue was that it consumed a lot of power; using a hub-based memory controller like the one in the HD 4800 series gave almost 100% bandwidth efficiency with much better power consumption and granularity.

You are correct, like the last-minute change made to the HD 4850 before launch, when it got its VRAM amount doubled, plus higher core and RAM speeds.

I heard that the HD 4800 series initially had the 480 SP design as its target, but the architecture was done so nicely, with die-space-saving techniques and such, that the chip couldn't use a 256-bit bus due to padding issues, so they stuffed the GPU with more SPs so they could use a wider bus width. It's quite impossible to create a SKU to test against a rival's cards, especially if the rival is still designing the GPU and it isn't even out yet.

-Sorry, forgot where the ringbus was introduced and when it was discontinued. I retract my statement.
