[BitsAndChips]390X ready for launch - AMD ironing out drivers - Computex launch

JDG1980 · May 5, 2015

eRacer said:
Right now the 280X is not that far behind the GTX 780. If AMD did nothing to "full Tonga" other than switch to HBM and clock it like a 280X, it would have near GTX 780 performance at a reasonable TDP (although likely higher than 140W).

Why do you think that? The R9 M295X Mac Edition, a fully enabled Tonga, has an approximate TDP of 125W. That's currently shipping silicon. It runs at 850 MHz core and 1362 MHz memory clocks, which is lower than the desktop card but not by that much. The FirePro W7100 has the same cut-down Tonga as the R9 285, but a TDP of only 150W. When you consider the fact that the memory controller and GDDR5 takes up a substantial chunk of a graphic card's power consumption, a TDP of 140W for Tonga+HBM is not at all unrealistic. The mistake you're making is to assume that R9 285 is typical of Tonga's performance when in fact it is probably leaky trash silicon.

I hope they don't leave everything else as-is; I'd like to see HEVC decoding across the board. (I tested some 4K HEVC clips and was surprised at how CPU-hungry they are; my FX-8350 can't play them without dropping frames; all cores are pegged at or near 100%.)

JDG1980 · May 5, 2015

Silverforce11 said:
Nope. Because 280X, 7970Ghz & 285 (esp with Tonga's compression) architecture isn't bandwidth starved. Very few games even show scaling with vram OC.

So if its nothing but full Tonga with HBM, it would still be a turd.

Bandwidth may not be a bottleneck, but what about latency? Any tests indicating whether lower latency could make a difference?

And I don't know about you, but if AMD can make a card with performance greater than the R9 280X but only using 140W maximum and sporting hardware HEVC decoding, all for $229, then I'd find that a very appealing offering. It would beat the GTX 960 on raw performance and perf/$ while just about matching it on perf/watt.

maddie · May 5, 2015

JDG1980 said:
Bandwidth may not be a bottleneck, but what about latency? Any tests indicating whether lower latency could make a difference?

And I don't know about you, but if AMD can make a card with performance greater than the R9 280X but only using 140W maximum and sporting hardware HEVC decoding, all for $229, then I'd find that a very appealing offering. It would beat the GTX 960 on raw performance and perf/$ while just about matching it on perf/watt.

Actually it would beat the 960 on perf/watt also assuming equal to GTX780.
136% perf to 117% power

eRacer · May 6, 2015

Silverforce11 said:
eRacer said:

Right now the 280X is not that far behind the GTX 780. If AMD did nothing to "full Tonga" other than switch to HBM and clock it like a 280X, it would have near GTX 780 performance at a reasonable TDP (although likely higher than 140W).

Click to expand...

Nope. Because 280X, 7970Ghz & 285 (esp with Tonga's compression) architecture isn't bandwidth starved. Very few games even show scaling with vram OC.

So if its nothing but full Tonga with HBM, it would still be a turd.

The only way they are getting 780 class performance on a low end 370X part is with major uarch changes.

No major architectural changes needed, and the reason has nothing to do with bandwidth.

The Radeon 7970 GHz Edition is basically getting "GTX 780 class performance" right now according to this TechPowerUp review. Compare the performance of the Radeon 7970 GHz Edition and the GTX 780 at 2560x1440. The GTX 780 is less that 8.5% faster than the Radeon 7970 GHz Edition (129% / 119%). Even without HBM, full Tonga at the same clocks as the 7970 GHz Edition would be very close to the GTX 780 in overall performance in that review.

Silverforce11 · May 6, 2015

eRacer said:
Yep, and it has almost nothing to do with bandwidth.

The Radeon 7970 GHz Edition is basically getting "GTX 780 class performance" right now according to this TechPowerUp review. Compare the performance of the Radeon 7970 GHz Edition and the GTX 780 at 2560x1440. The GTX 780 is less that 8.5% faster than the Radeon 7970 GHz Edition (129% / 119%). Even without HBM, full Tonga at the same clocks at the 7970 GHz Edition would be very close to the GTX 780 in overall performance in that review.

I'm talking about the low-end 370X part, not the 380X class.

If that 370X is already 780 class performance, it means the uarch is significantly changed & improved.

It would stomp all over a Tonga rebadge 380X. Which is why its asinine to release a brand new uarch for the 370/X and 390/X but then make no change for a Tonga -> 380X.

This harks back to months ago when videocardz and other click bait sites were claiming AMD's entire lineup besides 390/X are re-badges (with GDDR5!) and full-Tonga is being rebadged into a 380X..

ps. Tonga with HBM is already a new uarch. It's again counter intuitive & wrong to say its a Tonga refresh or rebadge or re-whatever.

eRacer · May 6, 2015

Silverforce11 said:
I'm talking about the low-end 370X part, not the 380X class.

I wasn't talking about 380X either.

If that 370X is already 780 class performance, it means the uarch is significantly changed & improved.

And I already demonstrated full Tonga could reach GTX 780 class performance right now without HBM.

It would stomp all over a Tonga rebadge 380X. Which is why its asinine to release a brand new uarch for the 370/X and 390/X but then make no change for a Tonga -> 380X.

This harks back to months ago when videocardz and other click bait sites were claiming AMD's entire lineup besides 390/X are re-badges (with GDDR5!) and full-Tonga is being rebadged into a 380X..

I don't know why you keep bringing up 380X, because I specifically was comparing the 370X rumors to what we know about Tonga today.

ps. Tonga with HBM is already a new uarch. It's again counter intuitive & wrong to say its a Tonga refresh or rebadge or re-whatever.

I already stated that Tonga with a new memory controller wouldn't be considered a rebadge. I used the term "Tonga-like". If you don't like it, that's OK with me, but I'll still keep using it when/if I feel the need.

Silverforce11 · May 6, 2015

So if the 370X low end is 780 performance.. what relevance is there to make full Tonga + HBM to reach 780 performance?

Or what you're saying is that 370X = 2048 SP Tonga + HBM.

eRacer · May 6, 2015

Silverforce11 said:
So if the 370X low end is 780 performance.. what relevance is there to make full Tonga + HBM to reach 780 performance? Or what you're saying is that 370X = 2048 SP Tonga + HBM.

Going back to my original post, I said that 370/370X pricing of $179/$229 would be unsurprising because of where the Radeon 285 and others are priced today.

The Radeon 285 sells for as little as $210 on newegg, or $180 if you count the mail-in rebate. If we can buy a Tonga card now for as little $180 after rebate, it would not be shocking at all if rumored $179 Radeon 370 looked a lot like the cut down Tonga of Radeon 285 in terms of core count, die size, etc. It would also not be surprising to then see a Radeon 370X based on the same die as the Radeon 370 priced at $229. If Radeon 370X performed like a full Tonga, then it would be "GTX 780 class performance".

So if I had to guess I would say the 370X probably has 2048 SP like "full Tonga" and probably has performance/core similar to full Tonga, and would therefore have "GTX 780 class performance" just like a full Tonga would assuming it was clocked as high as the older 280X/7970 GHz Edition GPUs do. In order to reach 140W it would likely need HBM.

But that is merely a guess. Perhaps AMD was able to significantly increase clock speed in order to lower core count and reduce die size to lower costs. Or perhaps 370X is more densely populated and the 370X has even more cores but the roughly the same die size letting AMD lower the core clocks to improve energy efficiency. In those cases the architecture may not be Tonga-like at all, but the overall performance still would be.

3DVagabond · May 6, 2015

Astrallite said:
CLCs are pretty loud though compared to Nvidia's NVTTM cooler at least, especially at idle. Do you want a card that's as loud as the gaming profile on a typical stock cooler? A water pump can cause your entire case to resonate. If the 390X is only slightly faster than the Titan X alot of people are not going to bother. I mean the 295X2 has been as low as $500-range and people still don't want to touch it with 10 foot pole and Titan Xs are sold out still.

Titan X is not a particularly well cooled card.

It throttles after about 1 min.

Doesn't run really cool @ 83°C

RAM gets to over 100°C

The 295x2 is cooler and about the same noise while it's dissipating 2x the power (heat). An AIO on a ~250W card will be vastly superior in both cooling efficiency and noise to the Titan cooler.

ferrari100 · May 6, 2015

Rvenger said:
No, Let me be clear. Speculation and rumors are OK. Passing rumors off as facts is not OK, per the forum rules you need to cite the sources of your findings. If you have no source you can't claim it as a fact.

Also moderator callouts are not permitted here.

-Rvenger

Well that is awful.
lol

iiiankiii · May 6, 2015

Astrallite said:
CLCs are pretty loud though compared to Nvidia's NVTTM cooler at least, especially at idle. Do you want a card that's as loud as the gaming profile on a typical stock cooler? A water pump can cause your entire case to resonate. If the 390X is only slightly faster than the Titan X alot of people are not going to bother. I mean the 295X2 has been as low as $500-range and people still don't want to touch it with 10 foot pole and Titan Xs are sold out still.

While it's true CLC will be a bit louder at idle, the stock Tian X cooler is not more quiet than CLCs. No way. Also, let's not forget the r9 295x2 is a +500watt guzzler. That's going to require higher fan speeds to keep it cool. Let's not forget that the Titan X is at 83c and the r9 295x2 is at 73c during gaming. Heck, you can get the stock Titan X noise level on the r9 295x2 by adjusting the fan on the CLC to match the temperature of the Titan X.

I have no idea how you were able to clock your Titan X to 1389MHz with that stock cooler without increasing the fan speed. Yes, you can clock it that high, but it will throttle back down to to 1100ish Mhz within minutes. Unless you increase the fan speed on your Titan X, that speed is not sustainable. Effectively, you're better off leaving it at stock boost. Titan X owners are welcome to chime in. Can the Titan X sustain an overclock above 1200mhz on the stock fan speed? Based on numerous reviews, I haven't seen one able to.

If you had to increase the fan speed of the Titan X to get your overclocked speed to hold longer than a test run, you're whole argument falls apart. Once you crank that Titan X cooler past 50%, it gets loud. At that point, it's louder than the r9 295x2. Therefore, you should be okay with the noise level of the r9 295x2 even though you don't know it.

maddie · May 6, 2015

Check out this post and the linked pdf, table 2 on page 2.

This could explain some of the increased efficiency of R3xx series and why they would want HBM on all tiers. It's not just about power or bandwidth, it can also be computation.

http://forums.anandtech.com/showpost.php?p=37379459&postcount=967

exar333 · May 6, 2015

Kippa said:
Some people are saying that the 390X has to beat the Titan. I'm just curious, if the 390X underperforms against the Titan by say 10% or even slightly more, but is a few hundred dollars cheaper would you get it? I probably would. Don't get me wrong the Titan X is a great card but even with that performance the huge cost of the thing comes in to play.

There are a lot of areas to consider on the top end. Here are just a few:

-Performance
-Price
-Multi-GPU performance
-Overclock ability and scaling
-TDP
-Driver support

The Titan scores well on pure performance, GPU OC/scaling and TDP. AMD typically does better on price and multi-GPU lately...

I honestly think if the 390x is close to the Titan, and has similar OC capabilities, it could be a great win for AMD. On the other hand, even if it's within 10% of performance, but is much more poor on the OC front, it makes a much less desirable top-tier choice. Hopefully it can OC to the moon, as it certainly will not be memory-bandwidth starved with HBM.

maddie · May 6, 2015

Everyone missing this http://api.viglink.com/api/click?fo...ww.dongpingzhang.com/wordpre...SPC6-Zhang.pdf

A New Perspective on Processing-in-memory Architecture Design
Dong Ping Zhang, Nuwan Jayasena, Alexander Lyashevsky, Joseph Greathouse, Mitesh Meswani
Mark Nutter, Mike Ignatowski
Research Group, AMD, California, USA
dongping.zhang@amd.com

Here are some selected bits.
This is for some memory computations allowed to be done on the HBM stack.

Table 2: Normalized scaled vector add kernel performance on PIM
Baseline PIM config. 3 PIM config. 4
Normalized FLOPs used 1 0.27 0.27
Normalized memory bandwidth 1 2 4
Normalized execution time 1 0.51 0.26

Table 2 shows the performance of the weighted sum of vectors
kernel running on 4 GPU PIM processors normalized to the execution
on a high-performance GPU. Each PIM processor in config.
3 and 4 consist of one GPU with relative attributes normalized to
the host GPU configuration as shown in Table 2. To ensure a fair
comparison, we assume the baseline architecture also uses stacked
memory mounted on an interposer with the processor to achieve high
bandwidth.
[ this is in comparison to normal HBM not even GDDR5].
Other quotes:
Software Ecosystem
There are various issues to be considered around the interaction
between system software and PIM. One potential design point is to
assume that PIM devices are fully capable processors that can share
a single virtual address space with the host. This enables a path for
a range of multi-core OS capabilities to be adapted to in-memory
processors including virtual memory, preemptive multi-tasking, and
placement of tasks and memory objects in an environment supporting
non-uniform memory access. The application programming model
can potentially resemble existing models for heterogeneous multicore
architectures. These include standard CPU threading and tasking
models exposed via PIM-augmented APIs or additionally via extensions
to pragmas such as OpenMP. Evolutions of standards for
heterogeneous compute, such as OpenCL, may additionally enable
task- and data-parallel programming models. Under this scenario,
legacy applications can be adapted to PIM and execute correctly
with only minimal modifications.

4. Conclusion
As computation becomes increasingly limited by data movement
and energy consumption, exploiting locality throughout the memory
hierarchy becomes critical to maintaining the performance scaling
that many have come to expect from the computing industry. Emerging
3D die stacking technology provides an opportunity to exploit
in-memory computation in a practical manner. Our early evaluations
show promising results in the ability of this technique to address data
movement bottlenecks and overheads.
However, broad adoption of PIM can only occur if there are appropriate
programming models and languages along with adequate
compiler, runtime, and operating system support. Simultaneous advancement
in these aspects are yet to be addressed.

HSA is essential for this model to work.

Forget about the old bandwidth models. It seems that bandwidth is the cherry for HBM but memory processing using HSA and power reduction is the real value of HBM.

I would imagine a total driver overhaul with large parts being rewritten. How many here said drivers are trivial, etc and could not account for any delay.

MrTeal · May 6, 2015

iiiankiii said:
While it's true CLC will be a bit louder at idle, the stock Tian X cooler is not more quiet than CLCs. No way. Also, let's not forget the r9 295x2 is a +500watt guzzler. That's going to require higher fan speeds to keep it cool. Let's not forget that the Titan X is at 83c and the r9 295x2 is at 73c during gaming. Heck, you can get the stock Titan X noise level on the r9 295x2 by adjusting the fan on the CLC to match the temperature of the Titan X.

I have no idea how you were able to clock your Titan X to 1389MHz with that stock cooler without increasing the fan speed. Yes, you can clock it that high, but it will throttle back down to to 1100ish Mhz within minutes. Unless you increase the fan speed on your Titan X, that speed is not sustainable. Effectively, you're better off leaving it at stock boost. Titan X owners are welcome to chime in. Can the Titan X sustain an overclock above 1200mhz on the stock fan speed? Based on numerous reviews, I haven't seen one able to.

If you had to increase the fan speed of the Titan X to get your overclocked speed to hold longer than a test run, you're whole argument falls apart. Once you crank that Titan X cooler past 50%, it gets loud. At that point, it's louder than the r9 295x2. Therefore, you should be okay with the noise level of the r9 295x2 even though you don't know it.

My rig (2500k w/H80i and dual R9 290 w/Swiftech H220 in a Corsair Air 540) is comparatively loud at idle vs open air GPU and CPU coolers, but it's still not bad. The primary component of the noise is the pumps, so if you PWM control them based on temperature as well you can really lower idle noise.

At load, there's absolutely no comparison. The heat under my desk is more annoying than the small increase in sound level vs idle, whereas two 290s with blowers literally required noise canceling headphones to keep from missing quieter sounds in games.

2is · May 6, 2015

http://pclab.pl/art63572-7.html

This is something AMD really REALLY need to work on. Developer relations. This is a game that will unlock in a few hours time and there's no excuse for a 660Ti to outperform a 290x

Headfoot · May 6, 2015

pclab.pl is not a very reliable source

2is · May 6, 2015

Headfoot said:
pclab.pl is not a very reliable source

For AMD and AMD owners sake, I hope you're right. And for my own sake as well as I'd like to be able to play it in the PC in the other room with a 7970.

LTC8K6 · May 6, 2015

maddie, what's with the gigantic advertiser "viglink" url?

monstercameron · May 6, 2015

2is said:
http://pclab.pl/art63572-7.html

This is something AMD really REALLY need to work on. Developer relations. This is a game that will unlock in a few hours time and there's no excuse for a 660Ti to outperform a 290x

So Amd should pay someone to make a developers game work on their hardware?

2is · May 6, 2015

monstercameron said:
So Amd should pay someone to make a developers game work on their hardware?

AMD should do whatever they need to do so they aren't getting completely embarrassed when a new game is released, such as appears to be the case here.

Judging from your coy remark, I take it you think they should keep doing what they've been doing. So i'll follow up with a coy remark of my own...

So AMD should just keep the statuesque and continue to bleed market share and post losses quarter after quarter?

What you call "paying someone" is referred to as an investment. Work with developers so new games don't run like ass on your highest end hardware and scare potential buyers away from your product and into the arms of your competitor. It's hard to deny it's an effective business practice. We have two manufacturers, one does it, the other doesn't and the one that does is far more successful.

monstercameron · May 6, 2015

2is said:
AMD should do whatever they need to do so they aren't getting completely embarrassed when a new game is released, such as appears to be the case here.

Judging from your coy remark, I take it you think they should keep doing what they've been doing. So i'll follow up with a coy remark of my own...

So AMD should just keep the statuesque and continue to bleed market share and post losses quarter after quarter?

To each his own, I just think it is counter intuitive to blame amd for the developers shoddy programmers. I am glad dx12 will shake up the post launch support situation because there is no legitimate reason to ship a busted game save for niche uarchs like via s3 or powervr (on windows)

Azix · May 6, 2015

You also won't know which devs are going to mess up performance on your GPUs. AMD can't help everybody. Maybe they should try targeting those already on nvidias help, but I doubt they'd have easy access there.

Its false to assume amd is not working with Devs. They do, they just cannot work with everybody and neither can nvidia. The level of involvement in these big games also seems to fluctuate. Before AMD had tons of AAA games, now (granted the games have been slow to come) nvidia has maybe more.

tomb raider, dragon age, hardline, alien isolation etc etc. The very obvious problem is that AMD does not hurt nvidia when they help these games. So everybody cries foul when amd isn't part of a game and then they don't care when they are.

If you want amd to be like nvidia, thats our loss but maybe they would do better.

RussianSensation · May 6, 2015

2is said:
So pretty much confirmed AMD is sucking in this game and another example of "shoddy programming" that doesn't seem to affect the gpu manufacturer who works closesly with the developer.

This game was made specifically for NV again and is a GW title. What did you expect? Every single GW title is like this and has been since day 1. You can't deny the trends here that NV's GW's purpose is to sabotage AMD hardware/make NV hardware look better than it really is by virtue of inserting game code that AMD's driver can't optimize. You cannot optimize any code NV puts in there and neither can the developer without NV's permission.

Imagine if MS or Sony made a console but only 1st party developer had access to the most effective/lower level access code of XB1 and PS4 but everyone else couldn't? 3rd party developers would instantly be at an unfair disadvantage. Inserting black box game code that no one can alter and no one can optimize for except NV means NV is no longer just a hardware but a software developer because they literally ask the developer to insert NV's code that controls the shaders, lightning, physics effects, etc. I was always of the view there should be a separation of duties between software and hardware developers UNLESS the code was shared with everyone in the eco-system, so that everyone could benefit (that's the entire premise of x86 open source or Android). By means of financing/bribing developers, NV's GWs is alienating all customers who don't own NV hardware which leads to those customers being forced to get an NV card and that only gets us closer to a monopoly. It's startling how you aren't seeing this as one of the most ludicrous business practices that simply isn't allowed in almost any other industry sector of the economy I can think of. No car company in the world is allowed to pay off the world's leading car tire manufacturers or fuel manufacturers to produce tires and fuel that ONLY benefits them. That's monopolistic competition.

I would wait for AMD's drivers that will try to improve the other areas of the game they can optimize but no this game won't magically work faster on an R9 290X over a 970.

What's the most amazing part is so many sites refused to include Dirt Showdown into their testing during a period of time when NV couldn't get the driver to optimize for the game. That game used DirectCompute for various effects such as Global Illumination that specifically targeted GCN. As a result, naturally nV's performance bombed because initially they had no access to the code and couldn't optimize for the game. Major review sites removed this title from testing after seeing the results because it was clear the game was coded with a bias towards 1 AIB. However, both the AMD and the developer later shared the entire source code and NV was able to improve the game's performance without issues.

If NV truly thinks its products are superior to AMD's, why don't they share the source code with the world? They won't because AMD could quickly optimize the game and the entire premise behind GW is destroyed. If CF now requires a developer patch before it works in a GW's title, that means it's already proof enough that certain developers obstruct driver optimizations for one of the AIBs.

The irony and double standard is that the professional reviewers are not only NOT removing GW titles from their review testing but they keep putting more and more of them and blaming AMD's drivers on poor GW's performance(HardOCP as a prime example). It's like they forgot that 3 years ago they criticized Dirt Showdown for the very same thing and removed it. :whiste:

As NV gains more market share, they have even more control over review sites. If any site tries to be fair and go head-to-head with NV, give a negative review to a turd like the 960 or try to say something negative about NV's GWs publicly, NV will stop sending them review samples. With nearly 80% market share, NV can now do all of that since the review site is now at the mercy of NV or they'll have to buy all of the NV hardware on their own.

This also translates to AIBs. As NV gains more and more market share, AIBs have less negotiating power for margins, etc. An AIB can't just leave anymore and go with AMD because 80%+ of the discrete GPU market is NV. NV is nearly monopoly status now which is why we are starting to see this adversely making its way directly into game development. The more market share NV gains and the more earnings they get, the worse it's going to get really.

Many feared years ago that AMD GE or NV TWIMTPB could eventually turn into a game of who throws more money at developers to win and we are seeing this exactly in front of our eyes. That's not "investment in better business", this type of practice is not allowed in other industries -- it's called a bribe under the veil of marketing. TWIMTPB and AMD GE are not like this as all of the code was shared.

Since NV has more money, they can buy nearly every AAA title now and basically ensure AMD goes out of business. If none of this bothers you, that's fine as it's your choice but other people do understand what GWs is and why it's a cancer to PC gaming.

DiogoDX said:
Wow!!!

Even the ubisoft gamewoks titles din't had that difference between radeon and geforce. Congrats do SMS!

Notice how closely stacked R9 280/280X/290/290X are? Clearly the game has 0 optimization for AMD. You can't have 290 beating 280X by 3 fps but then 290X is 5 fps faster than the 290. That's not even remotely logical. :whiste:

PCLabs is pretty much a worthless site as they've been proven to skew GTA V results by favouring NV. That alone makes all their GPU testing irrelevant. PCGamesHardware has favoured NV for as long as I can remember too, pretty much almost every generation in the last 6 years.

How can both of those sites show 960 beating 290X but GameGPU has 290X nearly 50% faster than a 960?

http--www.gamegpu.ru-images-stories-Test_GPU-Simulator-Project_CARS_2015-test-pc_2560.jpg

Glo. · May 6, 2015

Im wondering if there will be any announcement today at AMD conference. It started right now, if anyone is interested:

http://edge.media-server.com/m/p/h9fua7ek?hootPostID=ae006ae49fb0af3e873aead5671d433a

[BitsAndChips]390X ready for launch - AMD ironing out drivers - Computex launch

Golden Member

Golden Member

Diamond Member

Member

Lifer

Member

Lifer

Member

Lifer

Banned

Senior member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Lifer

Diamond Member

Diamond Member

Diamond Member

Golden Member

Elite Member

Diamond Member