AMD's next GPU uarch is called "Polaris"

ShintaiDK · Jan 4, 2016

AtenRa said:
Yea i believe AMD used one driver older than what its available today. But i dont see anything new for SW Battlefront in the latest driver that could make things better for the GTX950.

The driver AMD used was released on December 1st 2015 and the latest one was released on 21th of December 2015.

The latest AMD driver is 15.12 isn't it? Not 16.10.

359.06 was used, latest is 361.43.

Techhog · Jan 4, 2016

ShintaiDK said:
The latest AMD driver is 15.12 isn't it? Not 16.10.

359.06 was used, latest is 361.43.

I would think that a Polaris GPU would require an unreleased driver, though. That's just common sense.

And yeah, 361.43 was just GameWorks VR stuff. It wouldn't have affected the test in any significant way.

AtenRa · Jan 4, 2016

ShintaiDK said:
The latest AMD driver is 15.12 isn't it? Not 16.10.

Techhog just answered that.

ShintaiDK said:
359.06 was used, latest is 361.43.

As I have said, 359,06 was released on December the 1st and 361,43 was released on December the 21st and has no SW Battlefront performance upgrades.

freeskier93 · Jan 4, 2016

Techhog said:
I would think that a Polaris GPU would require an unreleased driver, though. That's just common sense.

And yeah, 361.43 was just GameWorks VR stuff. It wouldn't have affected the test in any significant way.

15.12 is just the overarching software package indicating certain feature sets. Each card within the 15.12 package has it's own display driver, so this Polaris card driver could be placed under the 15.12 version to indicate the overarching features but have it's own specific display driver.

The actual driver version for my R9 280, for example, is 15.300.1025.1001.

Technically the latest package version, as found in Radeon settings, is 15.30.1025.1001-151204a-296874C.

EDIT: I got confused in the discussion, they were using 16.10 beta driver for the Polaris card.

Kenmitch · Jan 4, 2016

Techhog said:
I would think that a Polaris GPU would require an unreleased driver, though. That's just common sense.

And yeah, 361.43 was just GameWorks VR stuff. It wouldn't have affected the test in any significant way.

How do we know AMD didn't use the older driver on purpose. It's possible there is still a mole left at AMD whom tipped off NVidia about the testing. I could see them purposely tweaking the driver for GTX 950 maximum performance per watt in SWB....Dang conspiracy theories!

MrTeal · Jan 4, 2016

16.10 Beta is a bit of a strange one though, given their naming convention. Is it maybe a typo and they meant 16.01?

Kenmitch · Jan 4, 2016

MrTeal said:
16.10 Beta is a bit of a strange one though, given their naming convention. Is it maybe a typo and they meant 16.01?

Hopefully a typo and not a hint at the release date.

Silverforce11 · Jan 4, 2016

zlatan said:
The primitive discard accelerator is the real deal here. It can give a huge minimum fps boost, especially on complex scenes.

Sounds like a very aggressive Z-Culling?

Silverforce11 · Jan 4, 2016

sontin said:
So, showing of working silicon for a product released in 6 months is impressive?

nVidia had never shown Maxwell prior the launch and yet Maxwell was impressive at launch.

Jesus man people find a reason to complaint about anything... if AMD didn't show working silicon you people would be saying its all talk no substance, no proof.. etc et al.

Side by side comparison, power meter up, press allowed to see the game in action for themselves, at the SAME settings. That is a very good sign for the new FF process, for a working product, that's not wood-screwed together.

Silverforce11 · Jan 4, 2016

Headfoot said:
Did any of you actually read the Anandtech article???

They specifically called out that it was capped at 60 fps. AMD also similarly in the last 6 months released a new driver with frame rate capping capabilities, and boasted about how it increased perf/watt. They picked a 950, not a 960, so that more of the chip would have to be spun up to higher frequencies to hit 60 (wider w/ lower clock = more power efficient, narrower with higher clock = less power efficient). The new chip can likely do more than 60fps in SWBF, and the 950 is probably nearing its cap, and this was intentionally chosen. Do the math.

Those numbers they showed are certainly true, but they've also picked the precisely best light to show it in. As anyone should expect from a products company marketing its product...

If they didn't cap it at 60 vsync, then performance per watt would NOT be comparable.

Imagine if one did 90 fps and other one did 60 fps, power use alone will be affected on the entire system.

ShintaiDK · Jan 4, 2016

Silverforce11 said:
If they didn't cap it at 60 vsync, then performance per watt would NOT be comparable.

Imagine if one did 90 fps and other one did 60 fps, power use alone will be affected on the entire system.

It would, because you would take the extra performance into account.

You cant make performance/watt claims on very selected workloads. Because that's not how the SKUs will work in real life.

Just like we dont limit Intel CPUs to AMD CPUs max performance and then show a 5 fold performance/watt delta. The same BS was done by NVidia, unless you believe the performance/watt is 3x higher in Watch Dog on a GTX980 contra GTX680.

AtenRa · Jan 4, 2016

Yea now people whining because they used 60fps cap. Well if they would not use the fps cap and Polaris was 50% faster they would say that Polaris is just a bigger GPU than GTX950 etc etc. Im sure they would find something negative to say

3DVagabond · Jan 4, 2016

Silverforce11 said:
Sounds like a very aggressive Z-Culling?

From what I read on HW.fr

hardware.fr by google translate said:
AMD shows first of all that the heart of the architecture has been improved for better energy efficiency. As NVIDIA has begun to do from the Kepler generation, we can assume that AMD will try not need a complex logic and greedy scheduling within the CU, where the behavior of After a certain statement is fully deterministic and can therefore be satisfied with a static schedule prepared at compile time. AMD is also talk of improved hardware schedulers, but this time we assume they do not refer to CU but front-end and global tasks initiated by the control processors. It is thus probably enhancements designed to support multi-engine Direct3D 12. There is also talk of new compression modes. It could be the ASTC compression, costly to implement (but this problem 14nm rule) and AMD and Nvidia had avoided so far, unlike the GPU SoC designers to where a few more transistors are never too expensive paid to save memory bandwidth and energy. Finally, AMD mentions a Primitive Discard Accelerator or an ejection system triangles masked the rendering pipeline. Remember, statistically, about half of the triangles of an object turning their backs to the camera and may be ejected from the rendering soon as this state is confirmed. Able to quickly make geometry can boost performance in real situations. Currently, the Radeon geometry engines are not able to perform this task faster than rendering a triangle, unlike GeForce who benefit to stand out in some scenes, especially when tessellation generates many hidden triangles. With Polaris, AMD should finally fill the gap, probably doubling the number of ejection motor engines primitive rasterization (Nvidia opted for a different approach by decentralizing some of the geometry processing but we do not expect that AMD follows this route).

Seems like the difference is being able to get rid of the unnecessary tri's before the next ones are generated. Thus making the actual number generated much smaller. As they say, this could be a huge boost for AMD that nVidia is already enjoying.

Silverforce11 · Jan 4, 2016

ShintaiDK said:
You cant make performance/watt claims on very selected workloads. Because that's not how the SKUs will work in real life.

I think they are only claiming perf/w in that specific game and not overall.

Ofc its cherry picked, if they ran with Project Cars they would look much worse.

Similar to AMD's Fury X PR slides vs 980Ti, they picked 4K, they picked a stock 980Ti, and for some reason they ran with AF on x0, resulting in 5-10% performance advantage.

And because they implemented a good feature in Frame-rate Target Control, they are gonna use it. When gamers want 60 fps, they can get a lot of efficiency gains. If their monitor supports higher hz, they can set that target to whatever they want.

The only interesting part here is the low-end Polaris is ~35W (minus total system power) with very good performance to boot. I don't care about the competition because it's not a fair comparison, 14ff vs 28nm. Because they don't have Pascal on hand to test, they test with what they have for that market segment, low-end, Gtx 950.

Techhog · Jan 4, 2016

Kenmitch said:
How do we know AMD didn't use the older driver on purpose. It's possible there is still a mole left at AMD whom tipped off NVidia about the testing. I could see them purposely tweaking the driver for GTX 950 maximum performance per watt in SWB....Dang conspiracy theories!

Considering the fact that the driver was released on the 21st, it's extremely likely that the systems and tests were already set in stone and it was too late to redo the tests. Anyone here is free to do a test on their own, though.

Azix · Jan 4, 2016

3DVagabond said:
From what I read on HW.fr

Originally Posted by hardware.fr by google translate
AMD shows first of all that the heart of the architecture has been improved for better energy efficiency. As NVIDIA has begun to do from the Kepler generation, we can assume that AMD will try not need a complex logic and greedy scheduling within the CU, where the behavior of After a certain statement is fully deterministic and can therefore be satisfied with a static schedule prepared at compile time. AMD is also talk of improved hardware schedulers, but this time we assume they do not refer to CU but front-end and global tasks initiated by the control processors. It is thus probably enhancements designed to support multi-engine Direct3D 12. There is also talk of new compression modes. It could be the ASTC compression, costly to implement (but this problem 14nm rule) and AMD and Nvidia had avoided so far, unlike the GPU SoC designers to where a few more transistors are never too expensive paid to save memory bandwidth and energy. Finally, AMD mentions a Primitive Discard Accelerator or an ejection system triangles masked the rendering pipeline. Remember, statistically, about half of the triangles of an object turning their backs to the camera and may be ejected from the rendering soon as this state is confirmed. Able to quickly make geometry can boost performance in real situations. Currently, the Radeon geometry engines are not able to perform this task faster than rendering a triangle, unlike GeForce who benefit to stand out in some scenes, especially when tessellation generates many hidden triangles. With Polaris, AMD should finally fill the gap, probably doubling the number of ejection motor engines primitive rasterization (Nvidia opted for a different approach by decentralizing some of the geometry processing but we do not expect that AMD follows this route).

Click to expand...

Seems like the difference is being able to get rid of the unnecessary tri's before the next ones are generated. Thus making the actual number generated much smaller. As they say, this could be a huge boost for AMD that nVidia is already enjoying.

I really doubt their first bit. Using the same mechanism nvidia used is a step backwards. What AMD has now is better as we have seen with async compute issues and VR latency. They would not go back on this at this point. If they would have done it they would have done it with 28nm. Not the only way to improve efficiency. They have been doing such since gcn 1.0

Nvidia might go the AMD route also because it obviously costs money to maintain the software scheduler

sm625 · Jan 4, 2016

This is the closest thing I could find. A 950 should be running somewhere around 80-85% of its max power consumption when capped at 60fps. Nvidia has "Adaptive VSync", so this should limit the power consumption on a 950 to around 80% of full power. The numbers AMD provided do make sense. I think its a fair test. The 950 appears to be drawing roughly 75 watts, and the Polaris card only about 25 watts. If they release that chip onto a 80W TDP card, it is going to be a frickin monster midrange 970 killer, at probably $199. And yet Nvidia will still sell more 970's....

3DVagabond · Jan 4, 2016

Azix said:
I really doubt their first bit. Using the same mechanism nvidia used is a step backwards. What AMD has now is better as we have seen with async compute issues and VR latency. They would not go back on this at this point. If they would have done it they would have done it with 28nm. Not the only way to improve efficiency. They have been doing such since gcn 1.0

Nvidia might go the AMD route also because it obviously costs money to maintain the software scheduler

Remember this is Google translate. I wouldn't hang on every word.

Silverforce11 · Jan 4, 2016

3DVagabond said:
Remember this is Google translate. I wouldn't hang on every word.

If the theory is correct it is a major enhancement for all geometry that are not visible. To skip it entirely at the start of the pipeline is a nice boost, as the current Z-culling only does it towards the end of the rendering pipeline (it still does a lot of work for no visual output).

It sounds like in prior GCN they could not resolve the problem of devoting compute resources to determine if the geometry is visible or not to skip rendering, as the actual task of determining that cause more latency than it ended up saving. But for Polaris they claim to have solved it.

MrTeal · Jan 4, 2016

sm625 said:
This is the closest thing I could find. A 950 should be running somewhere around 80-85% of its max power consumption when capped at 60fps. Nvidia has "Adaptive VSync", so this should limit the power consumption on a 950 to around 80% of full power. The numbers AMD provided do make sense. I think its a fair test. The 950 appears to be drawing roughly 75 watts, and the Polaris card only about 25 watts. If they release that chip onto a 80W TDP card, it is going to be a frickin monster midrange 970 killer, at probably $199. And yet Nvidia will still sell more 970's....

The test makes sense to me. The 950 is the slowest Nvidia card that can still hit 60FPS minimums. Since other than demoing the silicon the test was primarily about efficiency, using a 60FPS cap allows the workload to be the same between not only for the cards, but also the CPU and rest of the system. It could be argued that the 960 would have been more efficient, though it would probably not be a huge difference.

Vesku · Jan 4, 2016

Kenmitch said:
How do we know AMD didn't use the older driver on purpose. It's possible there is still a mole left at AMD whom tipped off NVidia about the testing. I could see them purposely tweaking the driver for GTX 950 maximum performance per watt in SWB....Dang conspiracy theories!

Sounds much more likely than not wanting to work through Christmas. /s

"They are using a 4 week old driver and not the one published a few days before Christmas!", this is seriously worth posting as a complaint?

As for the demo, AMD seems to properly be focusing on mobile in terms of future revenue. Radeon Technologies Group name push is consistent with a new focus on licensing and custom designs (consoles, embedded) matching AMD CEO Lisa Su's public statements. Adds some uncertainty to whether AMD plans to stick it out in the large die GPU space, they talked more about workstations than HPC in their Polaris introduction video.

Interestingly, one of them specifically mentioned "uplifting clock frequencies" compared to current GCN.

.vodka · Jan 4, 2016

Vesku said:
Interestingly, one of them specifically mentioned "uplifting clock frequencies" compared to current GCN.

GCN should now clock better due to the new 14/16nm process, but it's clear that nV gets a nice chunk of their performance on Maxwell thanks to clock speeds (we all know what the 980Ti can do when pushed).

If this new GCN revision can inherently clock higher while maintaining a respectable amount of perf/MHz like Maxwell it should be able to put up quite the fight. That little Polaris chip + GDDR5 put up a promising, positive show.

Also, as said before Pascal is getting back most of the things that were cut off from Maxwell for efficiency's sake. The situation will be more comparable on both sides of the fence this round. This one will be fun to watch... at least more than the last few years.

IllogicalGlory · Jan 4, 2016

ShintaiDK said:
AMD 14LPP, NVidia 16FF+.

Not exactly.

Anandtech said:
As for RTGs FinFET manufacturing plans, the fact that RTG only mentions FinFET and not a specific FinFET process (e.g. TSMC 16nm) is intentional. The group has confirmed that they will be utilizing both traditional partner TSMCs 16nm process and AMD fab spin-off (and Samsung licensee) GlobalFoundries 14nm process

http://www.anandtech.com/show/9886/amd-reveals-polaris-gpu-architecture/3

Silverforce11 · Jan 4, 2016

And it's not 14FF LPP, but a modification of the process aimed at enhanced performance. There's TWO separate 14ff node based on Samsung tech.

maddie · Jan 4, 2016

Kenmitch said:
How do we know AMD didn't use the older driver on purpose. It's possible there is still a mole left at AMD whom tipped off NVidia about the testing. I could see them purposely tweaking the driver for GTX 950 maximum performance per watt in SWB....Dang conspiracy theories!

You're having some fun. I guess not all here agree.

AMD's next GPU uarch is called "Polaris"

Lifer

Platinum Member

Lifer

Senior member

Diamond Member

Diamond Member

Diamond Member

Lifer

Lifer

Lifer

Lifer

Lifer

Lifer

Lifer

Platinum Member

Golden Member

Diamond Member

Lifer

Lifer

Diamond Member

Diamond Member

Golden Member

Senior member

Lifer

Diamond Member