AMD's next GPU uarch is called "Polaris"

raghu78 · Jan 9, 2016

DeathReborn said:
I think it'll probably be a 1536 shader part, maybe cut down to 1408 for the demo/yield purposes. With GCN AMD nearly always needs more shaders vs the competition to compensate for the clock speed amongst other things, I can't really see that changing with Polaris.

I think the 110 sq mm die is most likely a 1024 sp part. A true next gen successor to the original HD 7770 (Cape Verde). AMD is trying to improve perf/sp as thats the key to being competitive against Nvidia Pascal. AMD realizes that there is a lot of room for shader efficiency improvement and thats why they have tackled it in Polaris. If AMD's perf/sp can get as close to possible to Nvidia's perf/cc then we have a good contest. I am looking forward to the products from both camps. It will be an interesting year. But Nvidia has a massive advantage as they have tremendous market share and mindshare. AMD's failures to compete during the Maxwell generation hurt its marketshare and brand badly. Pascal will be improving on the impressive Maxwell architecture. So AMD faces an uphill task.

Techhog · Jan 9, 2016

raghu78 said:
I think the 110 sq mm die is most likely a 1024 sp part. A true next gen successor to the original HD 7770 (Cape Verde). AMD is trying to improve perf/sp as thats the key to being competitive against Nvidia Pascal. AMD realizes that there is a lot of room for shader efficiency improvement and thats why they have tackled it in Polaris. If AMD's perf/sp can get as close to possible to Nvidia's perf/cc then we have a good contest. I am looking forward to the products from both camps. It will be an interesting year. But Nvidia has a massive advantage as they have tremendous market share and mindshare. AMD's failures to compete during the Maxwell generation hurt its marketshare and brand badly. Pascal will be improving on the impressive Maxwell architecture. So AMD faces an uphill task.

This isn't a major revision, though. I don't think that much will change in that area.

raghu78 · Jan 9, 2016

Techhog said:
This isn't a major revision, though. I don't think that much will change in that area.

Since AMD specifically mention improved shader efficiency we have to wait and see how much has improved. If AMD wants Polaris to compete with Nvidia Pascal then the most important area of improvement needs to be perf/sp which will directly affect perf/watt and perf/sq mm. GCN has to improve efficiency in a big way to stand a chance of competing.

http://www.pcper.com/reviews/Graphi...hnologies-Group-Previews-Polaris-Architecture

"How is Polaris able to achieve these types of improvements? It comes from a combination of architectural changes and process technology changes. Even RTG staff were willing to admit that the move to 14nm FinFET process tech was the majority factor for the improvement we are seeing here, something on the order of a 70/30 split. "

So I would not conclude that Polaris is a minor revision until we see actual reviews of the product.

Glo. · Jan 9, 2016

Raghu, IMO if we will take all info we have currently(test setup, with power draws) I think 1024 GCN core GPU is too small.

That bit about 70/30 Efficiency gains from new process/new architecture is pretty interesting.
http://www.samsung.com/semiconductor/foundry/about-samsung-foundry/
14 nm LPE offers 40% increased performance plus 60% smaller power consumption and 50% smaller die size compared to 28 LPP.
14 nm LPP offers additional 10% better performance and power consumption. The thing is this: it is compared to Samsung process. Not TSMC. That process If I remember correctly was more efficient than TSMC. Apple bought whole production of A7 on that process from Samsung, If I remember correctly.

So the differences may be bigger than we think. Also, if we will see that 70% smaller power consumption at the same speed accounts for 2.5 better efficiency...

I would bet that the small chip that we are discussing here has somewhere between 1792 and 2048 GCN cores.

MrTeal · Jan 9, 2016

Glo. said:
Raghu, IMO if we will take all info we have currently(test setup, with power draws) I think 1024 GCN core GPU is too small.

That bit about 70/30 Efficiency gains from new process/new architecture is pretty interesting.
http://www.samsung.com/semiconductor/foundry/about-samsung-foundry/
14 nm LPE offers 40% increased performance plus 60% smaller power consumption and 50% smaller die size compared to 28 LPP.
14 nm LPP offers additional 10% better performance and power consumption. The thing is this: it is compared to Samsung process. Not TSMC. That process If I remember correctly was more efficient than TSMC. Apple bought whole production of A7 on that process from Samsung, If I remember correctly.

So the differences may be bigger than we think. Also, if we will see that 70% smaller power consumption at the same speed accounts for 2.5 better efficiency...

I would bet that the small chip that we are discussing here has somewhere between 1792 and 2048 GCN cores.

It all depends on the actual die size, but that would be pretty massive scaling. 2048 is the same number as Tahiti, which was a 350mm^2 die. I'm expecting more along the lines of the claimed 2x scaling, and something like 1024 to 1280 shaders, depending on whether the die ends up closee to 100 or 120.

Vesku · Jan 9, 2016

Given how AMD leveraged Pitcairn I also think the chip that was demoed is basically a GCN4 version of Pitcairn at ~100-120mm2. 1280 shaders would even allow for some harvesting of 1024 and possibly even 768 shader SKUs.

Techhog · Jan 10, 2016

MrTeal said:
It all depends on the actual die size, but that would be pretty massive scaling. 2048 is the same number as Tahiti, which was a 350mm^2 die. I'm expecting more along the lines of the claimed 2x scaling, and something like 1024 to 1280 shaders, depending on whether the die ends up closee to 100 or 120.

You should be looking at Tonga, rather than Tahiti. (Not trying to make a point or anything; just pointing that out.)

JDG1980 · Jan 10, 2016

Techhog said:
This isn't a major revision, though. I don't think that much will change in that area.

It certainly looks like it will be a major revision. The CUs, geometry processor, command processor, L2 cache, and memory controller are all said by AMD to be new. (Of course that doesn't mean completely redone from scratch, but there should be improvements.) That's in addition to the changes we already know about to the multimedia and display sections (HEVC Main10, HDMI 2.0, DP 1.3, etc.) The presentation touts "improved shader efficiency", so we can't necessarily assume that, for instance, 1280 Polaris shaders will only provide Pitcairn-level performance.

I think there were quite a few goodies we were supposed to get as part of the cancelled 20nm GPU line which ended up being held over for FinFET. If we use the existing unofficial revisions (1.0, 1.1, 1.2) for the current GCN cards, then this would be "GCN 2.0" - probably the most substantial revision GCN has had to date.

stuff_me_good · Jan 10, 2016

I've always wondered that why AMD does not go for the same route as nvidia when designing future GPU's? I mean as it seems the less cores you have the more clocks you can crank up. I think AMD's approach the chip has too many smaller cores than nvidia and while the chip is fast and efficient at to certain point, for some reason the GPU does not scale up very well when adding more shader cores, just like we have seen with Fury. And because of so massive amount of small cores, it's harder to crank the frequency up.

Nvidia's approach seems to be more balanced in term of core size vs. clocks. They have managed to make fast cores that clock high even though individual core is bigger than AMD's.

USER8000 · Jan 10, 2016

stuff_me_good said:
I've always wondered that why AMD does not go for the same route as nvidia when designing future GPU's? I mean as it seems the less cores you have the more clocks you can crank up. I think AMD's approach the chip has too many smaller cores than nvidia and while the chip is fast and efficient at to certain point, for some reason the GPU does not scale up very well when adding more shader cores, just like we have seen with Fury. And because of so massive amount of small cores, it's harder to crank the frequency up.

Nvidia's approach seems to be more balanced in term of core size vs. clocks. They have managed to make fast cores that clock high even though individual core is bigger than AMD's.

Part of the problem is AMD is using hardware scheduling with GCN and not software scheduling which Nvidia used from Kepler onwards,so I expect the lower clockspeeds might be down to the additional hardware they need for each shader group.

maddie · Jan 10, 2016

stuff_me_good said:
I've always wondered that why AMD does not go for the same route as nvidia when designing future GPU's? I mean as it seems the less cores you have the more clocks you can crank up. I think AMD's approach the chip has too many smaller cores than nvidia and while the chip is fast and efficient at to certain point, for some reason the GPU does not scale up very well when adding more shader cores, just like we have seen with Fury. And because of so massive amount of small cores, it's harder to crank the frequency up.

Nvidia's approach seems to be more balanced in term of core size vs. clocks. They have managed to make fast cores that clock high even though individual core is bigger than AMD's.

Which path do I choose? I consider this a good example of the consequences of high level design choices.

You have to remember that Nvidia was in similar situation a few generations ago. They had complex, double clocked shaders and starting with Kepler jumped to the AMD path of many simpler shaders. They were facing the negative consequences of an earlier design decision.

zlejedi · Jan 10, 2016

It's also a matter of how much You can afford - AMD needs to have one architecture good for apus, consoles and professional market and stay relevant for years. When Nvidia can design 600 mm^2 gpu completely devoid of FP32 capabilities just for gaming market and subsegment of professional market while still mantaining production of slightly tweaked Kepler for professional users.

So the whole Maxwell vs GCN era is classic case of specialised solution outperforming more universal design.

Personally I prefer more specialised approach since I don't care about BOINC anymore and future proofing for x years ahead is pointless for my software buying habits of getting GOTY versions on steam sale long after release.

JDG1980 · Jan 10, 2016

stuff_me_good said:
I've always wondered that why AMD does not go for the same route as nvidia when designing future GPU's? I mean as it seems the less cores you have the more clocks you can crank up.

The shader count isn't the problem. GTX 980 (2048 shaders) and GTX 980 Ti (2816 shaders) can get boost clocks much higher than the AMD cards with the same shader counts. It's the design of the shader cores that causes the issues AMD is having. AMD's current GCN cores run most efficiently at clock speeds between 800-900 MHz, while Nvidia Maxwell cores can easily do 1100-1200 MHz without hogging too much power.

This is the #1 reason why AMD's perf/watt looks so bad. If you compare a first-generation Pitcairn 7850 card to even Maxwell, it stacks up surprisingly well, since it runs at GCN's sweet spot. But AMD had to compete with newer Nvidia chips, and without the funds for new 28nm designs, the only way they had was to crank up the core and memory clocks. This killed efficiency. Hawaii is known as being hot and power-hungry, but if it was run at 800-900 MHz instead of 1000+, it would be far, far more efficient. The FirePro W8100, a professional Hawaii card, was tested as only taking 188W at full load. That's nearly 100W less than the corresponding consumer card (R9 290), and while part of this is due to binning, the biggest factor is that the W8100 runs at a lower clock speed that GCN was actually designed for.

stuff_me_good said:
I think AMD's approach the chip has too many smaller cores than nvidia

AMD's cores are not "smaller". In many ways, they are actually more powerful than Nvidia's, especially when it comes to GPGPU. As others pointed out, they have hardware schedulers, whereas Nvidia has stuck with software scheduling for Kepler and Maxwell. (Fermi had a hardware scheduler, and was notorious for being hot, loud, and power-hungry. And Fermi, like GCN, was created as a compute-first design.)

stuff_me_good said:
and while the chip is fast and efficient at to certain point, for some reason the GPU does not scale up very well when adding more shader cores, just like we have seen with Fury.

Fury has other problems. It's bottlenecked by either the limited ROP count or by its front end, or both. This is evident in the fact that the 3584-core salvage chip is only a tiny bit inferior to the 4096-core full-fat version, as well as the fact that it barely beats Hawaii in some gaming applications.

sontin · Jan 10, 2016

JDG1980 said:
AMD's cores are not "smaller". In many ways, they are actually more powerful than Nvidia's, especially when it comes to GPGPU. As others pointed out, they have hardware schedulers, whereas Nvidia has stuck with software scheduling for Kepler and Maxwell. (Fermi had a hardware scheduler, and was notorious for being hot, loud, and power-hungry. And Fermi, like GCN, was created as a compute-first design.)

You should go back and read the Anandtech article again:

ith that said, in discussing Kepler with NVIDIAs Jonah Alben, one thing that was made clear is that NVIDIA does consider this the better way to go. Theyre pleased with the performance and efficiency theyre getting out of software scheduling, going so far to say that had they known what they know now about software versus hardware scheduling, they would have done Fermi differently. But whether this only applies to consumer GPUs or if it will apply to Big Kepler too remains to be seen.

http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/3

There is no real advantages in hardware over software scheduling.
And Maxwell is on par with AMD when it comes to compute.

iiiankiii · Jan 10, 2016

desprado said:
Did anyone see the difference
https://www.youtube.com/watch?v=oNA6fll2DDQ

Left panel color are more sharper and have more saturation (GTX 950) where as right side panel color are totally washed out which is equipped with amd next gen.

Green shades can distort vision sometimes......

3DVagabond · Jan 10, 2016

What would be real nice is if people in this thread who are speaking as authorities give their credentials. If you don't have any then please say what you are giving us is an uneducated guess. Not, "This is why X is the way it is, period", when you honestly don't have a clue and are just making it up or regurgitating something that someone who has half a clue said. It's beyond annoying to have to look at all of the BS just to hope to find a bit of actual information.

rgallant · Jan 10, 2016

desprado said:
Did anyone see the difference
https://www.youtube.com/watch?v=oNA6fll2DDQ

Left panel color are more sharper and have more saturation (GTX 950) where as right side panel color are totally washed out which is equipped with amd next gen.

should have used better drivers on that es sample maybe , they must have a few on file lol
also the power meters were different so I guess in your mind it's all bs.

stuff_me_good · Jan 10, 2016

First you say this...

JDG1980 said:
The shader count isn't the problem.

And then this...

Fury has other problems. It's bottlenecked by either the limited ROP count or by its front end, or both. This is evident in the fact that the 3584-core salvage chip is only a tiny bit inferior to the 4096-core full-fat version, as well as the fact that it barely beats Hawaii in some gaming applications.

While saying that that what I said was wrong, but in the end you just confirmed what I said before by yourself.

So yeah, GCN sweet spot is around 800MHz and in theory you could just cram more shaders on the core to compensate the low clocks, but that didn't work so well like we saw in fury.

Techhog · Jan 10, 2016

stuff_me_good said:
First you say this...

And then this...

While saying that that what I said was wrong, but in the end you just confirmed what I said before by yourself.

So yeah, GCN sweet spot is around 800MHz and in theory you could just cram more shaders on the core to compensate the low clocks, but that didn't work so well like we saw in fury.

No, you just misread what he was saying... unless you don't know what a ROP is? He's saying that there's another bottleneck at play.

Glo. · Jan 11, 2016

http://www.tweaktown.com/news/49581...-next-gen-polaris-spotted-ces-2016/index.html

What do ya think? R9 490X or something more?

Vesku · Jan 11, 2016

I think it's a good sign for AMD they have actual cards of at least two dies floating around while Nvidia is using Maxwell chips as stand ins for Pascal in their new car platform showcase. If they get little Polaris out soon enough they'll likely get a lot of sales over the Nvidia 750/950 before even having to face off against any FinFet competition. Let little Polaris generate some buzz and goodwill for a month or two then launch the next tier up.

Can't wait to see how AMD marketing messes this up, heh.

MrTeal · Jan 11, 2016

Glo. said:
http://www.tweaktown.com/news/49581...-next-gen-polaris-spotted-ces-2016/index.html

What do ya think? R9 490X or something more?

That is the worst description of something that should be a huge scoop I've ever seen a journalist make.

Was it just a big card? Was it a bare board? Size estimate on the GPU? Was it HBM or GDDR5? If HBM, how many stacks?

Eesh.

AMD's next GPU uarch is called "Polaris"

Diamond Member

Platinum Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Platinum Member

Golden Member

Senior member

Golden Member

Diamond Member

Senior member

Golden Member

Diamond Member

Senior member

Lifer

Golden Member

Senior member

Platinum Member

Diamond Member

Diamond Member

Diamond Member