
[Videocardz] AMD Polaris 11 SKU spotted, has 16 Compute Units

 
I'd like to hear this... 😵

Pointing out someone's consistently wrong and consistently negative posts is a way to check people into not doing those things. He didn't attack him; if someone is habitually wrong, then yes, anything they "predict" should be taken with a grain of salt.
 
The 800MHz leaked earlier was the ES base clock.

Could well be a mobile part, low power market.

Has to be. It's difficult to imagine that FinFET would lead to no clockspeed gains at all for AMD GPUs, when everyone else seems to be getting significant boosts in chips as different as Apple's A9 and Nvidia's GP100.

AMD will want to ensure that at least one desktop Polaris 11 SKU fits in under 75W, to provide the best possible performance for systems without a PCIe power connector. But that's the only reference point that really matters. Desktop users don't care about the difference between 40W and 60W TDP, and most would happily take that 50% increase in power consumption even if it only meant a 25% increase in performance. The same will be true of Polaris 10: at least one SKU should be under 150W so it can use just a single 6-pin connector, but above that, almost no one cares if it's 175W or 200W.
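To put rough numbers on that tradeoff (the 40W/60W and +25% figures are from the post above; everything else is just arithmetic):

```python
# Rough perf/watt arithmetic for the hypothetical SKUs discussed above:
# a 40W card vs. a 60W card that is 25% faster.
def perf_per_watt(perf, watts):
    return perf / watts

base = perf_per_watt(100, 40)    # 2.5 perf units per watt
fast = perf_per_watt(125, 60)    # ~2.08 perf units per watt

# The faster card is worse in perf/watt, but as long as both fit
# under the 75W slot limit, most desktop buyers take the extra 25%.
print(round(base, 2), round(fast, 2))
```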

I wouldn't be surprised if we saw three different SKUs like we did with Fiji: full chip, cut chip, and full chip with low-power design. The fact that we are currently seeing GM206 and Bonaire cards designed to fit under the 75W limit shows there is a market for this kind of thing. IMO, the reason that the Nano did so poorly is that it foolishly set the power limit at 175W, which still requires an 8-pin connector. I think it would have done better at 150W with just a single 6-pin, even if performance had been a bit lower.
 



Hmmm... a dual BIOS card with a micro switch, still having an 8-pin and a 6-pin connector. One BIOS for use with both connectors and higher clocks, the other BIOS for a single 6-pin connector with lower clocks. A simple upgrade path: all you'd need later is a capable PSU 🙂
 
Surely it wasn't the Nano's $650 price tag at launch along with huge coil whine problems. No, it's that 25 watt difference that killed it.
 
I actually think the Nano is more popular now than near its launch. Prices have come down by a third, and it offers rather good performance while fitting nicely in many mITX builds.
 
Yeah that memory bus means maybe people should start dialing back expectations, at least at high resolutions. Polaris 10 might be a 1080p monster though, and we all know the 970 made a killing because of that.

I am very interested in that Polaris 11. Looks like it will be cheap with only a 128-bit bus.

You wait. If nVidia finally has better performance at high res, all of a sudden the market penetration for 1080p will no longer be important. It'll be, "Any serious gamer is going to have a (insert res nVidia wins at) monitor" 😀
 
I expect them to improve their "Boost" to be smarter and easier to OC. This is good news.


Also, can we put all the threads on Polaris together, likewise for Pascal? This forum is getting very messy.

Edit: Read that patent quickly, very interesting: they are making each of their SPs/SIMDs capable of running parallel instructions based on different types of workloads. That scheduler is basically hyper-threading the SPs individually, wow! Also an optimization in their wavefront-to-SIMD mapping: it can operate at peak efficiency at 4, 8, 16, 32, or 64 threads. Lastly, each individual SIMD can be gated down or auto-boosted based on the workload and power target. That's a lot of improvements; the result is fewer but more powerful and efficient SPs that handle bottlenecks much better. Very nice! Can't wait.
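A toy illustration of why the selectable wavefront widths mentioned above (4/8/16/32/64) matter: lane utilization for an awkward thread count. This is my own simplification, not how the actual scheduler works:

```python
# Lane utilization: a fixed 64-wide wavefront vs. picking from
# the 4/8/16/32/64 widths in the patent read above. Toy model only.
import math

def fixed_util(threads, width=64):
    """Utilization when every wavefront is a fixed `width` lanes."""
    waves = math.ceil(threads / width)
    return threads / (waves * width)

def flexible_util(threads, widths=(64, 32, 16, 8, 4)):
    """Greedily cover the threads with the largest widths that fit."""
    lanes = 0
    remaining = threads
    for w in widths:
        while remaining >= w:
            remaining -= w
            lanes += w
    if remaining:
        lanes += widths[-1]  # one last, mostly-empty 4-wide wave
    return threads / lanes

print(round(fixed_util(20), 2))     # 0.31 on a fixed 64-wide SIMD
print(round(flexible_util(20), 2))  # 1.0 (a 16-wide plus a 4-wide wave)
```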

It's a game changer. I'm expecting AMD to decisively beat Pascal in terms of perf/watt and, with a smaller GPU, give larger Pascal GPUs a run for their money. Basically a reverse of what we saw with Fiji/Grenada vs Maxwell.
 

You are so on the ball man. 🙂

Honestly, both sides do have valid points.

1080p is still the major gaming resolution target. Lots of gamers play on their TVs as well as monitors.

But on the enthusiast side, 1440p and 4K are very important segments for folks to actually spend big bucks for top of the line GPUs.

There's no reason for a GPU to really suck at either end of the spectrum, unless it's a weak GPU playing 1440p/4K etc. Fury X, for example, blows at 1080p in most DX11 games because it's a very poorly balanced chip. AMD knows it; they even talked about it to Computerbase.de when the chip was released. It's a quick stop-gap, testing HBM tech.
 
Shintai, as for no 2GHz HBM yet, is there some physical limitation for there to be none available by the time Vega launches early next year (ie. did anyone say it is not manufacturable)? How long between release of said HBM and launch would AMD need to incorporate it into their cards?

HBM2 should hit those clocks; I think the reason it was lowered on P100 is the 300W TDP limit. People don't want Teslas more power-hungry than that, as cooling each slot in a huge cluster becomes difficult.

HBM1 and HBM2 should be easy to retrofit. I think you could even retrofit Fiji with HBM2 if you wanted. Lower clocks perhaps, but density is what matters most. We already saw GP100 with both HBM1(early samples) and HBM2.

The clocks shouldn't be a surprise to anyone. Hynix's initial target is 1.6GHz. 2GHz is the end target for HBM2 before it gets replaced by... HBM3?

But now we see 1.4GHz from Samsung as a new bottom.
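For scale, here is what those clocks translate to in bandwidth. The thread quotes stack clocks in GHz; I'm treating them as effective per-pin rates in Gbps (the usual way HBM figures are quoted) and assuming a 4-stack configuration like GP100's; both are my assumptions, not from the posts:

```python
# HBM2 bandwidth from the per-pin data rates mentioned above.
# Each HBM2 stack has a 1024-bit interface; the "1.4/1.6/2GHz"
# figures are read as effective per-pin data rates in Gbps.
def hbm2_bandwidth_gbps(pin_rate_gbps, stacks=4, bus_bits_per_stack=1024):
    """Total bandwidth in GB/s for a given per-pin rate."""
    return pin_rate_gbps * bus_bits_per_stack * stacks / 8

for rate in (1.4, 1.6, 2.0):
    print(rate, round(hbm2_bandwidth_gbps(rate), 1))
# 1.4 -> 716.8 GB/s, 1.6 -> 819.2 GB/s, 2.0 -> 1024.0 GB/s
```

The 1.4Gbps row lines up with why GP100's quoted bandwidth sits around the 720GB/s mark rather than 1TB/s.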
 

It obviously depends a lot on what card you are looking at. The only card that has a performance advantage for nVidia is the 980 Ti (and the Titan X of course, but the price premium is stupid). And even the 980 Ti's lead is pretty small. Add to that, most people aren't buying the 980 Ti for 1080p. For the cards people are buying at that res, AMD performs better (perf/$).
 

So basically a return to the Tesla/Fermi vs. R700/Evergreen days?
 
Based on what?

Just a guess:

Depends how the new GCN turns out, but they have actually got Hyper-threading for SPs. For REAL!

http://forums.anandtech.com/showpost.php?p=38154409&postcount=19

^ There's a patent paper there for next-gen GCN. Take some time to read it, it's mind blowing stuff.

On paper, there's potential for 4x the throughput for each SP. I suspect that's under a perfect scenario, but still, 1x to 2x per-SP performance (game-load dependent) vs. the older GCN SP is on the table.

Polaris GCN has gone wide, with each SP being able to run multiple threads in parallel, a feat that's pretty crazy when you realize the amount of synchronization required to keep the hardware scheduler aware of each ALU's uptime so the warp scheduler can keep it busy.

There's also SP independent power gating and clock boost, so if an SP is only running one thread, it will auto boost to finish the task quicker.

Insane changes TBH, more than I expected.
 
According to the slides Polaris 10 has 192GB/s bandwidth. That's about half of Hawaii. Even if you factor in color compression, that's not enough to explain the difference. Not a promising sign on performance. Unless they came up with even better compression. But then it questions the need for HBM even for Vega. A 512-bit GDDR5X bus must surely still be cheaper than HBM. If 256-bit GDDR5 is enough for Hawaii performance level, I don't see a need for HBM.
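A quick sanity check on that gap. The 192GB/s figure is from the slides; Hawaii's 320GB/s is the stock 290X number (512-bit GDDR5 at 5Gbps). The compression range in the last comment is a rough, commonly cited ballpark, not a spec:

```python
# How much effective compression Polaris 10 would need for its quoted
# 192GB/s to behave like Hawaii's raw bandwidth.
polaris10_bw = 192.0   # GB/s, from the leaked slides
hawaii_bw    = 320.0   # GB/s, 512-bit GDDR5 at 5Gbps on the 290X

ratio = hawaii_bw / polaris10_bw
print(round(ratio, 2))  # ~1.67x effective compression needed
# Tonga/Fiji-era delta color compression is usually credited with
# roughly 1.2-1.4x, so either compression improved a lot or the
# performance target is below Hawaii.
```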
 

Exactly :thumbsup:
 
GP100 looks like it's within reach for Vega 10 to pwn, TBH. Those specs are not that impressive.

If I'm not mistaken, the latest rumors have Vega 10 at 4096 shaders, which is the same as the Fury X.

GP100 (with 56 SMs) should be about 60-70% faster than a stock Fury X, so Vega 10 would have to pick that up from increases in IPC and frequency, which seems like a bit too much imho. At best I could imagine Vega 10 gaining 30-40% on Fury X via IPC/frequency improvements.
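Back-of-envelope on that claim, using the rumored 4096 shaders and the 60-70% gap from the post. The 1050MHz Fury X clock is the known stock value; the 1.5GHz Vega clock is purely my assumption for illustration:

```python
# With the same shader count as Fury X, how much combined IPC x clock
# gain would Vega 10 need to match a GP100 that is 60-70% faster?
fury_x_clock = 1050        # MHz, stock Fury X
assumed_vega_clock = 1500  # MHz -- an assumption, not a leak

clock_gain = assumed_vega_clock / fury_x_clock  # ~1.43x
for target in (1.6, 1.7):
    ipc_needed = target / clock_gain
    print(target, round(ipc_needed, 2))  # 1.6 -> 1.12x, 1.7 -> 1.19x IPC
```

So even a big clock jump still leaves a double-digit IPC gain to find, which is why the 30-40% estimate above looks more plausible.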
 

Post #44 in this very thread:

On paper, there's potential for 4x the throughput for each SP. I suspect that's under a perfect scenario, but still, 1x to 2x per-SP performance (game-load dependent) vs. the older GCN SP is on the table.
 
According to the slides Polaris 10 has 192GB/s bandwidth. That's about half of Hawaii. Even if you factor in color compression, that's not enough to explain the difference. Not a promising sign on performance. Unless they came up with even better compression. But then it questions the need for HBM even for Vega. A 512-bit GDDR5X bus must surely still be cheaper than HBM. If 256-bit GDDR5 is enough for Hawaii performance level, I don't see a need for HBM.


GTX 780 Ti
384-bit with 336GB/s bandwidth

GTX 980
256-bit with 224GB/s bandwidth

I don't need to tell you which of the two is faster.
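One reason those raw numbers mislead: Maxwell's delta color compression. The ~1.3x factor below is a rough, commonly cited figure for Maxwell vs. Kepler, not an exact spec; treat it as an assumption:

```python
# The raw bandwidth numbers quoted above, plus an assumed effective
# gain from Maxwell's delta color compression.
gtx780ti_bw = 336.0  # GB/s, 384-bit (Kepler)
gtx980_bw   = 224.0  # GB/s, 256-bit (Maxwell)
maxwell_compression = 1.3  # assumed effective gain, rough figure

effective_980 = gtx980_bw * maxwell_compression
print(round(effective_980, 1))  # ~291.2 GB/s effective
# Close enough to the 780 Ti that raw bus width alone tells you little.
```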
 