Vega/Navi Rumors (Updated)

maddie · Sep 28, 2016

Head1985 said:
Always expect the worst from AMD because, you know, they don't have the money they need.
Btw 8x shader engines = 128 ROPs max.
6x shader engines = 96 ROPs max.
4x shader engines = 64 ROPs max.

But yeah, it will be interesting. They NEED 6x or 8x shader engines.

I thought that AMD has used decoupled ROPs since Tahiti.

Why is it being assumed as a given that 16 ROPs/shader engine is a fixed limit? I can't understand this reasoning.

Glo. · Sep 28, 2016

gamervivek said:
Well, it could be 3 as well, more likely actually.

If there are 3 ROP clusters in each Shader engine:
6 shader engine setup will result in 72 ROP design, 8 shader engine design will result in 96 ROP design.
If there are 2 ROP clusters in each Shader Engine:
6 shader engine will result in 48 ROP design, 8 shader engine design will result in 64 ROPs.

Actually, 2 ROP clusters appear to be more balanced for these small-scale shader engines.
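For anyone who wants to check the numbers, here is a rough sketch of the arithmetic behind those totals. It assumes 4 ROPs per ROP cluster, which is what the figures above imply (3 clusters x 6 engines = 72); the function name and constant are made up for illustration and are not from any AMD material.

```python
# Assumption: each ROP cluster (render back-end) holds 4 ROPs, which is what
# the totals quoted in this thread imply (e.g. 3 clusters x 6 engines = 72).
ROPS_PER_CLUSTER = 4


def total_rops(shader_engines: int, clusters_per_engine: int) -> int:
    """Total ROPs for a hypothetical GCN-style layout."""
    return shader_engines * clusters_per_engine * ROPS_PER_CLUSTER


if __name__ == "__main__":
    # Walk the combinations discussed in the thread.
    for clusters in (2, 3, 4):
        for engines in (4, 6, 8):
            rops = total_rops(engines, clusters)
            print(f"{engines} SE x {clusters} clusters/SE -> {rops} ROPs")
```

With 4 clusters per engine this reproduces the 16 ROPs/shader engine figure (64/96/128), so the "limit" is just one value of a parameter in this model.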

maddie said:
I thought that AMD has used decoupled ROPs since Tahiti.

Why is it being assumed as a given that 16 ROPs/shader engine is a fixed limit? I can't understand this reasoning.

It depends on the space in the shader engine. Fiji Shader engines with 16 CU design can have more than 4 ROP clusters/shader engine.
8 CU design has maximum space for 3 ROP clusters per shader engine.

Compare:

Fiji XT.

Polaris 11.

maddie · Sep 28, 2016

Glo. said:
If there are 3 ROP clusters in each Shader engine:
6 shader engine setup will result in 72 ROP design, 8 shader engine design will result in 96 ROP design.
If there are 2 ROP clusters in each Shader Engine:
6 shader engine will result in 48 ROP design, 8 shader engine design will result in 64 ROPs.

Actually, 2 ROP clusters appear to be more balanced for these small-scale shader engines.

It depends on the space in the shader engine. Fiji Shader engines with 16 CU design can have more than 4 ROP clusters/shader engine.
8 CU design has maximum space for 3 ROP clusters per shader engine.

Compare:

Fiji XT.

Polaris 11.

I don't think we should take those graphics as a scale diagram of the die. I think they are meant to be used as crude layouts for educating people about the high-level blocks. The space available is up to the designers. If you want X units, you arrange for it to have the space.

With that said, I still think that we should abandon previous usage as limits. Remember the initial 3 core CPU designs where many said that it would not work well, as it was not a power of 2. That turned out to be a false assumption.

AtenRa · Sep 28, 2016

Glo. said:
If there are 3 ROP clusters in each Shader engine:
6 shader engine setup will result in 72 ROP design, 8 shader engine design will result in 96 ROP design.
If there are 2 ROP clusters in each Shader Engine:
6 shader engine will result in 48 ROP design, 8 shader engine design will result in 64 ROPs.

Actually, 2 ROP clusters appear to be more balanced for these small-scale shader engines.

It depends on the space in the shader engine. Fiji Shader engines with 16 CU design can have more than 4 ROP clusters/shader engine.
8 CU design has maximum space for 3 ROP clusters per shader engine.

Compare:

Fiji XT.

Polaris 11.

I don't believe ROPs are coupled to Shader Engines, and as maddie said above, the available space does not dictate the number of ROPs that can be fitted.

Glo. · Sep 28, 2016

ROPs are not coupled to memory controllers but are connected directly to Shader Engines. Every diagram of GCN architecture shows this.

IllogicalGlory · Sep 28, 2016

gamervivek said:
AMD making an x2 card out of the smaller Vega is wishful thinking. 64CUs seems the max limit for GCN without restructuring most of the chip. Those Vega11 rumors were earlier than the new rumors of Vega 10,11,20 and navi.

The embedded parts show a 1.4Ghz boost freq. for Polaris 11 which bodes well for Vega chips.

http://www.anandtech.com/show/10710/amd-announces-embedded-radeon-e9260-e9550

I wonder if AT is mistaken and the embedded P11 is the full P11. 1024 shaders at 1.27GHz also gets 2.5TFLOPS.
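A quick sanity check of that arithmetic, as a sketch only. It assumes the usual peak-FP32 formula (shaders x 2 FLOPs per clock x clock speed) and 64 shaders per GCN CU; the function and variable names are just for illustration.

```python
SHADERS_PER_CU = 64  # width of one GCN compute unit


def fp32_tflops(shaders: int, clock_ghz: float) -> float:
    """Peak FP32 throughput, counting one fused multiply-add as 2 FLOPs."""
    return shaders * 2 * clock_ghz / 1000.0


# 14 CU part at the 1.4 GHz boost clock from the embedded spec
cut_down = fp32_tflops(14 * SHADERS_PER_CU, 1.4)
# Full 16 CU part (1024 shaders) at 1.27 GHz
full_chip = fp32_tflops(16 * SHADERS_PER_CU, 1.27)

print(f"14 CU @ 1.40 GHz: {cut_down:.2f} TFLOPS")
print(f"16 CU @ 1.27 GHz: {full_chip:.2f} TFLOPS")
```

Under those assumptions the 14 CU configuration lands at about 2.5 TFLOPS and the 1024-shader one at about 2.6, so the two are close but not identical.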

Glo. · Sep 28, 2016

IllogicalGlory said:
I wonder if AT is mistaken and the embedded P11 is the full P11. 1024 shaders at 1.27GHz also gets 2.5TFLOPS.

http://wccftech.com/radeon-embedded-gpus-e9550-e9260-polaris/
Official materials from AMD say that the design is a 14 CU one, not 16 CU.

maddie · Sep 28, 2016

Glo. said:
ROPs are not coupled to memory controllers but are connected directly to Shader Engines. Every diagram of GCN architecture shows this.

If this is fixed, how can you cut shaders but keep the # of ROPS?

Hawaii comes in 2816 [44 CU] and 2560 [40 CU] shader versions, but both have 64 ROPs.

There must be some sort of decoupling taking place for this to be possible.

edit:

Mistook what you said about engines, and thought you meant shaders.

Yes, the ROP to shader engine ratio can be fixed for a design, but what I'm arguing is that 16 ROPs/engine is not a given but a design compromise, and it can be whatever the designer thinks is optimum.
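To make the point concrete, here is a small sketch that treats the CU count and the ROP count as independent design parameters, using the 44 CU and 40 CU figures above. It assumes 64 shaders per GCN CU; the `describe` helper is hypothetical and only there to show the arithmetic.

```python
SHADERS_PER_CU = 64  # assumption: standard GCN compute unit width


def describe(active_cus: int, rops: int) -> dict:
    """Summarise a configuration where CUs and ROPs are chosen separately."""
    shaders = active_cus * SHADERS_PER_CU
    return {
        "cus": active_cus,
        "shaders": shaders,
        "rops": rops,
        "shaders_per_rop": shaders / rops,
    }


full = describe(active_cus=44, rops=64)
cut = describe(active_cus=40, rops=64)

for part in (full, cut):
    print(part)
```

Disabling CUs changes the shader count and the shaders-per-ROP ratio, while the ROP count stays whatever the design fixed it at.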

Glo. · Sep 28, 2016

maddie said:
If this is fixed, how can you cut shaders but keep the # of ROPS?

Hawaii comes in 2816 [44 CU] and 2560 [40 CU] shader versions, but both have 64 ROPs.

There must be some sort of decoupling taking place for this to be possible.

Look at the GPU diagrams. ROPs are the ones that are inside the Shader Engine, to the left of the CUs.

That is how it is possible.

garagisti · Sep 28, 2016

railven said:
I'd agree with this. And frankly it makes it easier to see who are in what camp and who echoes who. I'll stick to my "kudos" response when someone nails something (or makes a funny post!)

(That and I'm an anti-social fluff-n-nutter.)

Come on, be a good sport. I liked his post merely to annoy him.

garagisti · Sep 28, 2016

Bacon1 said:
They aren't losing money on Polaris. It's exactly where they said it was going to be priced back in January.

http://www.pcworld.com/article/3115...-470-inside-for-amd-polaris-mobile-debut.html

Pretty sure they also got the Apple desktop and laptop as well for its refresh coming up.

Sure companies would always like to make more, but they've stated that they wanted to bring Polaris and newer cards for cheaper and that it cost them much less and they got more dies because of the node shrink. Thus even though they were selling for less, they cost a lot less which doesn't mean they are skimping themselves.

Nvidia making hand over fist in profit margins isn't a good thing. It just shows how over priced their goods are.

Obviously it's good for companies to make profits, but it's not anywhere near what your underlined statement would show. They set out to sell Polaris for much cheaper because their goal was "VR for the masses". Meaning you had to take the previous $400 price point down into the $200s. They've talked about this for months before release, no idea why people assumed they wanted to sell Polaris for more.

A lot of people are forgetting that AMD is selling Polaris by the bucketload to OEMs... which is, well, so long overdue that it is almost a new thing.

jpiniero · Sep 28, 2016

garagisti said:
A lot of people are forgetting that AMD is selling Polaris by the bucketload to OEMs... which is, well, so long overdue that it is almost a new thing.

I don't know if you can say that; Notebook Polaris is still MIA too.

Bacon1 · Sep 28, 2016

jpiniero said:
I don't know if you can say that; Notebook Polaris is still MIA too.

They are in the Alienware 15/17 same as Pascal.

gamervivek · Sep 28, 2016

Glo. said:
If there are 3 ROP clusters in each Shader engine:
6 shader engine setup will result in 72 ROP design, 8 shader engine design will result in 96 ROP design.
If there are 2 ROP clusters in each Shader Engine:
6 shader engine will result in 48 ROP design, 8 shader engine design will result in 64 ROPs.

Actually, 2 ROP clusters appear to be more balanced for these small-scale shader engines.

It depends on the space in the shader engine. Fiji Shader engines with 16 CU design can have more than 4 ROP clusters/shader engine.
8 CU design has maximum space for 3 ROP clusters per shader engine.

Compare:

Fiji XT.

Polaris 11.

I meant 3 shader engines, because it's unlikely that a console has a chip that big, even though a 6 shader engine setup with 9/10 CUs per SE, 48 ROPs and a 384-bit bus sounds more natural. Maybe AMD will pare down the size by next year.

IllogicalGlory said:
I wonder if AT is mistaken and the embedded P11 is the full P11. 1024 shaders at 1.27GHz also gets 2.5TFLOPS.

I thought so too, but AMD's official slide says that it's 14 CUs. Then again, Ryan mentioned on B3D that AMD can give wrong info about TFLOPs on marketing slides.

Piroko · Sep 28, 2016

Glo. said:
Look at the GPU diagrams. ROPs are the ones that are inside the Shader Engine, to the left of the CUs.

That is how it is possible.

Pretty sure that there's no artificial limit with ROPs or CUs inside of a shader engine in GCN. Those combinations are practical tradeoffs.
There might be a limitation on the number of shader engines (4) in current GCN designs, and that could be the big hurdle that needs to be tackled in Vega. That probably needs an overhaul of the command processor and the intra-chip communication mesh.

Arachnotronic · Sep 30, 2016

Navi is a 7nm GPU, so don't expect it until 2019 at the earliest. Not sure what AMD was thinking when it put out a roadmap claiming Navi in 1H 2018...

railven · Sep 30, 2016

I've come to learn from various companies - road maps don't really mean much outside of "this is what we want to do and aim to do, but to be honest, this ain't gonna happen...or maybe...but not likely."

Intel, I'm looking at you!

Arachnotronic · Sep 30, 2016

railven said:
I've come to learn from various companies - road maps don't really mean much outside of "this is what we want to do and aim to do, but to be honest, this ain't gonna happen...or maybe...but not likely."

Intel, I'm looking at you!

Yeah, Intel is pretty bad about this. Probably the worst offender, TBH.

raghu78 · Sep 30, 2016

Arachnotronic said:
Navi is a 7nm GPU, so don't expect it until 2019 at the earliest. Not sure what AMD was thinking when it put out a roadmap claiming Navi in 1H 2018...

AMD stated Navi in 2018 but never committed to any specific timeline. That leaves them with a lot of wiggle room as even if they launch in Dec 2018 they would meet it. GF has mentioned 7nm risk production in early 2018. Normally that would mean volume production begins 9-12 months later.

http://www.globalfoundries.com/news...performance-offering-of-7nm-finfet-technology

"GLOBALFOUNDRIES’ 7nm FinFET technology will be supported by a full platform of foundation and complex intellectual property (IP), including an application-specific integrated circuit (ASIC) offering. Test chips with IP from lead customers have already started running in Fab 8. The technology is expected to be ready for customer product design starts in the second half of 2017, with ramp to risk production in early 2018."

Anyway, I think even in the best case scenario GF 7nm GPU products will be out in early 2019, and more like mid 2019 if you look at past execution. I think AMD's problem is going to be that Nvidia will have Volta on 16FF+ in early 2018, for which there does not seem to be a response. Vega might allow AMD to catch up to Pascal in performance (doubt they can catch up on efficiency). But Volta is going to cause a lot of headaches for AMD. Nvidia has lots of cash to develop multiple generations on a single process node, while AMD struggles to develop and launch a full product stack based on a new process node.
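The timeline reasoning above can be sketched as simple month arithmetic. This is only an illustration: it assumes "early 2018" means January through March 2018 and applies the 9-12 month lag from risk to volume production mentioned earlier in this post. The helper and constants are hypothetical names.

```python
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the first day of the month that is `months` after `start`."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


# Assumption: "early 2018" is read as January through March 2018.
RISK_PRODUCTION_WINDOW = (date(2018, 1, 1), date(2018, 3, 1))
# Lag from risk production to volume production, in months.
VOLUME_LAG_MONTHS = (9, 12)

earliest = add_months(RISK_PRODUCTION_WINDOW[0], VOLUME_LAG_MONTHS[0])
latest = add_months(RISK_PRODUCTION_WINDOW[1], VOLUME_LAG_MONTHS[1])

print(f"Volume production window: {earliest:%b %Y} to {latest:%b %Y}")
```

Retail products would still trail the start of volume production by some margin, which is where the early-to-mid 2019 guess comes from.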

garagisti · Sep 30, 2016

http://fudzilla.com/news/graphics/41724-vega-10-with-hbm2-to-launch-this-year
Vega 10 this year...

Phynaz · Sep 30, 2016

1. Ban for linking Fudzilla.
2. Announcement <> launch.

gamervivek · Sep 30, 2016

Arachnotronic said:
Navi is a 7nm GPU, so don't expect it until 2019 at the earliest. Not sure what AMD was thinking when it put out a roadmap claiming Navi in 1H 2018...

Curious whether explicitly calling it 7nm Navi means that AMD might try it out on 14nm first as well, or whether he just wants to emphasize the 7nm.

Det0x · Oct 6, 2016

WCC said:
Significantly Faster New GCN Architecture
The architecture has been considerably overhauled. It includes significant changes to the configuration and structure of SIMDs. Each SIMD is now capable of simultaneously processing variable length wavefronts. It also includes clever new coherency features to ensure peak stream processor occupancy in each compute unit at all times and reduce access times to cache and HBM. Memory delta color compression has also been improved. We’ll talk about the new architecture a lot more in-depth in a forthcoming article.

WCC said:
AMD’s most powerful GPU yet code-named Vega 10 is set to debut at the end of the year with Vega 11 following early next year. The company also has a new board code-named “Magnum” that will be showcased at SC 2016 this upcoming November. It’s not clear yet whether Vega 10 will make its debut alongside “Magnum” next month but several of our sources have confirmed that we’ll definitely see Vega 10 before year’s end.

Magnum is a unique chip, it features a matrix of logic blocks that can be configured and programmed individually for any desired application or program. In other words, it’s the company’s first ever FPGA and its greatest attempt yet to expand its penetration into the high performance embedded market.

http://wccftech.com/amd-vega-10-vega-11-magnum/

crisium · Oct 6, 2016

More hype!

Tread carefully. Do not forget what has happened in the past. A lot of "new this" and "new that" for Polaris simply meant it trades blows with Hawaii most of the time. Seeing as Vega 10 matches Fiji in shaders, ROPs and bandwidth, rather than falling short the way Polaris 10 does against Hawaii, I'm expecting decidedly better than Fiji this time around, but my range is not the moon.

Hope, but not hype for me.

IllogicalGlory · Oct 6, 2016

The RX 580 looks pretty great, but I would have hoped for an increase in memory bandwidth. I guess the new delta color compression will help. Hopefully it actually consumes 130W.
