
Question Speculation: RDNA2 + CDNA Architectures thread


randomhero

Member
Apr 28, 2020
118
181
76
For reference, Turing's RT cores take up about 3% of the die. Then another 5-6% goes to Tensor Cores.
Are pathways and different cache structure also calculated in that area?
And I mean this as a question to anyone. I am quite curious about this in general.
 

Stuka87

Diamond Member
Dec 10, 2010
5,488
1,277
136
Are pathways and different cache structure also calculated in that area?
And I mean this as a question to anyone. I am quite curious about this in general.
It does not include pathways. It's the amount of die space taken by those cores themselves. Having the RT functionality built into the shader cores themselves is a much more efficient use of die space. And it may turn out to be more efficient from a performance standpoint too.
 

uzzi38

Golden Member
Oct 16, 2019
1,817
3,654
116
Are pathways and different cache structure also calculated in that area?
And I mean this as a question to anyone. I am quite curious about this in general.
Probably best you direct it to everyone, because I can't for the life of me remember now.

Been months since I heard the number.
 

RetroZombie

Senior member
Nov 5, 2019
464
386
96
Shouldn't it be like this: Navi 23 - 48 CUs, Navi 22 - 64 CUs, Navi 21 - 96 CUs?

Also: Navi 23 - 48 CUs, 192-bit GDDR6 bus, Navi 22 - 64 CUs, 256-bit GDDR6 bus, Navi 21 - 96 CUs, 384-bit GDDR6 bus?

Shouldn't it be this way?
Well, my math failed: I was using 320mm2, not the real 360mm2 of the XSX.

But if the XSX, with 56CUs, a 320-bit IMC and all the other extras, is just 360mm2, how can Navi 22 at 340mm^2 have just 64CUs with a smaller memory bus than the XSX?
It has to be more, like 80CUs, and the big one some 120CUs.

It seems like a lot, but the XSX turned out so small that it would be impossible for AMD's own GPUs to be much bigger. Too bad we don't know the Sony PS5 die size, just to compare.
 

Glo.

Diamond Member
Apr 25, 2015
4,835
3,457
136
Well, my math failed: I was using 320mm2, not the real 360mm2 of the XSX.

But if the XSX, with 56CUs, a 320-bit IMC and all the other extras, is just 360mm2, how can Navi 22 at 340mm^2 have just 64CUs with a smaller memory bus than the XSX?
It has to be more, like 80CUs, and the big one some 120CUs.

It seems like a lot, but the XSX turned out so small that it would be impossible for AMD's own GPUs to be much bigger. Too bad we don't know the Sony PS5 die size, just to compare.
Riiight, and the small sub 200 mm2 is 48 CU? :))))

The math, assuming the CUs are 20% more transistor efficient might be correct and the counts are 240mm2 - 48 CUs, 64 for N22 and 96 for N21.

Safest bet after all is still: 40/60/80 CUs :).
 

RetroZombie

Senior member
Nov 5, 2019
464
386
96
Riiight, and the small sub 200 mm2 is 48 CU? :))))
Not sure, maybe even less; the fixed stuff that has to exist in all models takes up too much space.
A great example of this is the very big dies of the small Nvidia GPUs, because of the space that fixed stuff occupies.
 

TESKATLIPOKA

Senior member
May 1, 2020
450
440
96
Riiight, and the small sub 200 mm2 is 48 CU? :))))

The math, assuming the CUs are 20% more transistor efficient might be correct and the counts are 240mm2 - 48 CUs, 64 for N22 and 96 for N21.

Safest bet after all is still: 40/60/80 CUs :).
XBX size is only 361mm2, and that includes 8 Zen cores, 56CU RDNA2, a 320-bit GDDR6 bus, southbridge I/O, sound and who knows what else. So I just can't understand why a 340mm2 RDNA GPU would only have 60-64CU, unless the transistor density were different between them.
 

Glo.

Diamond Member
Apr 25, 2015
4,835
3,457
136
XBX size is only 361mm2, and that includes 8 Zen cores, 56CU RDNA2, a 320-bit GDDR6 bus, southbridge I/O, sound and who knows what else. So I just can't understand why a 340mm2 RDNA GPU would only have 60-64CU, unless the transistor density were different between them.
xTor/mm2 is EXACTLY the same between Zen 2 CPUs and RDNA1 GPUs for desktop and Xbox Series X's die.

But it's so freaking outlandish to even think about the implications it brings for RDNA2.
 

TESKATLIPOKA

Senior member
May 1, 2020
450
440
96
I don't see what's so freaking outlandish about that. Do you have any better comparison? It's true we don't know the exact size of the graphics part in XBX and never will, but just removing the CPU cores should bring the size under 300mm2, and there are still other things which are not present in a graphics chip. So I think even 72CU is not out of the question for Navi 22.
Even if I am wrong it won't be a problem; this thread is for speculation.
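
The scaling argument above can be put into a quick proportional estimate. This is only a rough sketch: it assumes CU count scales linearly with the area available for graphics, and the graphics-only area of the 361mm2 console die (`gpu_area_mm2`) is a hypothetical input, since the real figure is not known.

```python
def scaled_cu_count(target_area_mm2: float, ref_area_mm2: float, ref_cus: int) -> float:
    """Scale a reference CU count linearly with area (a crude assumption)."""
    return ref_cus * target_area_mm2 / ref_area_mm2


CONSOLE_CUS = 56        # CU count quoted in the thread
NAVI22_AREA_MM2 = 340.0  # rumored die size discussed in the thread

# Hypothetical guesses for the graphics portion of the 361 mm2 console die.
for gpu_area_mm2 in (361.0, 300.0, 265.0):
    estimate = scaled_cu_count(NAVI22_AREA_MM2, gpu_area_mm2, CONSOLE_CUS)
    print(f"{gpu_area_mm2:.0f} mm2 graphics area -> ~{estimate:.0f} CUs in 340 mm2")
```

Under these assumptions, 72CU in 340mm2 would need the graphics portion of the console die to be roughly 265mm2; with 300mm2 the same scaling gives about 63CU.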
 

DisEnchantment

Senior member
Mar 3, 2017
908
2,374
136
Well... I have lots of time and I thought I might take a stab at this (measuring a few blobs and pixelated die shots.) :tongueclosed: :tongueclosed:
I measured the various blocks of Navi 10 on the annotated die shots from Fritz, using ImageJ to measure the areas and correlate them to get the size of the blocks.

1588338250797.png

It turns out that the total CU area of Navi 10 is only about 1/3 of the die, and that a single CU (1/2 WGP) is quite small, only ~2 mm2.
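
A quick sanity check of that ratio, as a sketch only: it assumes the full 40 CU Navi 10 configuration and the published 251 mm2 die size, neither of which is stated in this post, together with the ~2 mm2 per-CU figure measured above.

```python
NAVI10_DIE_MM2 = 251.0  # assumption: published Navi 10 die size
NAVI10_CUS = 40         # assumption: full Navi 10 configuration
CU_AREA_MM2 = 2.0       # approximate per-CU area from the measurement above


def cu_share(cus: int, cu_area_mm2: float, die_mm2: float) -> float:
    """Fraction of the die occupied by the CUs."""
    return cus * cu_area_mm2 / die_mm2


share = cu_share(NAVI10_CUS, CU_AREA_MM2, NAVI10_DIE_MM2)
print(f"CUs: {NAVI10_CUS * CU_AREA_MM2:.0f} mm2, {share:.0%} of the die")
```

With those inputs the CUs come to about 80 mm2, or roughly 32% of the die, which lines up with the 1/3 figure.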

So to speculate on the size for Big Navi... using Navi10 dimensions.
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 3x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 500mm2. Keeping IO same, added 128bit more channel, doubled the ROP, L1, L2. Tripled the Raster

From Microsoft's XSX die picture (whose perspective I corrected with GIMP), if we take out the Zen 2 cores, the PHYs, etc., we have 175-185mm2 for the GPU cores, measured the same way as above.
ASSUMING 65% of the GPU core part (outside of IO and PHYs) is for the CUs (using the ratio of Navi 10), we have around ~2.2 mm2 for each CU. So it seems the XSX is fairly space efficient, even with the HW RT inside the CU.

So based on this, I could speculate that if the die size of 505mm2 +/- 5% is correct,
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 3x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 505mm2.
~100 CUs could indeed be possible, with double the ROPs, 384-bit GDDR and so on.

And that is 25 mins of Excel math and GIMP/ImageJ skills put to speculation use.
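
The per-CU estimate for the XSX can be retraced in a few lines. This is a rough sketch rather than the actual spreadsheet: it assumes 56 physical CUs on the XSX die (the count quoted elsewhere in the thread) and reuses the 175-185mm2 range and the 65% ratio from above.

```python
def per_cu_area(gpu_core_mm2: float, cu_fraction: float, cus: int) -> float:
    """Area per CU if cu_fraction of the GPU core area belongs to the CUs."""
    return gpu_core_mm2 * cu_fraction / cus


CU_FRACTION = 0.65  # ratio borrowed from the Navi 10 measurement
XSX_CUS = 56        # assumption: physical CUs on the XSX die

low = per_cu_area(175.0, CU_FRACTION, XSX_CUS)
high = per_cu_area(185.0, CU_FRACTION, XSX_CUS)
print(f"per-CU area: {low:.2f} to {high:.2f} mm2")

# CU area budget for a 100 CU part at ~2.2 mm2 per CU
big_navi_cu_mm2 = 100 * 2.2
print(f"100 CUs: {big_navi_cu_mm2:.0f} mm2, {big_navi_cu_mm2 / 505:.0%} of 505 mm2")
```

With 56 CUs this lands at about 2.0 to 2.15 mm2 per CU, a little under the ~2.2 mm2 quoted, so the spreadsheet presumably used slightly different inputs. At ~2.2 mm2 each, 100 CUs would take about 220 mm2, or roughly 44% of a 505mm2 die.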

Block measurements from ImageJ and Perspective changed image with GIMP
1588331105749.png
1588328918598.png
 

TESKATLIPOKA

Senior member
May 1, 2020
450
440
96
Great job. I don't think we would see 128ROPs with just a 384-bit bus, not to mention that 50% more bandwidth won't be enough to feed a 100CU monster. 512-bit or HBM2 is the only possibility in my opinion. And the CUs will be bigger because of the RT functionality, but still, interesting calculations.
BTW, isn't the Navi 10 picture the one from the link I posted in another thread?
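
The bandwidth concern above can be put in rough numbers. A minimal sketch, assuming 14 Gbps GDDR6 on every configuration (the speed used on the RX 5700 XT); faster memory would shift the results, and peak bandwidth per CU is only a crude proxy for how well a GPU is fed.

```python
def gddr6_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float = 14.0) -> float:
    """Peak bandwidth in GB/s: bus width (bits) x per-pin rate (Gbit/s) / 8."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, CU count); the 100 CU parts are the speculated ones
configs = {
    "Navi 10 (256-bit, 40 CUs)": (256, 40),
    "384-bit, 100 CUs": (384, 100),
    "512-bit, 100 CUs": (512, 100),
}
for name, (bus, cus) in configs.items():
    bw = gddr6_bandwidth_gbs(bus)
    print(f"{name}: {bw:.0f} GB/s, {bw / cus:.1f} GB/s per CU")
```

Under these assumptions a 384-bit bus gives 672 GB/s, which is 1.5x Navi 10's 448 GB/s spread over 2.5x the CUs, so bandwidth per CU drops from about 11.2 to about 6.7 GB/s. Even 512-bit would only reach about 9.0 GB/s per CU.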
 

TESKATLIPOKA

Senior member
May 1, 2020
450
440
96
:D
I was happy when I found that picture. My thanks go to Sashleycat for his great job on the picture and to Fritzchen Fritz for the picture itself.
 

Glo.

Diamond Member
Apr 25, 2015
4,835
3,457
136
Well... I have lots of time and I thought I might take a stab at this (measuring few blobs and pixelated die shots.) :tongueclosed: :tongueclosed:
I measured the various blocks for Navi 10 using the annotated die shots from Fritz and using ImageJ to measure the area to correlate and get the size of the blocks.

View attachment 20419

It turns out that total CU size for Navi10 is only 1/3 of its total die size and that a single CU (1/2 WGP) is quite small only ~2 mm2

So to speculate on the size for Big Navi... using Navi10 dimensions.
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 4x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 500mm2. Keeping IO same, added 128bit more channel, doubled the ROP, L1, L2. Tripled the Raster

From Microsoft XSX die picture (which I have manipulated the perspective with GIMP), we take out the Zen2 Core, the PHYs etc, we have 175-185mm2 for the GPU cores if measured the same way as above.
ASSUMING 65% of the GPU core part (outside of IO and PHYs) is for the CUs (using the ratio of Navi 10) we have around ~2.2 mm2 for each CU. This could be because it seems XSX is fairly space efficient even when the HW RT is inside the CU.

So based on this, I could speculate that if the die size of 505 +/- 5% is correct,
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 4x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 505mm2.
~100 CUs could indeed be possible, with double the ROPs, 384 Bit GDDR and so on.

And that is 25 mins of Excel math and GIMP/ImageJ skills put to speculation use.

Block measurements from ImageJ and Perspective changed image with GIMP
Does all of this take into account RDNA2 having 20% more transistor-efficient/dense CUs, and that the caches are redesigned, again?

I don't see what's so freaking outlandish about that. Do you have any better comparison? It's true we don't know the exact size of the graphics part in XBX and never will, but just removing the CPU cores should bring the size under 300mm2, and there are still other things which are not present in a graphics chip. So I think even 72CU is not out of the question for Navi 22.
Even if I am wrong it won't be a problem; this thread is for speculation.
Why is it outlandish?

Because the consequence is that the 240 mm2 die is a 48 CU chip with 192-bit GDDR6 that replaces... the RX 5500 XT, and is the rumored "Nvidia Killer", per RedGamingTech's sources.

You see why it is outlandish? I cannot believe in this scenario.
 

TESKATLIPOKA

Senior member
May 1, 2020
450
440
96
Why is it outlandish?

Because the consequence is that the 240 mm2 die is a 48 CU chip with 192-bit GDDR6 that replaces... the RX 5500 XT, and is the rumored "Nvidia Killer", per RedGamingTech's sources.

You see why it is outlandish? I cannot believe in this scenario.
I can imagine such a chip, but I don't believe they would sell it for the same price as the 5500 XT when it should have ~2x better performance.
 

CastleBravo

Member
Dec 6, 2019
119
271
96
Great job. I don't think we would see 128ROPs with just a 384-bit bus, not to mention that 50% more bandwidth won't be enough to feed a 100CU monster. 512-bit or HBM2 is the only possibility in my opinion. And the CUs will be bigger because of the RT functionality, but still, interesting calculations.
BTW, isn't the Navi 10 picture the one from the link I posted in another thread?
I still think they will have a 384bit GDDR6 controller as well as an HBM2 controller on the die. Active CU count will probably be around 80 for the top end HBM GPU with maybe a few more than that on the die. Any defect in the GDDR controller shouldn't affect the yield for the top end HBM GPU model. If the top end model has some disabled CUs, it could withstand a defect there as well. Any chip with a more significant CU defect, or a problem with the HBM controller would be used for the GDDR6 GPU. In a perfect world, this might add up to an ~80 CU 505mm^2 monster GPU that doesn't need to be priced north of a grand. Of course, in the real world it just means more profit for AMD.
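
The binning idea above could be sketched like this. It is purely illustrative: the function name, the 80 CU threshold for the HBM2 model and the 72 CU threshold for the GDDR6 model are all hypothetical, not known product specs.

```python
def assign_bin(good_cus: int, hbm_ok: bool, gddr_ok: bool,
               top_cus: int = 80, salvage_cus: int = 72) -> str:
    """Pick a product bin for one die; both CU thresholds are hypothetical."""
    if hbm_ok and good_cus >= top_cus:
        # A defect in the GDDR6 controller does not matter for this model.
        return "HBM2 top model"
    if gddr_ok and good_cus >= salvage_cus:
        # Absorbs dies with HBM2 controller defects or larger CU defects.
        return "GDDR6 model"
    return "scrap"


print(assign_bin(84, hbm_ok=True, gddr_ok=False))  # HBM2 top model
print(assign_bin(84, hbm_ok=False, gddr_ok=True))  # GDDR6 model
print(assign_bin(76, hbm_ok=True, gddr_ok=True))   # GDDR6 model
```

The point of the sketch is that a die only ends up as scrap when both paths fail, which is where the yield benefit of carrying two memory controllers would come from.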
 

randomhero

Member
Apr 28, 2020
118
181
76
Well... I have lots of time and I thought I might take a stab at this (measuring few blobs and pixelated die shots.) :tongueclosed: :tongueclosed:
I measured the various blocks for Navi 10 using the annotated die shots from Fritz and using ImageJ to measure the area to correlate and get the size of the blocks.

View attachment 20421

It turns out that total CU size for Navi10 is only 1/3 of its total die size and that a single CU (1/2 WGP) is quite small only ~2 mm2

So to speculate on the size for Big Navi... using Navi10 dimensions.
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 3x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 500mm2. Keeping IO same, added 128bit more channel, doubled the ROP, L1, L2. Tripled the Raster

From Microsoft XSX die picture (which I have manipulated the perspective with GIMP), we take out the Zen2 Core, the PHYs etc, we have 175-185mm2 for the GPU cores if measured the same way as above.
ASSUMING 65% of the GPU core part (outside of IO and PHYs) is for the CUs (using the ratio of Navi 10) we have around ~2.2 mm2 for each CU. This could be because it seems XSX is fairly space efficient even when the HW RT is inside the CU.

So based on this, I could speculate that if the die size of 505 +/- 5% is correct,
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 3x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 505mm2.
~100 CUs could indeed be possible, with double the ROPs, 384 Bit GDDR and so on.

And that is 25 mins of Excel math and GIMP/ImageJ skills put to speculation use.

Block measurements from ImageJ and Perspective changed image with GIMP
Wow man! You're on a roll!
This is like the 3rd or 4th great post in the last two days!
Well done, well done!
 

GodisanAtheist

Diamond Member
Nov 16, 2006
3,361
1,914
136
I can imagine such a chip, but I don't believe they would sell it for the same price as the 5500 XT when it should have ~2x better performance.
- I figure current RDNA parts aren't going anywhere, they'll either be rebranded downward with a drop in price or RDNA 2 will stack on top of them if they maintain their price.

I think it really depends on where NV is in all the launch proceedings. If their RTX 3xxx series is still several months out from the RDNA2 launch, then AMD will go with the higher margins (so they can slash prices at the RTX 3xxx launch). If not, and the two launches are closer than they appear, then AMD will make the more competitive play.
 

uzzi38

Golden Member
Oct 16, 2019
1,817
3,654
116
- I figure current RDNA parts aren't going anywhere, they'll either be rebranded downward with a drop in price or RDNA 2 will stack on top of them if they maintain their price.

I think it really depends on where NV is in all the launch proceedings. If their RTX 3xxx series is still several months out from the RDNA2 launch, then AMD will go with the higher margins (so they can slash prices at the RTX 3xxx launch). If not, and the two launches are closer than they appear, then AMD will make the more competitive play.
If there is a 240mm^2 RDNA2 GPU, the entire RDNA1 lineup is dead. All in one go. Worthless.
 
