Question Speculation: RDNA2 + CDNA Architectures thread

uzzi38 · Apr 28, 2020

All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have backed up. Even so, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html

Glo. · Apr 30, 2020

DisEnchantment said:
Cache config could indeed be different.

Ding, Ding, Ding.

randomhero · Apr 30, 2020

DisEnchantment said:
RDNA2's RT HW is inside the CU. 4 each in one CU.
Cache config could indeed be different.

Yes, I'm aware of that. It still takes transistors.

uzzi38 · Apr 30, 2020

randomhero said:
Yes, I'm aware of that. It still takes transistors.

For reference Turing's takes up about 3% of the die. Then another 5-6% on Tensor Cores.

randomhero · Apr 30, 2020

uzzi38 said:
For reference Turing's takes up about 3% of the die. Then another 5-6% on Tensor Cores.

Are pathways and different cache structure also calculated in that area?
And I mean this as a question to anyone. I am quite curious about this in general.

Stuka87 · Apr 30, 2020

randomhero said:
Are pathways and different cache structure also calculated in that area?
And I mean this as a question to anyone. I am quite curious about this in general.

It does not include pathways. It's the amount of die space taken by those cores themselves. Having the RT functionality built into the shader cores themselves is a much more efficient use of die space. And it may turn out to be more efficient from a performance standpoint too.

uzzi38 · Apr 30, 2020

randomhero said:
Are pathways and different cache structure also calculated in that area?
And I mean this as a question to anyone. I am quite curious about this in general.

Probably best you direct it to everyone, because I can't for the life of me remember now

Been months since I heard the number.

RetroZombie · Apr 30, 2020

Glo. said:
Shouldn't it be like this: Navi 23 - 48 CUs, Navi 22 - 64 CUs, Navi 21 - 96 CUs?

Also: Navi 23 - 48 CUs, 192-bit GDDR6 bus; Navi 22 - 64 CUs, 256-bit GDDR6 bus; Navi 21 - 96 CUs, 384-bit GDDR6 bus?

Shouldn't it be this way?

Well, my math failed; I was using 320mm2, not the real 360mm2 of the XSX.

But if the XSX, with 56 CUs, a 320-bit IMC and all the other extras, is just 360mm2, how is Navi22 at 340mm^2 just 64 CUs with a smaller memory bus than the XSX?
It has to be more, like 80 CUs, and the big one some 120 CUs.

It seems like a lot, but the XSX got so small that it would be impossible for AMD's own GPUs to be much bigger. Too bad we don't know the Sony PS5 die size to compare.

Glo. · Apr 30, 2020

RetroZombie said:
Well, my math failed; I was using 320mm2, not the real 360mm2 of the XSX.

But if the XSX, with 56 CUs, a 320-bit IMC and all the other extras, is just 360mm2, how is Navi22 at 340mm^2 just 64 CUs with a smaller memory bus than the XSX?
It has to be more, like 80 CUs, and the big one some 120 CUs.

It seems like a lot, but the XSX got so small that it would be impossible for AMD's own GPUs to be much bigger. Too bad we don't know the Sony PS5 die size to compare.

Riiight, and the small sub 200 mm2 is 48 CU? 🙂)))

The math, assuming the CUs are 20% more transistor efficient might be correct and the counts are 240mm2 - 48 CUs, 64 for N22 and 96 for N21.

Safest bet after all is still: 40/60/80 CUs 🙂.

RetroZombie · Apr 30, 2020

Glo. said:
Riiight, and the small sub 200 mm2 is 48 CU? 🙂)))

Not sure, maybe even less; the fixed stuff that has to exist in all models takes too much space.
A great example of this is the very big dies of the small Nvidia GPUs, because of the space that fixed stuff occupies.

TESKATLIPOKA · May 1, 2020

Glo. said:
Riiight, and the small sub 200 mm2 is 48 CU? 🙂)))

The math, assuming the CUs are 20% more transistor efficient might be correct and the counts are 240mm2 - 48 CUs, 64 for N22 and 96 for N21.

Safest bet after all is still: 40/60/80 CUs 🙂.

XBX size is only 361mm2 and that size includes 8 Zen cores, 56CU RDNA2, 320bit GDDR6 bus, southbridge I/O, sound and who knows what else, then I just can't understand why 340mm2 RDNA GPU would only have 60-64CU unless transistor density would be different between them.

Glo. · May 1, 2020

TESKATLIPOKA said:
XBX size is only 361mm2 and that size includes 8 Zen cores, 56CU RDNA2, 320bit GDDR6 bus, southbridge I/O, sound and who knows what else, then I just can't understand why 340mm2 RDNA GPU would only have 60-64CU unless transistor density would be different between them.

xTor/mm2 is EXACTLY the same between Zen 2 CPUs and RDNA1 GPUs for desktop and Xbox Series X's die.

But it's so freaking outlandish to even think about the implications it brings for RDNA2.

TESKATLIPOKA · May 1, 2020

I don't see what's so freaking outlandish with that. Do you have any better comparison? It's true we don't know the exact size of the graphic part in XBX and never will, but just removing the cores should decrease the size under 300mm2 and there are still other things which are not present in a graphic chip. So I think even 72CU is not out of question for Navi 22.
Even If I am wrong It won't be a problem, this thread is for speculation.

DisEnchantment · May 1, 2020

Well... I have lots of time and I thought I might take a stab at this (measuring few blobs and pixelated die shots.)

I measured the various blocks for Navi 10 using the annotated die shots from Fritz and using ImageJ to measure the area to correlate and get the size of the blocks.

It turns out that total CU size for Navi10 is only 1/3 of its total die size and that a single CU (1/2 WGP) is quite small only ~2 mm2

So to speculate on the size for Big Navi... using Navi10 dimensions.
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 3x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 500mm2. Keeping IO same, added 128bit more channel, doubled the ROP, L1, L2. Tripled the Raster

From Microsoft XSX die picture (which I have manipulated the perspective with GIMP), we take out the Zen2 Core, the PHYs etc, we have 175-185mm2 for the GPU cores if measured the same way as above.
ASSUMING 65% of the GPU core part (outside of IO and PHYs) is for the CUs (using the ratio of Navi 10) we have around ~2.2 mm2 for each CU. This could be because it seems XSX is fairly space efficient even when the HW RT is inside the CU.

So based on this, I could speculate that if the die size of 505 +/- 5% is correct,
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 3x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 505mm2.
~100 CUs could indeed be possible, with double the ROPs, 384 Bit GDDR and so on.

And that is 25 mins of Excel math and GIMP/ImageJ skills put to speculation use.

Block measurements from ImageJ and Perspective changed image with GIMP
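The per-CU arithmetic in the estimate above can be sketched in a few lines. This is only a rough reconstruction, not the original spreadsheet: the Navi 10 die size (~251 mm2) and its 40-CU count are assumptions taken from public specs rather than from the post, and the 56-CU figure for the XSX is the one quoted elsewhere in this thread.

```python
# Back-of-the-envelope per-CU area, following the estimate in the post above.
# Assumptions (not stated in the post): Navi 10 is ~251 mm^2 with 40 CUs.
NAVI10_DIE_MM2 = 251.0
NAVI10_CUS = 40
NAVI10_CU_SHARE = 1 / 3            # "total CU size ... is only 1/3 of its total die size"

XSX_GPU_CORE_MM2 = (175.0, 185.0)  # measured range for the GPU core part
XSX_CU_SHARE = 0.65                # the post's ASSUMED share of that area used by CUs
XSX_CUS = 56                       # CU count quoted elsewhere in the thread


def area_per_cu(block_area_mm2, cu_share, cu_count):
    """Area of one CU if cu_share of block_area_mm2 is split over cu_count CUs."""
    return block_area_mm2 * cu_share / cu_count


navi10_cu_mm2 = area_per_cu(NAVI10_DIE_MM2, NAVI10_CU_SHARE, NAVI10_CUS)
xsx_cu_mm2 = tuple(area_per_cu(a, XSX_CU_SHARE, XSX_CUS) for a in XSX_GPU_CORE_MM2)

# Hypothetical Big Navi: 100 CUs at the post's ~2.2 mm^2 each on a 505 mm^2 die.
BIG_NAVI_DIE_MM2 = 505.0
big_navi_cu_area = 100 * 2.2
big_navi_cu_share = big_navi_cu_area / BIG_NAVI_DIE_MM2

print(f"Navi 10: {navi10_cu_mm2:.2f} mm^2 per CU")
print(f"XSX:     {xsx_cu_mm2[0]:.2f} to {xsx_cu_mm2[1]:.2f} mm^2 per CU")
print(f"100 CUs: {big_navi_cu_area:.0f} mm^2, {big_navi_cu_share:.0%} of the die")
```

Under these assumptions the Navi 10 figure lands close to the ~2 mm2 quoted, and the XSX range comes out slightly under the ~2.2 mm2 in the post, which is within the noise of measuring pixelated die shots. The 100 CUs alone would take well under half of a 505 mm2 die; the rest of the budget depends on the block multipliers listed in the post, whose individual areas are not given here.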

TESKATLIPOKA · May 1, 2020

Great job. I don't think we would see 128ROPs with just 384Bit bus, not to mention 50% more bandwidth won't be enough to feed a 100CU monster. 512Bit or HBM2 is the only possibility in my opinion. And CU will be bigger because of RT functionality, but still Interesting calculations.
BTW Isn't the picture from my link I posted in another thread showing Navi 10?
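The bandwidth concern can be put into rough numbers. The sketch below assumes 14 Gbps GDDR6 (the speed used on the RX 5700 XT; what Big Navi would ship with is unknown) and takes Navi 10 as 40 CUs on a 256-bit bus. The function name is made up for illustration.

```python
# Peak GDDR6 bandwidth per CU for a few hypothetical bus widths.
# Assumption: 14 Gbps per pin; the real Big Navi memory speed is not known.
def gddr6_bandwidth_gbs(bus_width_bits, data_rate_gbps=14.0):
    """Peak bandwidth in GB/s: (bus width in bytes) * (per-pin data rate)."""
    return bus_width_bits / 8 * data_rate_gbps


configs = {
    "Navi 10, 40 CUs, 256-bit": (40, 256),
    "100 CUs, 384-bit": (100, 384),
    "100 CUs, 512-bit": (100, 512),
}

per_cu = {}
for name, (cus, bus) in configs.items():
    bandwidth = gddr6_bandwidth_gbs(bus)
    per_cu[name] = bandwidth / cus
    print(f"{name}: {bandwidth:.0f} GB/s total, {per_cu[name]:.2f} GB/s per CU")
```

With the same memory speed, going from 256-bit to 384-bit is exactly the 50% bandwidth increase mentioned above, while the CU count grows 2.5x, so bandwidth per CU drops noticeably. Even a 512-bit bus would not restore Navi 10's per-CU figure in this simple model, which ignores caches and compression entirely.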

DisEnchantment · May 1, 2020

TESKATLIPOKA said:
Great job.

Well, it is a speculation job 😎

TESKATLIPOKA said:
BTW Isn't the picture from my link I posted in another thread?

They are sashleycat pictures indeed, same one from the link you have in the other thread.

TESKATLIPOKA · May 1, 2020

😀
I was happy when I found that picture. My thanks goes to Sashleycat for his great job on the picture and to Fritzchen Fritz for the picture.

Glo. · May 1, 2020

DisEnchantment said:
Well... I have lots of time and I thought I might take a stab at this (measuring few blobs and pixelated die shots.)
I measured the various blocks for Navi 10 using the annotated die shots from Fritz and using ImageJ to measure the area to correlate and get the size of the blocks.

It turns out that total CU size for Navi10 is only 1/3 of its total die size and that a single CU (1/2 WGP) is quite small only ~2 mm2

So to speculate on the size for Big Navi... using Navi10 dimensions.
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 4x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 500mm2. Keeping IO same, added 128bit more channel, doubled the ROP, L1, L2. Tripled the Raster

From Microsoft XSX die picture (which I have manipulated the perspective with GIMP), we take out the Zen2 Core, the PHYs etc, we have 175-185mm2 for the GPU cores if measured the same way as above.
ASSUMING 65% of the GPU core part (outside of IO and PHYs) is for the CUs (using the ratio of Navi 10) we have around ~2.2 mm2 for each CU. This could be because it seems XSX is fairly space efficient even when the HW RT is inside the CU.

So based on this, I could speculate that if the die size of 505 +/- 5% is correct,
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 4x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 505mm2.
~100 CUs could indeed be possible, with double the ROPs, 384 Bit GDDR and so on.

And that is 25 mins of Excel math and GIMP/ImageJ skills put to speculation use.

Block measurements from ImageJ and Perspective changed image with GIMP

Does all of this take into account RDNA2 having 20% more transistor efficient/Dense CUs, and that Caches are redesigned, again?

TESKATLIPOKA said:
I don't see what's so freaking outlandish with that. Do you have any better comparison? It's true we don't know the exact size of the graphic part in XBX and never will, but just removing the cores should decrease the size under 300mm2 and there are still other things which are not present in a graphic chip. So I think even 72CU is not out of question for Navi 22.
Even If I am wrong It won't be a problem, this thread is for speculation.

Why is it outlandish?

Because it creates a consequence that 240 mm2 die is 48 CU chip, with 192 Bit GDDR6 that replaces... RX 5500 XT, and is the rumored "Nvidia Killer", per sources of RedGamingTech.

You see why it is outlandish? I cannot believe in this scenario.

DisEnchantment · May 1, 2020

Glo. said:
Does all of this take into account RDNA2 having 20% more transistor efficient/Dense CUs, and that Caches are redesigned, again?

I did that but then I got the number of CUs to 110, so I stopped it at that.

TESKATLIPOKA · May 1, 2020

Glo. said:
Why is it outlandish?

Because it creates a consequence that 240 mm2 die is 48 CU chip, with 192 Bit GDDR6 that replaces... RX 5500 XT, and is the rumored "Nvidia Killer", per sources of RedGamingTech.

You see why it is outlandish? I cannot believe in this scenario.

I can imagine such a chip, but I don't believe they would sell it for the same price as the 5500 XT when it should have ~2x better performance.

TESKATLIPOKA · May 1, 2020

DisEnchantment said:
I did that but then I got the number of CUs to 110, so I stopped it at that.

Then let's stay with 96-100CU and 512bit GDDR6 bus. 😉

CastleBravo · May 1, 2020

TESKATLIPOKA said:
Great job. I don't think we would see 128ROPs with just 384Bit bus, not to mention 50% more bandwidth won't be enough to feed a 100CU monster. 512Bit or HBM2 is the only possibility in my opinion. And CU will be bigger because of RT functionality, but still Interesting calculations.
BTW Isn't the picture from my link I posted in another thread showing Navi 10?

I still think they will have a 384bit GDDR6 controller as well as an HBM2 controller on the die. Active CU count will probably be around 80 for the top end HBM GPU with maybe a few more than that on the die. Any defect in the GDDR controller shouldn't affect the yield for the top end HBM GPU model. If the top end model has some disabled CUs, it could withstand a defect there as well. Any chip with a more significant CU defect, or a problem with the HBM controller would be used for the GDDR6 GPU. In a perfect world, this might add up to an ~80 CU 505mm^2 monster GPU that doesn't need to be priced north of a grand. Of course, in the real world it just means more profit for AMD.

maddie · May 1, 2020

DisEnchantment said:
I did that but then I got the number of CUs to 110, so I stopped it at that.

A memory of the past throws us back to the ATI 4xxx series. A 2.5x increase in shader units for a 1.33x increase in area
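For a sense of scale, that historical jump can be compared with the Big Navi speculation in this thread. This is just a ratio calculation: the 2.5x and 1.33x figures come from the post above, while the 251 mm2 / 40 CU baseline for Navi 10 is an assumption from public specs and the 100 CU / 505 mm2 target is the thread's guess.

```python
# Gain in units per mm^2 implied by a unit-count ratio and a die-area ratio.
def density_gain(unit_ratio, area_ratio):
    """How much more compute per unit area the newer chip packs in."""
    return unit_ratio / area_ratio


# ATI 4xxx series, using the ratios quoted in the post above.
ati_gain = density_gain(2.5, 1.33)

# Speculated Big Navi vs Navi 10 (assumed 40 CUs on ~251 mm^2).
navi_gain = density_gain(100 / 40, 505 / 251)

print(f"ATI 4xxx jump:      {ati_gain:.2f}x shader units per mm^2")
print(f"Speculated Big Navi: {navi_gain:.2f}x CUs per mm^2")
```

Seen this way, a 100 CU part on a 505 mm2 die would need a much smaller per-area gain than the ATI 4xxx generation delivered, under the stated assumptions.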

randomhero · May 1, 2020

DisEnchantment said:
Well... I have lots of time and I thought I might take a stab at this (measuring few blobs and pixelated die shots.)
I measured the various blocks for Navi 10 using the annotated die shots from Fritz and using ImageJ to measure the area to correlate and get the size of the blocks.

It turns out that total CU size for Navi10 is only 1/3 of its total die size and that a single CU (1/2 WGP) is quite small only ~2 mm2

So to speculate on the size for Big Navi... using Navi10 dimensions.
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 3x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 500mm2. Keeping IO same, added 128bit more channel, doubled the ROP, L1, L2. Tripled the Raster

From Microsoft XSX die picture (which I have manipulated the perspective with GIMP), we take out the Zen2 Core, the PHYs etc, we have 175-185mm2 for the GPU cores if measured the same way as above.
ASSUMING 65% of the GPU core part (outside of IO and PHYs) is for the CUs (using the ratio of Navi 10) we have around ~2.2 mm2 for each CU. This could be because it seems XSX is fairly space efficient even when the HW RT is inside the CU.

So based on this, I could speculate that if the die size of 505 +/- 5% is correct,
1.5x Memory+PHY (384Bit), 1x IO, 2x RBE(128ROPs), 4x L1, 2x ACE/HWS, 3x L2, 3x Raster/Primitive Unit and 2.5x WGP(100CUs) comes to about 505mm2.
~100 CUs could indeed be possible, with double the ROPs, 384 Bit GDDR and so on.

And that is 25 mins of Excel math and GIMP/ImageJ skills put to speculation use.

Block measurements from ImageJ and Perspective changed image with GIMP

Wow man! You're on a roll!
This is like 3rd or 4th great post in last two days!
Well done, well done!

GodisanAtheist · May 1, 2020

TESKATLIPOKA said:
I can imagine such a chip, but I don't believe they would sell it for the same price as the 5500 XT when it should have ~2x better performance.

- I figure current RDNA parts aren't going anywhere, they'll either be rebranded downward with a drop in price or RDNA 2 will stack on top of them if they maintain their price.

I think it really depends on where NV is in all the launch proceedings. If their RTX 3xxx series are still several months out from the RDNA2 launch, then AMD will go with the higher margins (so they can slash prices on RTX 3xxx launch), if not and they are closer than they appear, then they will make the more competitive play.

uzzi38 · May 1, 2020

GodisanAtheist said:
- I figure current RDNA parts aren't going anywhere, they'll either be rebranded downward with a drop in price or RDNA 2 will stack on top of them if they maintain their price.

I think it really depends on where NV is in all the launch proceedings. If their RTX 3xxx series are still several months out from the RDNA2 launch, then AMD will go with the higher margins (so they can slash prices on RTX 3xxx launch), if not and they are closer than they appear, then they will make the more competitive play.

If there is a 240mm^2 RDNA2 GPU, the entire RDNA1 lineup is dead. All in one go. Worthless.
