Glo.
Diamond Member
> Cache config could indeed be different.
Ding, Ding, Ding.
> RDNA2's RT HW is inside the CU. 4 each in one CU.
> Cache config could indeed be different.
Yes, I'm aware of that. It still takes transistors.
> Yes, I'm aware of that. It still takes transistors.
For reference, Turing's RT hardware takes up about 3% of the die. Then another 5-6% goes to Tensor Cores.
> For reference, Turing's RT hardware takes up about 3% of the die. Then another 5-6% goes to Tensor Cores.
Are pathways and different cache structure also calculated in that area?
> Are pathways and different cache structure also calculated in that area?
> And I mean this as a question to anyone. I am quite curious about this in general.
It does not include pathways. It's the amount of die space taken by those cores themselves. Having the RT functionality built into the shader cores themselves is a much more efficient use of die space. And it may turn out to be more efficient from a performance standpoint too.
> Are pathways and different cache structure also calculated in that area?
> And I mean this as a question to anyone. I am quite curious about this in general.
Probably best you direct it to everyone, because I can't for the life of me remember now.
> Shouldn't it then be like this: Navi 23 - 48 CUs, Navi 22 - 64 CUs, Navi 21 - 96 CUs?
> Also: Navi 23 - 48 CUs, 192-bit GDDR6 bus; Navi 22 - 64 CUs, 256-bit GDDR6 bus; Navi 21 - 96 CUs, 384-bit GDDR6 bus?
> Shouldn't it be this way?
Well, my math failed: I was using 320 mm2, not the real 360 mm2 of the XSX.
> Well, my math failed: I was using 320 mm2, not the real 360 mm2 of the XSX.
> But if the XSX, with 56 CUs, a 320-bit IMC and all the other extras, is just 360 mm2, how is Navi 22 at 340 mm^2 just 64 CUs with a smaller memory bus than the XSX?
> It has to be more, like 80 CUs, and the big one some 120 CUs.
> It seems like a lot, but the XSX turned out so small that it would be impossible for AMD's own GPUs to be much bigger. Too bad we don't know the Sony PS5 die size to compare.
Riiight, and the small sub-200 mm2 die is 48 CUs?
> Riiight, and the small sub-200 mm2 die is 48 CUs? )))
Not sure, maybe even less; the fixed stuff that has to exist in all models takes up too much space.
> Riiight, and the small sub-200 mm2 die is 48 CUs? )))
> The math, assuming the CUs are 20% more transistor efficient, might be correct, and the counts are 48 CUs for the 240 mm2 die, 64 for N22 and 96 for N21.
> Safest bet after all is still: 40/60/80 CUs.
XBX size is only 361 mm2, and that includes 8 Zen cores, 56 RDNA2 CUs, a 320-bit GDDR6 bus, southbridge I/O, sound and who knows what else. So I just can't understand why a 340 mm2 RDNA GPU would only have 60-64 CUs, unless transistor density is different between them.
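For anyone who wants to sanity-check the CU-per-area argument, here is a minimal sketch that simply divides CU counts by die sizes. The XSX and rumored Navi figures are the ones quoted in this thread; the Navi 10 numbers (251 mm2, 40 CUs) are added for reference and do not come from the posts above. The function name `cus_per_mm2` is made up for illustration, and this naive ratio ignores that the XSX die also carries CPU cores and I/O.

```python
# Naive CU density: CU count divided by total die area.
# Rumored figures are taken from the thread and are pure speculation.
def cus_per_mm2(cu_count: int, die_mm2: float) -> float:
    return cu_count / die_mm2


chips = {
    "Navi 10 (reference, not from the thread)": (40, 251.0),
    "XSX whole SoC": (56, 360.0),
    "rumored 240 mm2 die": (48, 240.0),
    "rumored Navi 22": (64, 340.0),
}

densities = {name: cus_per_mm2(cus, area) for name, (cus, area) in chips.items()}

for name, density in densities.items():
    print(f"{name}: {density:.3f} CU/mm2")
```

On these numbers the rumored 240 mm2 / 48 CU part would need noticeably more CUs per mm2 than either Navi 10 or the whole XSX SoC, which is the crux of the disagreement.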
> XBX size is only 361 mm2, and that includes 8 Zen cores, 56 RDNA2 CUs, a 320-bit GDDR6 bus, southbridge I/O, sound and who knows what else. So I just can't understand why a 340 mm2 RDNA GPU would only have 60-64 CUs, unless transistor density is different between them.
xTor/mm2 is EXACTLY the same between Zen 2 CPUs and RDNA1 GPUs for desktop and Xbox Series X's die.
> Great job.
Well, it is a speculation job.
> BTW Isn't the picture from my link I posted in another thread?
They are sashleycat pictures indeed, same one from the link you have in the other thread.
> Well... I have lots of time and I thought I might take a stab at this (measuring a few blobs and pixelated die shots).
> I measured the various blocks for Navi 10 using the annotated die shots from Fritz, using ImageJ to measure the areas and work out the size of each block.
> View attachment 20419
> It turns out that the total CU area for Navi 10 is only 1/3 of its total die size, and that a single CU (1/2 WGP) is quite small, only ~2 mm2.
> So, to speculate on the size of Big Navi using Navi 10 dimensions:
> 1.5x Memory+PHY (384-bit), 1x IO, 2x RBE (128 ROPs), 4x L1, 2x ACE/HWS, 4x L2, 3x Raster/Primitive Unit and 2.5x WGP (100 CUs) comes to about 500 mm2. Keeping IO the same, added 128 bits more memory channel, doubled the ROPs, L1, L2. Tripled the raster.
> From the Microsoft XSX die picture (whose perspective I adjusted with GIMP), taking out the Zen 2 cores, the PHYs etc., we have 175-185 mm2 for the GPU cores if measured the same way as above.
> ASSUMING 65% of the GPU core part (outside of IO and PHYs) is for the CUs (using the ratio of Navi 10), we have around ~2.2 mm2 for each CU. It seems XSX is fairly space efficient even with the HW RT inside the CU.
> So based on this, I could speculate that if the die size of 505 mm2 +/- 5% is correct,
> 1.5x Memory+PHY (384-bit), 1x IO, 2x RBE (128 ROPs), 4x L1, 2x ACE/HWS, 4x L2, 3x Raster/Primitive Unit and 2.5x WGP (100 CUs) comes to about 505 mm2.
> ~100 CUs could indeed be possible, with double the ROPs, 384-bit GDDR and so on.
> And that is 25 mins of Excel math and GIMP/ImageJ skills put to speculation use.
> Block measurements from ImageJ and perspective-adjusted image from GIMP
Does all of this take into account RDNA2 having 20% more transistor-efficient/dense CUs, and that the caches are redesigned, again?
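The per-CU figures in the quoted post can be roughly reproduced with a few lines of arithmetic. This is only a sketch: the Navi 10 die size (251 mm2) and CU count (40) are not given in the post and are assumed here, and the post does not say whether the XSX estimate divides by the 56 CUs on the die or the 52 that are enabled, so both are shown. `area_per_cu` is a hypothetical helper name.

```python
# Area per CU = (measured area) * (fraction of that area occupied by CUs) / (CU count)
def area_per_cu(area_mm2: float, cu_fraction: float, cu_count: int) -> float:
    return area_mm2 * cu_fraction / cu_count


# Navi 10: assumed 251 mm2 die and 40 CUs; the post says CUs are ~1/3 of the die.
navi10_cu_mm2 = area_per_cu(251.0, 1 / 3, 40)

# XSX: 175-185 mm2 of GPU core area, 65% assumed to be CUs (both from the post).
# The divisor is an assumption: 56 CUs on die or 52 enabled.
xsx_cu_mm2 = {
    (area, cus): area_per_cu(area, 0.65, cus)
    for area in (175.0, 185.0)
    for cus in (52, 56)
}

print(f"Navi 10: {navi10_cu_mm2:.2f} mm2 per CU")
for (area, cus), value in sorted(xsx_cu_mm2.items()):
    print(f"XSX, {area:.0f} mm2 GPU area, {cus} CUs: {value:.2f} mm2 per CU")
```

The Navi 10 result lands close to the ~2 mm2 quoted, and the XSX range brackets the ~2.2 mm2 figure.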
> I don't see what's so freaking outlandish with that. Do you have any better comparison? It's true we don't know the exact size of the graphics part in XBX and never will, but just removing the cores should bring the size under 300 mm2, and there are still other things which are not present in a graphics chip. So I think even 72 CUs is not out of the question for Navi 22.
> Even if I am wrong it won't be a problem; this thread is for speculation.
Why is it outlandish?
> Does all of this take into account RDNA2 having 20% more transistor-efficient/dense CUs, and that the caches are redesigned, again?
I did that but then I got the number of CUs to 110.
> Why is it outlandish?
> Because it leads to the conclusion that a 240 mm2 die is a 48 CU chip with 192-bit GDDR6 that replaces... the RX 5500 XT, and is the rumored "Nvidia Killer", per sources of RedGamingTech.
> You see why it is outlandish? I cannot believe in this scenario.
I can imagine such a chip, but I don't believe they would sell it for the same price as the 5500 XT when it should have ~2x better performance.
> I did that but then I got the number of CUs to 110, so I stopped it at that.
Then let's stay with 96-100 CUs and a 512-bit GDDR6 bus.
> Great job. I don't think we would see 128 ROPs with just a 384-bit bus, not to mention 50% more bandwidth won't be enough to feed a 100 CU monster. 512-bit or HBM2 is the only possibility in my opinion. And the CUs will be bigger because of RT functionality, but still, interesting calculations.
> BTW Isn't the picture from my link I posted in another thread showing Navi 10?
I still think they will have a 384-bit GDDR6 controller as well as an HBM2 controller on the die. Active CU count will probably be around 80 for the top-end HBM GPU, with maybe a few more than that on the die. Any defect in the GDDR controller shouldn't affect the yield for the top-end HBM GPU model. If the top-end model has some disabled CUs, it could withstand a defect there as well. Any chip with a more significant CU defect, or a problem with the HBM controller, would be used for the GDDR6 GPU. In a perfect world, this might add up to an ~80 CU 505 mm^2 monster GPU that doesn't need to be priced north of a grand. Of course, in the real world it just means more profit for AMD.
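To put the bandwidth point in numbers, here is a minimal sketch of peak GDDR6 bandwidth as bus width times per-pin data rate. The 14 Gbps per-pin rate is an assumption (it matches the GDDR6 on the RX 5700 XT); faster memory would scale every figure proportionally. `peak_bandwidth_gbs` is an illustrative name, not anything from the thread.

```python
# Peak bandwidth in GB/s = bus width (bits) * per-pin data rate (Gbps) / 8 bits per byte.
# The 14 Gbps default is an assumed data rate, not a figure from the posts.
def peak_bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float = 14.0) -> float:
    return bus_width_bits * gbps_per_pin / 8


baseline = peak_bandwidth_gbs(256)

for width in (256, 384, 512):
    bandwidth = peak_bandwidth_gbs(width)
    print(f"{width}-bit: {bandwidth:.0f} GB/s ({bandwidth / baseline:.2f}x of 256-bit)")
```

At the same memory speed, going from 256-bit to 384-bit gives the 50% uplift mentioned in the quote, while 512-bit doubles it.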
> I did that but then I got the number of CUs to 110, so I stopped it at that.
A memory of the past throws us back to the ATI 4xxx series: a 2.5x increase in shader units for a 1.33x increase in area.
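That comparison boils down to one ratio. A tiny sketch, using only the 2.5x and 1.33x figures from the post above (the variable names are illustrative):

```python
# Implied gain in shader units per unit of die area for that generation jump.
shader_scale = 2.5   # increase in shader units, from the post
area_scale = 1.33    # increase in die area, from the post

density_gain = shader_scale / area_scale
print(f"shader units per unit area: {density_gain:.2f}x")
```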
> Well... I have lots of time and I thought I might take a stab at this (measuring a few blobs and pixelated die shots).
Wow man! You're on a roll!
> I can imagine such a chip, but I don't believe they would sell it for the same price as the 5500 XT when it should have ~2x better performance.
I figure current RDNA parts aren't going anywhere; they'll either be rebranded downward with a drop in price, or RDNA 2 will stack on top of them if they maintain their price.
> I figure current RDNA parts aren't going anywhere; they'll either be rebranded downward with a drop in price, or RDNA 2 will stack on top of them if they maintain their price.
If there is a 240 mm^2 RDNA2 GPU, the entire RDNA1 lineup is dead. All in one go. Worthless.
I think it really depends on where NV is in all the launch proceedings. If their RTX 3xxx series is still several months out from the RDNA2 launch, then AMD will go with the higher margins (so they can slash prices on the RTX 3xxx launch). If not, and they are closer than they appear, then AMD will make the more competitive play.