Question Speculation: RDNA2 + CDNA Architectures thread

uzzi38

Platinum Member
Oct 16, 2019
2,556
5,531
146
All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have backed up. Even still though, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

Hitman928

Diamond Member
Apr 15, 2012
5,160
7,595
136
So Navi21 basically twice the size of 5700XT. If they can get even 70% performance scaling from that, it's going to be a beast of a gaming chip and hold the performance crown until whenever the next gen Nvidia GPU launches.
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
That's weird, listing die sizes on a shopping site? And with no units?
 

uzzi38

Ajay said: "That's weird, listing die sizes on a shopping site? And with no units?"
This site gets leaks posted on it all the time; for example, everything Sharkbay posts, he posts on here.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,587
5,703
136
uzzi38 said: "All die sizes are within 5mm^2"
I imagine he said this because he was only eyeballing it, or took a pic, and did not have the official dimensions.

Anyone care to speculate about possible CU counts, solely based on XSX die size, and CU density?

Assuming 320mm^2 for XSX GPU with 56CUs (including disabled ones), and assuming memory controllers scale linearly with CU count (because why not :D)

Navi21 - 505mm^2 = 88.3CUs
Navi22 - 340mm^2 = 59.5CUs
Navi23 - 240mm^2 = 42CUs
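
For anyone who wants to redo the arithmetic, here is a minimal sketch of the linear scaling above. It assumes the same inputs as the post (320mm^2 and 56 CUs for the XSX GPU) and that area scales linearly with CU count; the function and variable names are made up for illustration, and none of the inputs are official figures.

```python
# Rough CU estimate by linear area scaling from the XSX GPU.
# Assumptions (taken from the post, not official figures): the XSX GPU
# portion is 320 mm^2 with 56 CUs, and area (memory controllers included)
# scales linearly with CU count.

XSX_GPU_AREA_MM2 = 320.0
XSX_CU_COUNT = 56

# Rumored die sizes from the linked post.
RUMORED_DIE_SIZES_MM2 = {"Navi21": 505.0, "Navi22": 340.0, "Navi23": 240.0}


def estimate_cus(die_area_mm2, ref_area_mm2=XSX_GPU_AREA_MM2, ref_cus=XSX_CU_COUNT):
    """Scale the reference CU count by the ratio of die areas."""
    return die_area_mm2 / ref_area_mm2 * ref_cus


if __name__ == "__main__":
    for name, area in RUMORED_DIE_SIZES_MM2.items():
        print(f"{name} - {area:.0f}mm^2 = {estimate_cus(area):.1f} CUs")
```

This gives roughly 88.4, 59.5 and 42 CUs. Real parts would land on whole CU counts, so treat these as ballpark figures only.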

It makes sense for the Navi23 part to be the smallest considering the planned integration with Cezanne H.
Also interesting in the thread is the mention that Navi2X is no longer 7nm according to TSMC's definition.

The good indications, at least for clock speeds, are that the PS5 GPU is running at 2.23 GHz in a console, and David Wang's verbal statement of "multi-gigahertz" frequencies for RDNA2 during FAD 2020.
So for desktop it would be safe to expect 2.3 GHz for the smallest Navi2X die.

TSMC's N7 went into HVM in April 2018 and the Instinct MI60 was launched in November 2018, so I have a small hope that CDNA1 could likewise launch on TSMC's N5 or N6.

AMD's quarterly report is in a few hours, and then we will know if console chips were also being shipped.

PS:
@uzzi38
Should have made a more generic RDNA2 (and CDNA) thread. We can discuss more things besides die sizes :).
You can edit the thread title to something like Speculation: Next Gen RDNA2/CDNA1 thread.
 

Glo.

Diamond Member
Apr 25, 2015
5,642
4,379
136
Quote: "I'm expecting the desktop chips to be less dense than the consoles for clock reasons."
Then they would have to be less dense than RDNA1 GPUs, considering the XSX has the same xTor/mm2 density as Navi 10 GPUs.

And that is not possible, because RDNA2 CUs appear to be around 20% more transistor efficient than RDNA1 CUs.
 

randomhero

Member
Apr 28, 2020
180
247
86
Hitman928 said: "So Navi21 basically twice the size of 5700XT. If they can get even 70% performance scaling from that, it's going to be a beast of a gaming chip and hold the performance crown until whenever the next gen Nvidia GPU launches."

Well, if Nvidia squeezed around 60% more performance out of the 2080 Ti than the 2060, I sure hope they manage at least that with a new arch and twice the size.
 

DisEnchantment

I hope we can expect HBM to get cheaper this year: the Pyeongtaek fab (P1) has been going full steam for a few quarters now, and a second one (P2), launched on the same plot, will start producing memory this year, in addition to the 450K wpm currently being handled by the single P1 fab.


At the same time Xi'an is also ramping up. The Chinese govt. is allowing Samsung Foundry engineers to move freely even during the COVID situation. This fab might not produce memory, but it will take wafer share from the Pyeongtaek fabs to increase memory chip output.

Samsung's memory operations are ginormous.

Micron is also joining the HBM bandwagon, in addition to Samsung and SK Hynix, and is starting operations this year.


X3D architecture could not have come at a better time.
 

Glo.

Considering how GDDR6 will get cheaper in the upcoming months/years with sheer volume manufacturing, I would not get my hopes up for HBM2 in any consumer products, unless...

... any of the companies gets extremely desperate to remain relevant in the consumer space for one reason or another.
 

Ajay

Quote: "I think the idea of anything tech related getting cheaper any time soon is unlikely. Cheaper than the inflated prices we have now, I can see that. Cheaper than pre-virus, unlikely."
Yeah, 7nm is pretty expensive. DRAM is cheaper, but AIBs will probably just grab the extra margins instead of lowering prices.
 

DisEnchantment

Glo. said:
"Considering how GDDR6 will get cheaper in the upcoming months/years with sheer volume manufacturing, I would not get my hopes up for HBM2 in any consumer products, unless...

... any of the companies gets extremely desperate to remain relevant in the consumer space for one reason or another."

Mmmmmm .... I am not specifically talking about HBM for consumer devices. Of course CDNA is not a consumer product and the current Instinct series uses HBM2.

Having said that, it remains to be seen how expensive HBM2e is going to be. Right now there is excess DRAM capacity even without the P2 fab coming online.
HBM2 wafer availability has increased many times over what was available two years ago, because of numerous fab lines coming online, like the gigantic fabs I mentioned in the sources above.
Additionally, right now for the CDNA GPUs, TSMC can do HBM2 integration from the KGSD dies with CoWoS in-house if needed.
Compare this to V64, where they needed to get the KGSDs from Hynix, the GPU die from GloFo and the interposer from UMC, and send them all to SPIL for integration.

As regards HBM2e vs DDR5 in the context of X3D and CDNA2, it is going to be an interesting comparison. DDR5 is more complex than DDR4: there is an onboard PMIC and ECC for reads and writes, plus double the channels, so it will be costlier to manufacture.

In this context of X3D with HBM2e, for example, 4 x 1-Hi stacks (16Gb per die and 4096 bits wide, 4x1024) would not even need TSVs between stacked dies, and the base die could already be an interposer. For this specific case of 8GB of on-chip memory it is basically four single DRAM dies and everything else is already there. If they can hit those 3.8 Gbps (SK Hynix advertised max speed) or 4.1 Gbps (Samsung advertised max speed), it can be a mind-boggling ~2 TB/sec.
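
A quick sanity check of that ~2 TB/sec figure, as a minimal sketch. It only uses the numbers quoted above (4 x 1024-bit interface, 16Gb dies, 3.8 and 4.1 Gbps per pin), which are taken from the post rather than verified vendor specs; the helper names are hypothetical.

```python
# Theoretical peak bandwidth of a wide HBM interface:
#   bus width (bits) x per-pin data rate (Gbps) / 8 bits per byte.
# Assumptions: the 4 x 1024-bit configuration, 16 Gb dies and the per-pin
# speeds are the ones quoted in the post, not verified vendor specs.


def peak_bandwidth_gb_s(bus_width_bits, pin_rate_gbps):
    """Return theoretical peak bandwidth in GB/s (decimal gigabytes)."""
    return bus_width_bits * pin_rate_gbps / 8


def capacity_gb(num_dies, die_density_gbit):
    """Return total capacity in GB for num_dies DRAM dies of the given density."""
    return num_dies * die_density_gbit / 8


if __name__ == "__main__":
    bus_width = 4 * 1024  # four stacks, 1024 bits each
    print(f"Capacity: {capacity_gb(4, 16):.0f} GB")
    for rate in (3.8, 4.1):
        tb_s = peak_bandwidth_gb_s(bus_width, rate) / 1000
        print(f"{rate} Gbps per pin: {tb_s:.2f} TB/s")
```

Both speeds land within a few percent of 2 TB/sec, one just under and one just over. These are peak numbers; sustained bandwidth would be lower.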

Furthermore, if we want to take this concept further, a special SoC on N5 could have enough DRAM for GPU and system, and enough density to pack in a GPU to boot. E.g. XSX SoC + RAM on N5 would be around 300mm2, which is manageable... got a bit ahead of myself here.
 

RetroZombie

Senior member
Nov 5, 2019
464
386
96
DisEnchantment said:
"Assuming 320mm^2 for XSX GPU with 56CUs (including disabled ones), and assuming memory controllers scale linearly with CU count (because why not :D)

Navi21 - 505mm^2 = 88.3CUs
Navi22 - 340mm^2 = 59.5CUs
Navi23 - 240mm^2 = 42CUs"

And that area also includes the CPU cores, all I/O (PCIe, USB, ...), special sound units, MS's own mojo, ...

My math is different from yours: assuming a fixed 120mm2 for the uncore and leaving the rest just for the units would give:
Navi21* - 505mm^2 = 100CUs
Navi22 - 340mm^2 = 64CUs
Navi23 - 240mm^2 = 40CUs

Used the XSX as reference, with 200mm2 for the units.
*Two different memory types supported on this one.
 

Glo.

RetroZombie said:
"And that area also includes the CPU cores, all I/O (PCIe, USB, ...), special sound units, MS's own mojo, ...

My math is different from yours: assuming a fixed 120mm2 for the uncore and leaving the rest just for the units would give:
Navi21* - 505mm^2 = 100CUs
Navi22 - 340mm^2 = 64CUs
Navi23 - 240mm^2 = 40CUs

Used the XSX as reference, with 200mm2 for the units.
*Two different memory types supported on this one."

Shouldn't it be like this: Navi 23 - 48 CUs, Navi 22 - 64 CUs, Navi 21 - 96 CUs?

Also: Navi 23 - 48 CUs, 192-bit GDDR6 bus; Navi 22 - 64 CUs, 256-bit GDDR6 bus; Navi 21 - 96 CUs, 384-bit GDDR6 bus?

Shouldn't it be this way? And this is only speculation. CU counts for consumer products are a closely kept secret, for now.
 

randomhero

I am leaning towards 36, 60 and 80 CUs with 256- and 384-bit buses.
All the savings the improved process brings will be gobbled up by additional RT hardware and probably cache.
 

DisEnchantment

Also what has really been very ambiguous is that AMD now suddenly calls everything 7nm. This is done on purpose for sure.
TSMC also calls N7, N7+, N7P and N6 the "7nm family", and a customer can call their products 7nm as long as they use one of these.
Additionally, there are HD and HP cell libraries to throw into the mix.

Now it is anybody's guess.

RetroZombie said:
"And that area also includes the CPU cores, all I/O (PCIe, USB, ...), special sound units, MS's own mojo, ...

My math is different from yours: assuming a fixed 120mm2 for the uncore and leaving the rest just for the units would give:
Navi21* - 505mm^2 = 100CUs
Navi22 - 340mm^2 = 64CUs
Navi23 - 240mm^2 = 40CUs

Used the XSX as reference, with 200mm2 for the units.
*Two different memory types supported on this one."

Indeed, a more sensible approach would be to get the XSX die shot and remove everything except the memory controllers and CU arrays, but we don't have one.

randomhero said:
"I am leaning towards 36, 60 and 80 CUs with 256- and 384-bit buses.
All the savings the improved process brings will be gobbled up by additional RT hardware and probably cache."

RDNA2's RT HW is inside the CU: 4 in each CU.
Cache config could indeed be different.