Question Speculation: RDNA2 + CDNA Architectures thread

uzzi38 · Apr 28, 2020

All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have since backed up. Even so, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html

TESKATLIPOKA · Jul 31, 2020

Geranium said:
You can't measure a dedicated GPU's die size from a custom APU.
The Xbox's APU lacks many units that the dedicated GPU will have. It has a less sophisticated display engine, no PCI-e root complex, probably no FP64 units, a less complex encode/decode engine, a memory controller without ECC support, and probably less cache.

Let's be honest, adding 16 CUs shouldn't increase the size by more than 50mm2. Keep in mind that this GPU doesn't have a GDDR6 PHY and memory controller (just an HBM2 PHY), nor 8 CPU cores with L3 cache and other things, yet it is supposedly 60mm2 bigger than the Xbox. Does Xbox One have FP64 capability or not?
Either the Xbox has much higher density, or the RDNA2 GPU has more than 72 CUs, or it doesn't use HBM2.

Timorous · Jul 31, 2020

TESKATLIPOKA said:
Let's be honest, adding 16 CUs shouldn't increase the size by more than 50mm2. Keep in mind that this GPU doesn't have a GDDR6 PHY and memory controller (just an HBM2 PHY), nor 8 CPU cores with L3 cache and other things, yet it is supposedly 60mm2 bigger than the Xbox. Does Xbox One have FP64 capability or not?
Either the Xbox has much higher density, or the RDNA2 GPU has more than 72 CUs, or it doesn't use HBM2.

Xbox Series X is around 42M xtors/mm^2.
The Zen 2 chiplet is around 50M xtors/mm^2 with 32MB cache.
RDNA is around 41M xtors/mm^2.
Renoir is around 63M xtors/mm^2.

A rumoured 505mm^2 die for 'big Navi' comfortably doubles up on everything in Navi 10 without any increase in density. The extra 16 PCIe lanes that such a doubling would include are probably large enough to offset the ray tracing hardware.

If density for RDNA2 increases to Renoir levels, though, a 505mm^2 die gets you around 31B xtors, which is 3x the number that Navi 10 has and means that 120 CUs would fit pretty comfortably even with a 512-bit GDDR6 memory bus.
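
The arithmetic behind those two scenarios can be sketched as follows. This is only an illustration: the densities and the 505mm^2 figure come from the post, while the 10.3B transistor count for Navi 10 is AMD's published figure and is an assumption added here, as are the function and variable names.

```python
# Densities in millions of transistors per mm^2, as quoted in the post.
DENSITY_MXTORS_PER_MM2 = {
    "Xbox Series X": 42,
    "Zen 2 chiplet": 50,
    "RDNA (Navi 10)": 41,
    "Renoir": 63,
}

BIG_NAVI_AREA_MM2 = 505        # rumoured die size from the thread
NAVI10_XTORS_BILLIONS = 10.3   # assumption: published Navi 10 figure, not from the post


def transistor_budget_billions(area_mm2, density_mxtors_per_mm2):
    """Transistor count in billions for a die of the given area and density."""
    return area_mm2 * density_mxtors_per_mm2 / 1000.0


for name, density in DENSITY_MXTORS_PER_MM2.items():
    budget = transistor_budget_billions(BIG_NAVI_AREA_MM2, density)
    ratio = budget / NAVI10_XTORS_BILLIONS
    print(f"{name:>15} density: {budget:5.1f}B xtors, {ratio:.1f}x Navi 10")
```

At RDNA density the budget comes out at roughly twice Navi 10, and at Renoir density at roughly three times, which is where the "doubles up" and "3x" figures come from.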

TESKATLIPOKA · Jul 31, 2020

Another leak is talking about only 72CU and 427mm2 and in my opinion that's a big size for such specs.

Timorous · Jul 31, 2020

TESKATLIPOKA said:
Another leak is talking about only 72CU and 427mm2 and in my opinion that's a big size for such specs.

That adds up at current RDNA density. Perhaps AMD can really increase the clockspeeds (PS5 shows this might be possible), making a larger die unnecessary.

TESKATLIPOKA · Jul 31, 2020

I did some calculations based on DisEnchantment's measurements of Navi 10.
An RDNA1 GPU with 72CU, 96ROPs and a 384bit bus would be 401mm^2. (L1 Cache, ACE / HWS, L2 Cache, Raster / Primitive Unit were doubled)
With this in mind, 427mm^2 for RDNA2 could be correct if GDDR6 is used and not HBM2, but then I don't understand the size of Xbox X.
If I made an RDNA1 GPU with 56CU, 64ROPs and a 320bit bus, it would be 322mm^2. (L1 Cache, ACE / HWS, L2 Cache, Raster / Primitive Unit were increased by 50%)
That leaves only 38mm^2 for the rest of the SoC, and we are talking about RDNA1 here, not RDNA2, which should be bigger.
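
The 38mm^2 remainder is just the Series X die size minus the GPU estimate. A minimal sketch, assuming the 360mm^2 Series X figure quoted elsewhere in this thread (the function name is made up for illustration):

```python
XBOX_SERIES_X_DIE_MM2 = 360   # die size quoted elsewhere in the thread
GPU_ESTIMATE_MM2 = 322        # 56CU / 64ROP / 320-bit RDNA1 estimate from the post


def rest_of_soc_mm2(die_mm2, gpu_mm2):
    """Area left over for CPU cores, caches, I/O and everything else."""
    return die_mm2 - gpu_mm2


leftover = rest_of_soc_mm2(XBOX_SERIES_X_DIE_MM2, GPU_ESTIMATE_MM2)
share = leftover / XBOX_SERIES_X_DIE_MM2
print(f"Left for the rest of the SoC: {leftover} mm^2 ({share:.1%} of the die)")
```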

Glo. · Jul 31, 2020

What if it is 427 mm2 for HBM2 version, and 505 mm2 for GDDR6 version of the die?

This is my speculation based on what YOU guys speculate.

IMO, we will see N21 - 500 mm2, N22 - 340 mm2, N23 - 240 mm2.

The transistor density will be between 50 and 60 mln xTors/mm2. I won't speculate on CU counts.

TESKATLIPOKA · Jul 31, 2020

Glo. said:
What if it is 427 mm2 for HBM2 version, and 505 mm2 for GDDR6 version of the die?

A 384bit memory controller and PHY should be ~86mm^2 based on RDNA1 measurements, and the difference between the two die sizes is 78mm^2. That would mean the HBM2 PHY is only 8mm^2, which is very unlikely.
BTW, I would think the larger one would use HBM2. That would leave more space for execution units and save TBP for more performance, which would allow a higher selling price to minimize or offset the cost of HBM2.
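
The 8mm^2 figure follows from a simple subtraction. A rough sketch, assuming the two dies differ only in their memory interface (that assumption, and the function name, are illustrative and not something the leaks state):

```python
def implied_hbm2_phy_mm2(gddr6_die_mm2, hbm2_die_mm2, gddr6_mc_phy_mm2):
    """HBM2 PHY area implied if the two dies differ only in memory interface.

    gddr6_die - gddr6_mc_phy == hbm2_die - hbm2_phy, solved for hbm2_phy.
    """
    return gddr6_mc_phy_mm2 - (gddr6_die_mm2 - hbm2_die_mm2)


# 505 and 427 are the rumoured die sizes; 86 is the RDNA1-based estimate
# for a 384-bit GDDR6 controller plus PHY, all taken from the posts above.
print(implied_hbm2_phy_mm2(505, 427, 86), "mm^2")
```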

TESKATLIPOKA · Jul 31, 2020

Glo. said:
This is my speculation based on what YOU guys speculate.

IMO, we will see N21 - 500 mm2, N22 - 340 mm2, N23 - 240 mm2.

The transistor density will be between 50 and 60 mln xTors/mm2. I won't speculate on CU counts.

That's ~25-50% higher density than RDNA1; I would love that. Such a high density would explain Xbox X, but it would mean a monster GPU.
If I used the higher density and made an RDNA1 GPU with 160CU, 128ROPs and a 512bit memory controller (basically 4x Navi 10 except ROPs and memory controller), it would be ~470-564 mm^2.
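
For reference, the density uplift and the resulting area scaling can be sketched like this. The 41M xtors/mm^2 baseline is the RDNA figure quoted earlier in the thread; the 251mm^2 Navi 10 die size used in the example is the published figure and an assumption here, as are the function names.

```python
RDNA1_DENSITY = 41  # M xtors/mm^2, as quoted earlier in the thread


def density_uplift(new_density, base_density=RDNA1_DENSITY):
    """Fractional increase in density over the baseline."""
    return new_density / base_density - 1


def scaled_area_mm2(area_mm2, new_density, base_density=RDNA1_DENSITY):
    """Area of the same transistor count laid out at a different density."""
    return area_mm2 * base_density / new_density


NAVI10_AREA_MM2 = 251  # assumption: published Navi 10 die size, not from the post

for density in (50, 60):
    print(f"{density}M xtors/mm^2: +{density_uplift(density):.0%} density, "
          f"Navi 10 would shrink to {scaled_area_mm2(NAVI10_AREA_MM2, density):.0f} mm^2")
```

Against a 41M baseline the uplift works out slightly below the ~25-50% quoted; rounding the baseline to 40M gives exactly 25-50%.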

TESKATLIPOKA · Jul 31, 2020

Double post.

raghu78 · Jul 31, 2020

Glo. said:
What if it is 427 mm2 for HBM2 version, and 505 mm2 for GDDR6 version of the die?

This is my speculation based on what YOU guys speculate.

IMO, we will see N21 - 500 mm2, N22 - 340 mm2, N23 - 240 mm2.

The transistor density will be between 50 and 60 mln xTors/mm2. I won't speculate on CU counts.

We got the 505 sq mm die size for Navi 21 from multiple sources in Nov 2019: AquariusZi from a Chinese forum and Charlie from SemiAccurate.

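Re: [News] AMD Navi 14 and the RX 5300 series

Upvote gaddgao: Suddenly I really miss Aquarius's gossip 11/06 00:49 Getting the name right is the first step to a successful summoning~ Let me cover a few recent cases together. I also found out that my last post was put on Twitter and even translated into English, so I really can't spell things out too plainly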

www.ptt.cc

A new high end GPU just taped out

SemiAccurate’s moles have just sent us word a new high end enthusiast GPU just taped out

semiaccurate.com

This user, AquariusZi, also gave the die sizes for the remaining 2 dies: N22 - 340 and N23 - 240. He mentioned that the 3 dies are within a range of +-5 sq mm.

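Re: [News] Rumor: AMD Ryzen 5000 APU, besides adopting Zen 3 cores

Sometimes I think the people out there are really impressive: they grab a few names at random and talk as if it were all true : : Since AMD's Zen 3 architecture will keep improving on the CPU architecture, it can be expected that, beyond baseline single-core performance, : multi-core coordination latency will also be reduced; as for the GPU, it is rumored to use Navi 23 on the RDNA 2 architecture,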

www.ptt.cc

Moore's Law is Dead and Coreteks are speculating without having the full picture or doing some basic analysis. Given the 12 TF Xbox Series X with a 360 sq mm die running at an estimated 115-120w (according to my calculations based on the Series X PSU rating), the area and power efficiency of RDNA2 is very good. The only remaining question is whether AMD designed HBM2 and GDDR6 memory controllers on the same die. It's a very reasonable decision if AMD wanted to use GDDR6 for the heavily cut Navi 21

Radeon 6800XT - 72 CU, 3SE, 6SA, 96 ROPs, 10-12 GB GDDR6 , 720-768 GB/s (16-18 Gbps GDDR6)

and HBM2E for the top 2 SKUs

Radeon 6900XT - 80 CU, 4SE, 8SA, 128 ROPs, 16 GB HBM2E, 920 GB/s (Hynix 3.6 Gbps HBM2E)
Radeon 6950XT - 96 CU, 4SE, 8SA, 128 ROPs, 16 GB HBM2E, 920 GB/s (Hynix 3.6 Gbps HBM2E)
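
The bandwidth figures in these speculative specs are consistent with the usual bus width x data rate formula. A sketch under stated assumptions: the 320-bit and 384-bit bus widths are inferred here from the 10-12 GB capacities, and 2048 bits assumes two HBM2E stacks of 1024 bits each; none of the bus widths are given in the post.

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: pins x per-pin rate, divided by 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# (label, assumed bus width in bits, per-pin data rate in Gbps from the post)
configs = [
    ("GDDR6 320-bit @ 18 Gbps", 320, 18),
    ("GDDR6 384-bit @ 16 Gbps", 384, 16),
    ("HBM2E 2 stacks @ 3.6 Gbps", 2048, 3.6),
]

for label, width, rate in configs:
    print(f"{label}: {bandwidth_gb_s(width, rate):.1f} GB/s")
```

These come out at 720, 768 and about 920 GB/s, matching the numbers listed for the three SKUs.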

Radeon Pro WX GPUs for Windows would also use HBM2E and so would Apple if they want a massively powerful GPU for their Mac Pro workstations.

In conclusion my expectation is AMD have a product/architecture capable of taking the GPU crown from Nvidia and I fully expect them to do that.

Kenmitch · Jul 31, 2020

raghu78 said:
In conclusion my expectation is AMD have a product/architecture capable of taking the GPU crown from Nvidia and I fully expect them to do that.

At this moment in time we have 2 hype trains heading full speed ahead, while on the same track. Until they collide we're just gonna have to take the wait and see approach....Once the dust settles.

Interesting times ahead. Would be a mighty feat if AMD does pull it off.

raghu78 · Jul 31, 2020

Kenmitch said:
At this moment in time we have 2 hype trains heading full speed ahead, while on the same track. Until they collide we're just gonna have to take the wait and see approach....Once the dust settles.

Interesting times ahead. Would be a mighty feat if AMD does pull it off.

This time around we have actual specifications on RDNA2 based Xbox Series X GPU to arrive at a reasonable estimate of Navi 21 performance. I think AMD is sandbagging big time.

Navi 10 board power - 225w (9 TF at 1755 MHz game clock) = 9000 GFLOPS / 225 = 40 GFLOPS/watt
Series X GPU with GDDR6 memory - 140-150w (12 TF at 1825 MHz fixed clock) = 12000 GFLOPS / 150 = 80 GFLOPS/watt

There is sufficient data to prove RDNA2 is very area and power efficient. The fact that Nvidia are pushing 350w on GA102 (built on Samsung 8nm) is a clue as to how badly they are trying to keep the GPU crown. But I don't think that will save them.
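
The perf-per-watt comparison above can be reproduced from clocks and shader counts. This is a sketch only: the shader counts (2560 for Navi 10, 3328 for Series X) come from public spec sheets rather than the post, and the 140-150w Series X GPU power is the poster's own estimate, not a measured figure.

```python
def fp32_tflops(shaders, clock_mhz):
    """Peak FP32 throughput: 2 FLOPs (one fused multiply-add) per shader per clock."""
    return 2 * shaders * clock_mhz / 1e6


def gflops_per_watt(tflops, watts):
    """Convert TFLOPS to GFLOPS and divide by power draw."""
    return tflops * 1000 / watts


# Shader counts are assumptions from public specs, not from the post:
# Navi 10 (RX 5700 XT) = 40 CU x 64, Series X = 52 CU x 64.
navi10 = fp32_tflops(2560, 1755)
series_x = fp32_tflops(3328, 1825)

print(f"Navi 10:  {navi10:.2f} TF, {gflops_per_watt(navi10, 225):.0f} GFLOPS/W at 225 W")
for watts in (140, 150):
    print(f"Series X: {series_x:.2f} TF, "
          f"{gflops_per_watt(series_x, watts):.0f} GFLOPS/W at {watts} W")
```

Note that the two numbers are not strictly like for like: the Navi 10 figure is full board power, while the Series X figure is an estimate for the GPU and memory portion of a console SoC.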

maddie · Jul 31, 2020

raghu78 said:
This time around we have actual specifications on RDNA2 based Xbox Series X GPU to arrive at a reasonable estimate of Navi 21 performance. I think AMD is sandbagging big time.

Navi 10 board power - 225w (9 TF at 1755 MHz game clock) = 9000 GFLOPS / 225 = 40 GFLOPS/watt
Series X GPU with GDDR6 memory - 140-150w (12 TF at 1825 MHz fixed clock) = 12000 GFLOPS / 150 = 80 GFLOPS/watt

There is sufficient data to prove RDNA2 is very area and power efficient. The fact that Nvidia are pushing 350w on GA102 (built on Samsung 8nm) is a clue as to how badly they are trying to keep the GPU crown. But I don't think that will save them.

All the talk of RDNA2 being larger for the power efficiency improvement ignores the other way to obtain lower power consumption: less circuitry in use, which also likely leads to higher clocks. Fewer transistors to switch, simpler and less power hungry clock trees, etc.

raghu78 · Jul 31, 2020

So rogame has confirmed Navi 21 is 80CU based on information found in firmware files.

I think there is GDDR6 memory controller / PHY and HBM2E PHY in Navi 21 die based on this latest information

https://twitter.com/x/status/1289239501647171584

Stuka87 · Jul 31, 2020

maddie said:
All the talk of RDNA2 being larger for the power efficiency improvement ignores the other way to obtain lower power consumption: less circuitry in use, which also likely leads to higher clocks. Fewer transistors to switch, simpler and less power hungry clock trees, etc.

While true, cards are normally designed for a performance window, not an efficiency window.

Chips end up with a window where they both perform well and are efficient. If you clock them past that window, power consumption skyrockets, and you will ultimately hit a clock speed wall. AMD has a history of having to factory OC cards to meet performance windows, which results in high power consumption. It's in their best interest to design a larger GPU that can run at a lower clock, so that it's both fast and efficient.

TESKATLIPOKA · Jul 31, 2020

raghu78 said:
The only remaining question is did AMD design HBM2 and GDDR6 memory controllers on the same die. Its a very reasonable decision if AMD wanted to use GDDR6 for the heavily cut Navi 21

Isn't the HBM2 memory controller actually part of the memory stack, with the GPU containing only the PHY? If so, then I don't think having an additional GDDR6 controller and PHY is such a great idea; it takes up a lot of space on a chip.

Timorous · Jul 31, 2020

raghu78 said:
So rogame has confirmed Navi 21 is 80CU based on information found in firmware files.

I think there is GDDR6 memory controller / PHY and HBM2E PHY in Navi 21 die based on this latest information

https://twitter.com/x/status/1289239501647171584

Still 64 ROPs. Wonder if that is implying a large clockspeed increase to make up for the lack of units.

TESKATLIPOKA · Jul 31, 2020

Timorous said:
Still 64 ROPs. Wonder if that is implying a large clockspeed increase to make up for the lack of units.

Even if you increase the clock speed, the ROPs-per-CU ratio stays the same, and it is half that of Navi 10. I am quite sceptical about only 64 ROPs (actually only 16 backends) unless they are more capable than the previous ones.

RDNA1 GPU with 80CU, 64ROPs and 384bit memory controller would be 409mm^2 with the same transistor density as Navi 10.
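
The ratio in question can be sketched as follows. The 80 CU / 64 ROP figures are the ones discussed in this thread; the 40 CU / 64 ROP configuration for Navi 10 is from public specs and is an assumption added here, along with the function name.

```python
def rops_per_cu(rops, cus):
    """Number of ROPs available per compute unit."""
    return rops / cus


navi10 = rops_per_cu(64, 40)   # assumption: public Navi 10 configuration
navi21 = rops_per_cu(64, 80)   # configuration from the firmware leak discussed above

print(f"Navi 10: {navi10:.1f} ROPs/CU")
print(f"Navi 21: {navi21:.1f} ROPs/CU ({navi21 / navi10:.0%} of Navi 10)")
```

Clock speed scales CU and ROP throughput together, so it cancels out of this ratio. Only more ROPs, or ROPs that do more work per clock, would change it.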

maddie · Jul 31, 2020

Stuka87 said:
While true, cards are normally designed for a performance window, not an efficiency window.

Chips end up with a window where they both perform well and are efficient. If you clock them past that window, power consumption skyrockets, and you will ultimately hit a clock speed wall. AMD has a history of having to factory OC cards to meet performance windows, which results in high power consumption. It's in their best interest to design a larger GPU that can run at a lower clock, so that it's both fast and efficient.

I'm not talking about the overall die size but the individual CUs and the functional blocks. Less impedance. You can then use this to place more in the die for your end goal.

raghu78 · Jul 31, 2020

TESKATLIPOKA said:
Isn't the HBM2 memory controller actually part of the memory stack, with the GPU containing only the PHY? If so, then I don't think having an additional GDDR6 controller and PHY is such a great idea; it takes up a lot of space on a chip.

You are right. So the Navi 21 die is likely to have a GDDR6 memory controller + GDDR6 PHY + HBM2E PHY (if it supports both memory types).

DisEnchantment · Jul 31, 2020

raghu78 said:
You are right. I corrected that in a later post. So the Navi 21 die is likely to have GDDR6 memory controller+ GDDR6 PHY + HBM2E PHY

Sienna Cichlid has HBM. Navy Flounder has GDDR6. Mesa indicates both Sienna and Navy to be GFX1030/Navi21.
These two are not the same die. They have different SMU configurations.

Veradun · Jul 31, 2020

TESKATLIPOKA said:
Even if you increase the clock speed, the ROPs-per-CU ratio stays the same, and it is half that of Navi 10. I am quite sceptical about only 64 ROPs (actually only 16 backends) unless they are more capable than the previous ones.

RDNA1 GPU with 80CU, 64ROPs and 384bit memory controller would be 409mm^2 with the same transistor density as Navi 10.

Also: Navi10 has 2 SEs and 16 RBEs. Having 4 SEs and still 16 RBEs seems to be a strange move. Something might be off here.

edit: fresh tapatalk installation wanted to post its signature thing

Olikan · Jul 31, 2020

Rumor from reddit: the RBE can do more than 4pix/clock...

jpiniero · Jul 31, 2020

DisEnchantment said:
Sienna Cichlid has HBM. Navy Flounder has GDDR6. Mesa indicates both Sienna and Navy to be GFX1030/Navi21.
These two are not the same die. They have different SMU configurations.

Be pretty strange to do two different dies with the same shader counts except one being HBM2 and the other GDDR6. I would say that having both would be more likely although that would be pretty unusual.

DisEnchantment · Jul 31, 2020

jpiniero said:
Be pretty strange to do two different dies with the same shader counts except one being HBM2 and the other GDDR6. I would say that having both would be more likely although that would be pretty unusual.

There is more to that than just shader count though. Sienna has XGMI, dual VCN, besides others which Navy Flounder does not have.
