Given the vast increase in compute work over almost everything else, maybe raising the SIMD-to-other-resource ratio makes sense. Compute is increasingly butting in on the geometry pipeline for a growing number of titles, you need a lot of it for raytracing, etc. Smooshing more into a compute unit could make that L0 thing more worthwhile as well, allowing for L0 reuse in a compact area. And if AMD wants to raise performance above double a 6900xt, they need better bandwidth usage or a 512bit bus. So overall I can see the argument for it, especially if the instruction overlap and L0 scheme can actually get a decent amount of work out of those extra SIMDs without a vast increase in silicon. Performance per SIMD would probably still go down, but performance per mm² would go up overall, as would perf per watt.
As a guess, based on that patch information (2 memory dies and 6 compute dies? for one chip), here's how the lineup might look:
Compute die: 32 simds. Memory die: 192bit bus.
384bit bus/192 simds/384mb cache/24gb@24gbps ram/$1500-2000
384bit bus/128 simds/192mb cache/12gb@18-20gbps ram/$800-$1200
192bit bus/64 simds/12gb@24gbps ram/$500-600
192bit bus/56 simds/12gb@18-20gbps ram/$359-459
APU(dgpu monolithic or other coming later?):
128bit bus/32 simds/8gb@18-20gbps ram/$249-329 (ram and bus only applicable to DGPU)
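To sanity-check the lineup above, here's a minimal sketch of the die-combination arithmetic. All the figures (32 SIMDs per compute die, a 192bit bus per memory die, and the die counts per SKU) are my guesses from the speculation above, not confirmed specs:

```python
# Speculative chiplet math: these per-die figures are assumptions
# from the guessed lineup, not anything AMD has confirmed.
SIMDS_PER_COMPUTE_DIE = 32
BUS_PER_MEMORY_DIE = 192  # bits

def config(compute_dies, memory_dies, simds_disabled=0):
    """Return (total SIMDs, total bus width in bits) for a combination
    of chiplets, optionally with some SIMDs fused off for salvage."""
    total_simds = compute_dies * SIMDS_PER_COMPUTE_DIE - simds_disabled
    bus_width = memory_dies * BUS_PER_MEMORY_DIE
    return total_simds, bus_width

# Top part: 6 compute + 2 memory dies
print(config(6, 2))        # (192, 384)
# Mid part: 4 compute + 2 memory dies
print(config(4, 2))        # (128, 384)
# Cut-down part: 2 compute + 1 memory die, 8 SIMDs fused off
print(config(2, 1, 8))     # (56, 192)
```

Note the 56-SIMD part falls out of the same 2-compute-die configuration as the 64-SIMD one, just with SIMDs disabled, which is why it wouldn't count as a separate GPU.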
Note the cut-down (56 simd) part isn't considered a "separate/new" gpu in some technical documentation, as it would have the same configuration of chips as the non cut-down one. Because moving to small chiplets would yield so many good dies versus salvaged ones, and because the configurations of non-salvaged dies might play out like the above, I'm not sure I can see many configurations using salvaged dies when a single gap-filling one might do for the highest-volume segment. There's already a dearth of salvaged AMD dies, like the mostly missing rx6800 non-xt, and it's only going to get worse with chiplets.