Discussion RDNA4 + CDNA3 Architectures Thread

DisEnchantment · Mar 23, 2022

With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
AMD usually takes around 3 quarters to land support in LLVM and amdgpu. Lately, since RDNA2, the window in which they push support for new devices has been much shorter, to prevent leaks.
But judging by the flurry of code in LLVM, there are a lot of commits. Maybe that is because the US government is starting to prepare the software environment for El Capitan (perhaps to avoid a slow bring-up like Frontier's, for example).

See here for the GFX940 specific commits

History for llvm/lib/Target/AMDGPU - llvm/llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. - History for llvm/lib/Target/AMDGPU - llvm/llvm-project

github.com

Or Phoronix

More AMD "GFX940" Enablement Work Landing In LLVM - Phoronix

www.phoronix.com

There is a lot more if you know whom to follow in LLVM review chains (before getting merged to github), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have gotten pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it.

This is nuts, MI100/200/300 cadence is impressive.

Previous thread on CDNA2 and RDNA3 here

Question - Speculation: RDNA3 + CDNA2 Architectures Thread

Man I have been dying to make this one for a while now. First rumours for RDNA3 are here so new thread time! Just going to start off with this one for now: kopite7kimi on Twitter: "@VideoCardz Ah, I mean a simple mcm design with 10240 cores is not enough. Because the lift from RDNA2 to RDNA3...

forums.anandtech.com

Joe NYC · Aug 16, 2023

TESKATLIPOKA said:
What would that entail?
Higher performance?
Higher perf/$?
Higher perf/W?
Or everything mentioned above?
And by how much? 20%, 25%, 33% or more?

200-250mm2 die size isn't that bad.
I would say 2x higher specs compared to N33 should be achievable, but after N31 I am not really sure. It consumed a ridiculous number of transistors for those specs. We will see; I think AMD will want to release next year.

I would expect
- 0 - 10 % performance gain
- 2 generations' worth of perf/W improvements
- early release

TESKATLIPOKA · Aug 16, 2023

adroc_thurston said:
Xtor counts can be whatever you imagine them to be.
Die/total Si area is what matters.

The more transistors a chip has, the bigger it is. N31 is very big for its specs.

N33 -> 13.3B transistors
N31 -> 45.7B + 12.3B = 58B transistors
4.36x more transistors for exactly 3x higher specs.

For comparison:
N23 -> 11.06B transistors
N21 -> 26.8B transistors
2.42x more transistors for 2.5x more CU,SHADERs,TMUs; 2x more ROPs, 2x wider bus and 4x more IC.
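The two ratios above can be re-derived with a few lines of Python. This is only a sketch of the arithmetic: the transistor counts are the ones quoted in this post, the CU counts (32, 96, 32, 80) are added here as the stand-in for "specs", and the names `chips` and `scaling` are made up for illustration.

```python
# Transistor counts in billions are the figures quoted in the post.
# CU counts are an added assumption used as a proxy for "specs".
chips = {
    "N33": {"transistors_b": 13.3, "cus": 32},
    "N31": {"transistors_b": 45.7 + 12.3, "cus": 96},  # GCD + MCDs
    "N23": {"transistors_b": 11.06, "cus": 32},
    "N21": {"transistors_b": 26.8, "cus": 80},
}


def scaling(small, big):
    """Return (transistor ratio, CU ratio) of `big` relative to `small`."""
    s, b = chips[small], chips[big]
    return b["transistors_b"] / s["transistors_b"], b["cus"] / s["cus"]


for small, big in [("N33", "N31"), ("N23", "N21")]:
    xtor_ratio, cu_ratio = scaling(small, big)
    print(
        f"{small} -> {big}: {xtor_ratio:.2f}x transistors "
        f"for {cu_ratio:.2f}x CUs "
        f"({xtor_ratio / cu_ratio:.2f}x transistors per CU)"
    )
```

The last figure in each line is just the first ratio divided by the second, so it shows how much the per-CU transistor budget grew going up the stack in each generation.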

adroc_thurston · Aug 16, 2023

TESKATLIPOKA said:
The more transistors a chip has, the bigger it is.

There are nontrivial implementation details involved here.
All that matters is the resulting die area and then the cost per die yielded (salvage accounted for).
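A rough sketch of what "cost per die yielded" could look like in numbers. Nothing here comes from AMD or TSMC: the gross-die formula is a common textbook approximation, the yield model is a simple Poisson one swapped in for illustration, and the wafer cost, defect density and salvage fraction are hypothetical placeholders.

```python
import math


def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """Approximate gross dies per wafer (ignores scribe lines and edge exclusion)."""
    radius = wafer_diameter_mm / 2
    return int(
        math.pi * radius**2 / die_area_mm2
        - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    )


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under a simple Poisson model."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)


def cost_per_sellable_die(die_area_mm2, wafer_cost, defects_per_cm2, salvage_fraction):
    """Wafer cost divided by fully good dies plus salvaged (cut-down) dies."""
    gross = dies_per_wafer(die_area_mm2)
    good = gross * poisson_yield(die_area_mm2, defects_per_cm2)
    salvaged = (gross - good) * salvage_fraction
    return wafer_cost / (good + salvaged)


# Hypothetical inputs, for illustration only.
WAFER_COST = 10_000.0   # placeholder, not a real price
DEFECT_DENSITY = 0.1    # defects per cm^2, placeholder
SALVAGE = 0.5           # share of defective dies sold as cut-down parts, placeholder

for area in (200.0, 300.0, 520.0):
    cost = cost_per_sellable_die(area, WAFER_COST, DEFECT_DENSITY, SALVAGE)
    print(f"{area:.0f} mm2: {dies_per_wafer(area)} gross dies, ~{cost:.0f} per sellable die")
```

With any inputs of this kind, area enters twice (fewer candidates per wafer and lower yield), while the transistor count never appears in the calculation at all.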

TESKATLIPOKA said:
4.36x more transistors for exactly 3x higher specs.

Yes N31/32 are actually dense by N5 standards.

TESKATLIPOKA · Aug 16, 2023

adroc_thurston said:
There are nontrivial implementation details involved here.
All that matters is the resulting die area and then the cost per die yielded (salvage accounted for).

Yes N31/32 are actually dense by N5 standards.

N31/N32 have high density, but that's not the problem.

N32 will have lower performance than N21, but still have a lot more transistors.
Something is very bad here.

adroc_thurston · Aug 16, 2023

TESKATLIPOKA said:
N32 will have lower performance than N21, but still have a lot more transistors.

Xtor counts are whatever you imagine them to be.
Power/perf/area is all that matters.

TESKATLIPOKA · Aug 16, 2023

adroc_thurston said:
Xtor counts are whatever you imagine them to be.
Power/perf/area is all that matters.

You could directly port Navi21 to N5, and it will have better Power/perf/area than N32.
It would be ~250-260mm2.

adroc_thurston · Aug 16, 2023

TESKATLIPOKA said:
and it will have better Power/perf/area than N32

No.

TESKATLIPOKA said:
It would be ~250-260mm2.

N5 isn't a 2x shrink, period.

TESKATLIPOKA · Aug 17, 2023

adroc_thurston said:
No.

Yes.
N32 with those clocks will manage 6800XT level of performance.
6950XT is ~14% faster in comparison at 4K.
I don't think N32 will draw less than 225W; I wouldn't even be surprised if it actually drew 250W.
A 6950XT shrunk to N5 should manage 250W, so it's still better than N32.

P.S.
If you want to include a supposedly much higher clock speed (up to 3.5GHz) at comparable power, which didn't happen, then you would be correct. I will ignore it until RDNA3.5 proves it's true.
Still, that would leave the question of why N32/N31 need so many more transistors than N33, when the difference is only the chiplet design and a slightly beefed-up WGP.

adroc_thurston said:
N5 isn't a 2x shrink, period.

RDNA2 on 7nm process.
Navi 23 -> 46.7MTr/mm2 (11,060MTr / 237mm2)
Navi 22 -> 51.3MTr/mm2 (17,200MTr / 335mm2)
Navi 21 -> 51.5MTr/mm2 (26,800MTr / 520mm2)

Ada Lovelace on 4N, which in my opinion should not be much better than what AMD is using.
Ada106 -> 120.5MTr/mm2 (22,900MTr / 190mm2)
Ada104 -> 121.6MTr/mm2 (35,800MTr / 294.5mm2)
Ada103 -> 121.2MTr/mm2 (45,900MTr / 378.6mm2)
Ada102 -> 125.4MTr/mm2 (76,300MTr / 608.5mm2)

I don't see why I couldn't get 103MTr/mm2 out of an N21 shrink to N5 when Nvidia managed 17-22% more on a process that is not much better.
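For anyone who wants to re-run these numbers, here is a small Python sketch. It only reproduces the arithmetic from the figures listed above (transistors in millions, area in mm2) and says nothing about whether such a shrink is achievable; the names `dies` and `density` are made up for this example.

```python
# (transistors in millions, die area in mm^2), as listed in the post.
dies = {
    "Navi 23": (11_060, 237.0),
    "Navi 22": (17_200, 335.0),
    "Navi 21": (26_800, 520.0),
    "Ada106": (22_900, 190.0),
    "Ada104": (35_800, 294.5),
    "Ada103": (45_900, 378.6),
    "Ada102": (76_300, 608.5),
}


def density(mtr, area_mm2):
    """Transistor density in MTr/mm^2."""
    return mtr / area_mm2


for name, (mtr, area) in dies.items():
    print(f"{name}: {density(mtr, area):.1f} MTr/mm2")

# Density a straight Navi 21 shrink would need to fit in 250-260 mm^2.
n21_mtr = dies["Navi 21"][0]
for target_area in (250.0, 260.0):
    needed = density(n21_mtr, target_area)
    print(f"Navi 21 in {target_area:.0f} mm2 needs {needed:.1f} MTr/mm2")
```

The 103MTr/mm2 figure corresponds to the 260mm2 end of the estimate; at 250mm2 the required density rises to about 107MTr/mm2.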

edit: seriously wrong comparison.

adroc_thurston · Aug 17, 2023

TESKATLIPOKA said:
6950XT shrunk to N5 should manage 250W, so it's still better than N32

At a lot more die area.

TESKATLIPOKA said:
Still, that would leave the question of why N32/N31 need so many more transistors than N33

Xtor counts are a bogus metric, perf/power/area is all that matters.

TESKATLIPOKA said:
RDNA2 on 7nm process.
Navi 23 -> 46.7MTr/mm2 (11,060MTr / 237mm2)
Navi 22 -> 51.3MTr/mm2 (17,200MTr / 335mm2)
Navi 21 -> 51.5MTr/mm2 (26,800MTr / 520mm2)

Ada Lovelace on 4N, which in my opinion should not be much better than what AMD is using.
Ada106 -> 120.5MTr/mm2 (22,900MTr / 190mm2)
Ada104 -> 121.6MTr/mm2 (35,800MTr / 294.5mm2)
Ada103 -> 121.2MTr/mm2 (45,900MTr / 378.6mm2)
Ada102 -> 125.4MTr/mm2 (76,300MTr / 608.5mm2)

are you seriously comparing two different uarches from two different vendors for density? are you insane?
I know people here are mostly clueless but you're pushing the very boundaries of silly season.

Joe NYC · Aug 17, 2023

TESKATLIPOKA said:
You could directly port Navi21 to N5, and it will have better Power/perf/area than N32.
It would be ~250-260mm2.

I think we can say with 100% certainty that no one at AMD is considering Navi21 shrink to N5.

Even if an N5 shrink of Navi 21 achieved better performance, that would be a sideways move, burning AMD resources, while AMD needs to use its resources to go forward without significant detours.

TESKATLIPOKA · Aug 17, 2023

adroc_thurston said:
At a lot more die area.

Xtor counts are a bogus metric, perf/power/area is all that matters.

I don't use transistors as a metric per se.
I am just saying that transistors also matter, because depending on the count, the chip will be a different size on the same process. The bigger it is, the more it costs to produce on the same process.

adroc_thurston said:
are you seriously comparing two different uarches from two different vendors for density? are you insane?
I know people here are mostly clueless but you're pushing the very boundaries of silly season.

Ok, I thought about it more and it's nonsense.

The other option was RDNA2 vs RDNA3, but that is monolithic vs chiplet on different processes, so I didn't want to use it.

I could average it.
5nm GCD: 45.7B and 306mm2 -> 149.3 MTr/mm2
6nm MCD: 2.05B and 37.5mm2 -> 54.7MTr/mm2
Total: 58B transistors and 531mm2 -> 109MTr/mm2 average
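The averaging above can be checked with a short Python sketch. The per-die figures are the ones listed in this post; the MCD count of 6 is an assumption inferred from the totals (12.3B = 6 x 2.05B, and 531mm2 = 306 + 6 x 37.5), and `package_density` is a made-up helper name.

```python
# (transistors in millions, area in mm^2) per die, as listed in the post.
GCD = (45_700, 306.0)
MCD = (2_050, 37.5)
MCD_COUNT = 6  # assumption inferred from the quoted totals


def package_density(gcd, mcd, mcd_count):
    """Return (total MTr, total mm^2, area-weighted MTr/mm^2) for GCD plus MCDs."""
    total_mtr = gcd[0] + mcd_count * mcd[0]
    total_area = gcd[1] + mcd_count * mcd[1]
    return total_mtr, total_area, total_mtr / total_area


total_mtr, total_area, avg = package_density(GCD, MCD, MCD_COUNT)
print(f"GCD: {GCD[0] / GCD[1]:.1f} MTr/mm2")
print(f"MCD: {MCD[0] / MCD[1]:.1f} MTr/mm2")
print(f"Total: {total_mtr} MTr over {total_area:.0f} mm2 -> {avg:.1f} MTr/mm2")
```

Note that this is total transistors over total area, not the mean of the two per-die densities, which is why the result sits much closer to the GCD figure's share of the area than a simple average of 149.3 and 54.7 would suggest.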

This already has higher density than I needed, but I just remembered that RDNA3 WGP has 2.66x(+166%) higher density.

This can't be attributed simply to a better process, and this was also a reason why I agree that comparing different vendors was wrong.
I also agree that N21 ported to N5 wouldn't be only 250-260mm2.

RTX2080 · Aug 17, 2023

An unknown AMD (discrete) GPU on N4 was sent to the factory for early testing a few days ago. Not for production.
This post is just for the record; it may or may not be verified months from now.

soresu · Aug 17, 2023

Joe NYC said:
Hopefully, AMD will get to at least a worthwhile successor to 7700 / 7800 with RDNA4

Assuming the rumors are correct (which would be odd so far from release), it seems more likely that the x700 SKU will be the limit for RDNA4, with RDNA5 giving either a full spread of SKUs like RDNA2 over RDNA1, or a high-end focus like Vega had vs Polaris.

Given that RDNA3 didn't even produce a decent replacement for the 6600 SKUs they still have plenty of market segments to fill with RDNA4.

Unless RDNA4 produces anything mindblowing in the HWRT area I'll be hanging onto my 6600 XT and waiting for a 9800 or 9800 XT - would gel pretty nicely with my very first ATi gfx card purchase back in 2003 😁

adroc_thurston · Aug 17, 2023

TESKATLIPOKA said:
I am just saying that transistors also matter, because depending on the count, the chip will be a different size on the same process

It's just an implementation detail.

Joe NYC · Aug 17, 2023

soresu said:
Assuming the rumors are correct (which would be odd so far from release), it seems more likely that the x700 SKU will be the limit for RDNA4, with RDNA5 giving either a full spread of SKUs like RDNA2 over RDNA1, or a high-end focus like Vega had vs Polaris.

Given that RDNA3 didn't even produce a decent replacement for the 6600 SKUs they still have plenty of market segments to fill with RDNA4.

Unless RDNA4 produces anything mindblowing in the HWRT area I'll be hanging onto my 6600 XT and waiting for a 9800 or 9800 XT - would gel pretty nicely with my very first ATi gfx card purchase back in 2003 😁

You are right, it may only go up to x700 class.

TESKATLIPOKA · Aug 17, 2023

Videocardz

Some renders of the canceled RDNA4 monster GPU.

Joe NYC · Aug 17, 2023

TESKATLIPOKA said:
Videocardz

Some renders of the canceled RDNA4 monster GPU.

These renders are missing another silicon bridge from MID to AID.

Mopetar · Aug 17, 2023

Joe NYC said:
These renders are missing another silicon bridge from MID to AID.

It looks like they sit on top of each other, so the bridge is baked into both layers.

Joe NYC · Aug 17, 2023

Mopetar said:
It looks like they sit on top of each other, so the bridge is baked into both layers.

You can take a look again at the original picture.

I think the chiplets communicate with each other only through the bridges. So no micro-bumps in the way of any internal communication.

Microbumps only to substrate to get power and external I/O.

Mopetar · Aug 17, 2023

I had only looked at the picture in the post, so I'm clearly less informed on anything else in the article. Seems like an odd design for a consumer GPU.

Joe NYC · Aug 17, 2023

Mopetar said:
I had only looked at the picture in the post, so I'm clearly less informed on anything else in the article. Seems like an odd design for a consumer GPU.

Well, it is a design for a modular GPU that is scalable and can reach performance that a monolithic GPU will have a hard time achieving.

adroc_thurston · Aug 17, 2023

Joe NYC said:
Microbumps only to substrate to get power and external I/O.

No those are standard C4.

GodisanAtheist · Aug 17, 2023

Damn, looks like NV dodged a bullet on the next gen...

Appreciate AMD's moonshot attempts vs NV. Yeah, it looks like the lander smashed against the surface for RDNA3 and never left the launchpad for RDNA4, but given AMD is going to loiter at 15% market share whether it matches or loses to NV, I guess it only makes sense to either try to decisively beat them or stick with the mid/entry range if beating them isn't working out.

Joe NYC · Aug 17, 2023

adroc_thurston said:
No those are standard C4.

I see, missed that in the picture.

soresu · Aug 17, 2023

adroc_thurston said:
No those are standard C4.

Sounds like a really explosive design 😂
