
Discussion RDNA4 + CDNA3 Architectures Thread


DisEnchantment

Golden Member
With the GFX940 patches in full swing since the first week of March, it looks like MI300 is not far off!
AMD usually takes around three quarters to get the support into LLVM and amdgpu. Since RDNA2, the window in which they push support for new devices has been much reduced, to prevent leaks.
But looking at the flurry of code in LLVM, it is a lot of commits. Maybe because the US Govt is starting to prepare the SW environment for El Capitan (perhaps to avoid a slow bring-up situation like Frontier's, for example).

See here for the GFX940 specific commits
Or Phoronix

There is a lot more if you know whom to follow in the LLVM review chains (before things get merged to GitHub), but I am not going to link AMD employees.

I am starting to think MI300 will launch around the same time as Hopper, probably only a couple of months later!
Although I believe Hopper had the problem of not having a host CPU capable of PCIe 5 in the very near future, so it might have been pushed back a bit until SPR and Genoa arrive later in 2022.
If PVC slips again, I believe MI300 could launch before it :grimacing:

This is nuts, the MI100/200/300 cadence is impressive.


Previous thread on CDNA2 and RDNA3 here

 
96 RDNA3 CUs / 64 RDNA4 CUs - clock improvement = "nearly 50%".
Just simple Annalena B. 360° math.
lol, just as I said. +42% vs. the 80 CU 7900 GRE over a comparison of 30+ games, almost half of them RT comparisons, which give an additional ~20-40% boost over straight raster. And that is with ~38% higher boost clocks.
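A back-of-envelope check on the figures above, using only the numbers claimed in these posts (64 vs. 80 CUs, ~38% clock uplift, +42% result) rather than any measured data:

```python
# Throughput scales at best with CU count x clock, so we can check how much
# of the claimed +42% vs. the 7900 GRE is left over for per-CU/arch gains.
# All inputs are the posters' claims, not measurements.

cu_new, cu_gre = 64, 80
clock_uplift = 1.38                      # ~38% higher boost clock, per the post

raw_scaling = (cu_new / cu_gre) * clock_uplift
print(f"naive CU x clock scaling: {raw_scaling:.2f}x")      # ~1.10x

claimed = 1.42                            # +42% claimed result
ipc_per_cu_gain = claimed / raw_scaling
print(f"implied per-CU/arch gain: {ipc_per_cu_gain:.2f}x")  # ~1.29x
```

So if both claims hold, roughly 29% of the uplift would have to come from per-CU improvements rather than unit count or clocks.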
 
Whose numbers did you base it on?
When I first saw the numbers I tried to "plot" them against the 5070 Ti based on TPU's and ComputerBase's graphs from their 5070 Ti reviews, but the 9070 XT was tracking below the 5070 Ti.

Thanks for doing the math btw.

(Edit: I have just retried it against TPU's review of the Palit card that has the lowest OC, and it still tells me the 9070 XT will be below the 5070 Ti on average across the games TPU tested - I didn't want to add more sources and make it even less reliable anyway.)
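For what it's worth, the "plotting" step is just renormalizing each review's performance index to the same baseline card and averaging. A hypothetical sketch with placeholder numbers - not TPU's or ComputerBase's actual data:

```python
# Renormalize two reviews' relative-performance indices to a common baseline
# so they can be averaged. All index values below are made-up placeholders.

def normalize(index: dict, baseline: str) -> dict:
    """Rescale a review's performance index so the baseline card = 100."""
    base = index[baseline]
    return {card: 100 * v / base for card, v in index.items()}

review_a = {"5070 Ti": 100.0, "9070 XT": 96.0, "7900 GRE": 70.0}   # placeholder
review_b = {"5070 Ti": 118.0, "9070 XT": 112.0, "7900 GRE": 83.0}  # placeholder

a = normalize(review_a, "5070 Ti")
b = normalize(review_b, "5070 Ti")
avg = {card: (a[card] + b[card]) / 2 for card in a}
print(avg)
```

Averaging more sources only makes sense after this rescaling, since each review's index is relative to its own test suite.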
 
You pay for wafers, not transistors. The fact that N48 has roughly 2x the transistor density of B580 doesn't change the price.

More advanced nodes cost more per wafer. That is the reason transistor price reductions have stagnated. While the graph exaggerates the point, the steep downward slope of the past is over.

[Chart: cost per transistor by process node, showing the downward trend flattening]
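A toy version of the wafer-economics point, with made-up wafer prices and transistor counts (not real foundry pricing):

```python
# You pay per wafer, so cost per transistor = die cost / transistors per die.
# If wafer price roughly doubles while density doubles, $/transistor barely moves.
# All numbers are round illustrative assumptions, not foundry quotes.

def die_cost(wafer_cost: float, dies_per_wafer: int) -> float:
    return wafer_cost / dies_per_wafer

old = die_cost(wafer_cost=9_000, dies_per_wafer=150)    # older node, ~$60/die
new = die_cost(wafer_cost=17_000, dies_per_wafer=150)   # same die size, ~$113/die

old_per_tr = old / 25e9    # assume 25B transistors fit on the old-node die
new_per_tr = new / 50e9    # 2x density -> 50B on the same area
print(f"{old_per_tr / new_per_tr:.2f}")   # ~1.06 -> only ~6% cheaper per transistor
```

That is the flattening curve in the chart: density still scales, but wafer price eats most of the per-transistor savings.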
 
It's become more useful for comparisons than die size when comparing across different process nodes, as node-related transistor cost reductions have stagnated.

So designs with a lot more transistors usually rise in cost.
Zen 5 has many more transistors but the same die size. Cost per transistor decreased. It's plausible RDNA4 is DTCO cache-maxxing in a similar fashion.

But I still think it is 380mm²+.
 
Zen 5 has many more transistors but the same die size. Cost per transistor decreased. It's plausible RDNA4 is DTCO cache-maxxing in a similar fashion.

But I still think it is 380mm²+.

There are things you can do to pack in more density, to make somewhat better use of a node. But there isn't an endless supply of these tricks, so you are still faced with the flattening of the old curve.

380mm²+ is still a big die on an expensive node.
 
It's become more useful for comparisons than die size when comparing across different process nodes, as node-related transistor cost reductions have stagnated.

So designs with a lot more transistors usually rise in cost.
Transistor count is a terrible metric for comparing designs.

First off, transistor counts can vary depending on how they're measured. Unless we know that two dies are measured the same way, the count could be off.

Secondly, chips can have varying levels of transistor density, even on the same node.

An extreme example of this would be Zen 5 vs. Zen 5c. They have the same transistor count, but the die area is way different.

A more applicable example for graphics cards would be how B580 is way less dense than AMD or Nvidia GPUs. The B580 has less than 40% as many transistors as the 9070 XT, but the die is ~70-77% as big.

Transistor count should never be used to compare chips imo. It has too much margin for error.
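The density gap mentioned above, in numbers. Transistor counts and die sizes are the commonly reported figures and should be treated as approximate:

```python
# Transistor density for the two chips compared above, using widely reported
# (approximate) transistor counts and die areas.

chips = {
    "B580": (21.7e9, 272),      # (transistors, die area in mm^2), reported figures
    "9070 XT": (53.9e9, 357),
}

densities = {name: tr / 1e6 / mm2 for name, (tr, mm2) in chips.items()}
for name, d in densities.items():
    print(f"{name}: {d:.0f} MTr/mm^2")
# B580 lands around ~80 MTr/mm^2 vs. ~151 for the 9070 XT, despite the two
# being on nominally similar-class nodes -- which is the post's point.
```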
 
Transistor count is a terrible metric for comparing designs.

First off, transistor counts can vary depending on how they're measured. Unless we know that two dies are measured the same way, the count could be off.

Secondly, chips can have varying levels of transistor density, even on the same node.

An extreme example of this would be Zen 5 vs. Zen 5c. They have the same transistor count, but the die area is way different.

A more applicable example for graphics cards would be how B580 is way less dense than AMD or Nvidia GPUs. The B580 has less than 40% as many transistors as the 9070 XT, but the die is ~70-77% as big.

Transistor count should never be used to compare chips imo. It has too much margin for error.

It's more useful than die size, if they are on different nodes.

If they are on the same node, then you can compare die size.

The problem is when people compare die size across nodes, which is not comparable at all.
 
It's more useful than die size, if they are on different nodes.
Navi 21, N7: ~27B transistors
Navi 48, N4: ~53B transistors, almost a 100% increase

I agree that tracking transistor cost is useful, but using it for die cost estimates of different designs might lead to very weird results. The 6900 XT launched for $1000, and it looks like N48 needs an even higher price for similar margins. Even if we take the 6800 XT MSRP as a guideline, the math still looks bad.
 
It's more useful than die size, if they are on different nodes.

If they are on the same node, then you can compare die size.

The problem is when people compare die size across nodes, which is not comparable at all.
What about libraries and the varying densities? Die area, wafer cost, and defect density are what matter, not simply transistor cost.
 
It's more useful than die size, if they are on different nodes.

If they are on the same node, then you can compare die size.

The problem is when people compare die size across nodes, which is not comparable at all.
The best way to compare costs across nodes is to work from die area and cost per wafer: divide each wafer's cost by the number of good dies it yields.

Any comparison with a 50%+ margin of error is useless.
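A minimal sketch of that calculation, using a standard dies-per-wafer approximation and a simple Poisson yield model. The wafer price and defect density below are made-up illustrative values, not foundry figures:

```python
# Die cost across nodes = wafer cost / good dies per wafer.
# Inputs (wafer price, defect density) are illustrative assumptions only.
import math

def dies_per_wafer(die_mm2: float, wafer_diameter_mm: float = 300) -> int:
    """Classic approximation: wafer area / die area, minus an edge-loss term."""
    r = wafer_diameter_mm / 2
    return int(math.pi * r**2 / die_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_mm2))

def yield_poisson(die_mm2: float, d0_per_cm2: float) -> float:
    """Poisson yield model: exp(-defect density x die area)."""
    return math.exp(-d0_per_cm2 * die_mm2 / 100)

def die_cost(wafer_cost: float, die_mm2: float, d0: float) -> float:
    good = dies_per_wafer(die_mm2) * yield_poisson(die_mm2, d0)
    return wafer_cost / good

# e.g. a 357 mm^2 die on a $17k wafer with D0 = 0.07/cm^2 (all assumed numbers)
print(f"${die_cost(17_000, 357, 0.07):.0f} per good die")
```

Bigger dies get hit twice: fewer candidates per wafer, and a lower fraction of them are defect-free.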
 