Question Speculation: RDNA3 + CDNA2 Architectures Thread

uzzi38

Platinum Member
Oct 16, 2019
2,705
6,427
146

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Sep 13, 2008
7,741
2,717
146
Seems to me that some cards, such as Red Devil or Nitro+/Toxic models might have the opportunity to have a 3rd 8 pin? With a custom PCB. This could allow for more OC room and performance. Anyway, glad to see the 8 pin connectors and no 12 pin in sight.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
We summarized here before

They could be bottlenecked by the vector L0. But I have seen that @Kepler_L2 already mentioned a doubled L0.
Fortunately for RDNA, the IFC/MALL also works very well and is going to be in its second generation.
It remains to be seen how they address BW/latency issues, shader occupancy, and divergent memory access.

Several interesting things on N31/2 already in open source commits
  • NGG only geometry pipeline
  • OREO
  • Unified GFX ring/MES only schedule
  • 1 Cycle Wave64
  • WMMA
  • True 16bit ops/similar to mobile
  • Many new low precision vector ops, derived from CDNA2+/GFX940
  • CDNA2+ features like Architected flat scratch/packed work item ids
  • End to End DCC

  • NGG only geometry pipeline
  • OREO
  • Unified GFX ring/MES only schedule --> HW Schedule mainly
  • 1 Cycle Wave64 (when supported by 6 VGPR banks)
  • WMMA
  • True 16bit ops/similar to mobile --> I think this one should be very interesting for Android. VGPR 0-127 (of each bank) can handle dual 16 bit ops
  • Many new low precision vector ops, derived from CDNA2+/GFX940
  • CDNA2+ features like Architected flat scratch/packed work item ids
  • End to End DCC
  • Dual Issue (opportunistic for N33/always for N31) for VOP3 ops (VOPD)
  • All < 3 operand ops VOPC/VOP2/VOP1 can be dual issued.
I think RDNA3 has a HW BVH traversal stage.
 

alexruiz

Platinum Member
Sep 21, 2001
2,836
556
126
In case you weren't aware, top tier cards are not produced in the same volumes as lower tier cards. There were not that many 4090s produced initially. It is therefore easy for a new, just released flagship product to go out of stock.

I am perfectly aware of the product mixes.
The point is about "who is going to buy them?"
There are people buying them. Even if it is only 30 per store, the top parts will sell out quickly.

Considering that demand for top video cards tends to be inelastic with respect to price, I would even speculate that any GPU of the newer generation that outperforms the current top one while staying close in value for the money will sell well.
The lower tiers are more elastic; those will need to outperform their predecessors at the same price point to sell decently.
 
  • Like
Reactions: Leeea

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
I still think it would be really strange for them to launch with a 7950 model, instead of just the base 7900. Unless they are going back to the 5K-7K naming scheme of 7950 for cut down, and 7970 for full die.

But if they stick with the naming of the other RDNA cards, this will be a 7900, and then down the road, there will be a 7950 refresh.
 
  • Like
Reactions: Tlh97 and Leeea

A///

Diamond Member
Feb 24, 2017
4,351
3,158
136
The 7900XT matches the 4090 and is priced at 1100; the XTX, with 4 GB extra VRAM, will be more powerful than the 7900XT and pass the 4090. No real answer for DLSS 3. RT should be a lot better, maybe equal to Nvidia. The XTX will be priced at 1300-1400. Both use close to, but less than, the power of the Nvidia 4090.

Other amd cards like the 7800xt wipe the floor with nvidia.

Seems to me that some cards, such as Red Devil or Nitro+/Toxic models might have the opportunity to have a 3rd 8 pin? With a custom PCB. This could allow for more OC room and performance. Anyway, glad to see the 8 pin connectors and no 12 pin in sight.
Yes. 8 pins are rated for 150 watts but can pull more temporarily; 150 watts is the safe pull rate. AICs can use a third 8 pin for more power, depending on what is left to tap out of the dies. Is it ugly? Yes, it is, but it's a lot safer than the melted plastic and fires plaguing their competitor.
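
To put rough numbers on the third 8 pin idea, here is a minimal sketch. The 150 W per connector is the figure from the post; the 75 W from the PCIe x16 slot is my own added assumption and is not stated in this thread. The function name `rated_board_power` is made up for illustration.

```python
# Rough rated power ceiling for a card, from its connector count.
# EIGHT_PIN_W is the 150 W rating quoted in the post above.
# SLOT_W (75 W from the x16 slot) is an assumption added here,
# not something stated in the thread.
EIGHT_PIN_W = 150
SLOT_W = 75


def rated_board_power(num_8pin: int, slot_w: int = SLOT_W) -> int:
    """Return rated connector power plus slot power, in watts."""
    if num_8pin < 0:
        raise ValueError("connector count cannot be negative")
    return num_8pin * EIGHT_PIN_W + slot_w


if __name__ == "__main__":
    for n in (2, 3):
        print(f"{n} x 8-pin: {rated_board_power(n)} W rated")
```

Under those assumptions a third connector moves the rated ceiling from 375 W to 525 W. That is only the connector rating, not what any particular card would actually draw.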
 

Kaluan

Senior member
Jan 4, 2022
504
1,074
106
So where are all the RTX 4090s?
OOS everywhere.
OOS claims literally mean nothing without shipment/stock volume data.

Jeez, we've just been through one big lesson of how market manipulation works over the last 2 years, it's sad to see some have learned nothing.

And it's not like we don't have confirmation newb scalpers hedged on this launch being another 30/6000 series gold mine as well.
We summarized here before



  • NGG only geometry pipeline
  • OREO
  • Unified GFX ring/MES only schedule --> HW Schedule mainly
  • 1 Cycle Wave64 (when supported by 6 VGPR banks)
  • WMMA
  • True 16bit ops/similar to mobile --> I think this one should be very interesting for Android. VGPR 0-127 (of each bank) can handle dual 16 bit ops
  • Many new low precision vector ops, derived from CDNA2+/GFX940
  • CDNA2+ features like Architected flat scratch/packed work item ids
  • End to End DCC
  • Dual Issue (opportunistic for N33/always for N31) for VOP3 ops (VOPD)
  • All < 3 operand ops VOPC/VOP2/VOP1 can be dual issued.
I think RDNA3 has a HW BVH traversal stage.

It's missing this:

Towards which I have a question:
Can this help confirm that a mid-gen console refresh (or next gen) will feature RDNA3 tech? Or should the part about the cache being directly addressable by developers be viewed as just an expected/natural progression of a last level cache like that? It reminds me of the ESRAM on the Xbox One, in a sense.
 

A///

Diamond Member
Feb 24, 2017
4,351
3,158
136
So where are all the RTX 4090s?
OOS everywhere.
I guess those people buying it didn't hear about the recession.
The cards at the top will sell well if the performance delivers
Top tier cards aren't made in volume. Scalpers bought most of the cards and began selling them on eBay. Nvidia has allegedly halted 4090 production so more of their fab time can go to their enterprise products, to meet sales before the China export ban takes effect. I mentioned this pages ago. Most places have not gotten any stock in since their initial shipment.
 

Kaluan

Senior member
Jan 4, 2022
504
1,074
106
YT "premiere" thingy is already up on AMD's channel, almost 3 days in advance. The previous/Raphael one only "premiered" 1 or 2 hours before the actual reveal.

Can someone please check the metadata and see how long the announcement video is?


HYPE
 
  • Like
Reactions: lightmanek

Kaluan

Senior member
Jan 4, 2022
504
1,074
106
Certain tech sites had the raphael premiere link up before amd's official youtube made it present.
Yes, links towards content that didn't exist yet AFAIK.
Anyway it doesn't matter, I just found it peculiar. Seems the final touches/editing is (confirmed) done. Now we're just waiting for the date to be (RDNA) the 3rd. 😅
 

A///

Diamond Member
Feb 24, 2017
4,351
3,158
136
Yes, links towards content that didn't exist yet AFAIK.
Anyway it doesn't matter, I just found it peculiar. Seems the final touches/editing is (confirmed) done. Now we're just waiting for the date to be (RDNA) the 3rd. 😅
You can set up a future live stream whenever you want, down to 5 minutes before the event, but hide it as an unlisted video. You can send the link out to others; nobody will find it without the link. This is different from a private video, which you need to be authorized to view.
 
  • Like
Reactions: Kaluan

KompuKare

Golden Member
Jul 28, 2009
1,191
1,487
136
The 7900XT matches the 4090 and is priced at 1100; the XTX, with 4 GB extra VRAM, will be more powerful than the 7900XT and pass the 4090. No real answer for DLSS 3. RT should be a lot better, maybe equal to Nvidia. The XTX will be priced at 1300-1400. Both use close to, but less than, the power of the Nvidia 4090.



But why should they have an answer for fake-frames DLSS 3.0, aside from answering Nvidia's hype machine?
What possible use does anyone have for higher but faked frames which add latency?
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,486
2,023
136
Something I don't really see anyone discussing is the return of the XTX moniker. I think that alone denotes that AMD has a winner on its hands.

Is there actually anything from AMD about the XTX moniker? I thought that people just speculate it's going to be there because there will likely be 3 SKUs of the top die, instead of 2. (Full with vCache, full, clipped.)

If that is so, I'd expect a new suffix for the vCache product, instead of recycling XTX. Something that ties into the X3D used with CPUs. XT3D?
 

Saylick

Diamond Member
Sep 10, 2012
3,532
7,859
136
So I dug into that Twitter thread that @uzzi38 mentioned, and it seems like RDNA 3 does include some sort of ray coherency sorting just like Nvidia and Intel. Nvidia calls it Shader Execution Reordering, Intel calls it Thread Sorting, and I guess AMD calls it Ray Arbiter. Either way, if this is all true, then it explains how AMD are able to catch up faster in RT performance over just the increase in number of FP units.

My apologies for linking to a Tweet with profanity.

 
  • Like
Reactions: Kaluan and Tlh97

Yosar

Member
Mar 28, 2019
28
136
106
I am perfectly aware of the product mixes.
The point is about "who is going to buy them?"
There are people buying them. Even if it is only 30 per store, the top parts will sell out quickly.

I can buy a 4090 in my country, no problem, from launch day until today. You know, for just over 2000 EUR. Not many are eager to pay.
Anecdotal? Yes. True? Also yes.
If this is nVidia's plan for business they are doomed (joking).
 

beginner99

Diamond Member
Jun 2, 2009
5,233
1,610
136
So where are all the RTX 4090s?
OOS everywhere.
I guess those people buying it didn't hear about the recession.
The cards at the top will sell well if the performance delivers
As far as I remember NV reduced or stopped production entirely and at some point there were rumors of them selling some of their wafer allocation. In essence the "shortage" is fake and intentional to make 3000 series look good and get rid of stock.
 

Tigerick

Senior member
Apr 1, 2022
696
602
106
Spec              | GeForce RTX 4080             | Radeon RX 7900XT            | GeForce RTX 4090             | Radeon RX 7900XTX
SRP               | $1,199                       | $899                        | $1,599                       | $999
CUDA / SP         | 9728                         | 5376                        | 16384                        | 6144
Boost Clock       | 2.51 GHz                     | 2.4 GHz                     | 2.52 GHz                     | 2.5 GHz
Memory            | 256-bit 16GB GDDR6X, 23 Gbps | 320-bit 20GB GDDR6, 20 Gbps | 384-bit 24GB GDDR6X, 21 Gbps | 384-bit 24GB GDDR6, 20 Gbps
Memory BW         | 736 GB/sec                   | 800 GB/sec                  | 1008 GB/sec                  | 960 GB/sec
Power             | 320 W                        | 300 W                       | 450 W                        | 355 W
RT Cores          | 76                           | 84                          | 128                          | 96
Tensor / AI Cores | 304                          | 168                         | 512                          | 192
Cache             | 64 MB                        | 80 MB                       | 72 MB                        | 96 MB
Extra Features    | DLSS3                        | FSR3                        | DLSS3                        | FSR3

Above is a comparison chart between AMD's and nVidia's flagship GPUs for 2023. It will be interesting to watch how AMD is going to price the RX 7900XT, since some specs are better than the RTX 4080's. We may see nVidia adjusting the price of the RTX 4080 before they start selling it; happy to see another slap on Jensen's face :p
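
As a sanity check on the Memory BW row, a quick sketch using the usual relation (bus width in bits times per-pin data rate in Gbps, divided by 8). The bus widths, data rates and listed bandwidths are copied from the chart; the function and variable names are only illustrative.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bits per transfer * Gbps per pin / 8."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, data rate in Gbps, bandwidth listed in the chart in GB/s)
CHART = {
    "GeForce RTX 4080": (256, 23, 736),
    "Radeon RX 7900XT": (320, 20, 800),
    "GeForce RTX 4090": (384, 21, 1008),
    "Radeon RX 7900XTX": (384, 20, 960),
}

if __name__ == "__main__":
    for card, (bus, rate, listed) in CHART.items():
        computed = memory_bandwidth_gb_s(bus, rate)
        status = "matches" if computed == listed else "differs from"
        print(f"{card}: {computed:.0f} GB/s computed, {status} the chart")
```

All four rows are internally consistent, so the bandwidth figures in the chart appear to be derived straight from the bus width and memory speed rather than measured.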
 

KompuKare

Golden Member
Jul 28, 2009
1,191
1,487
136
Single player games.
Have to say that I only play single player games but I have never thought to myself:
"If only this added more fake frames between each real one, things would look smoother."
I really see zero or less value in DLSS 3.0 - certainly at the moment with the artifacts and in the future too unless the DL AI becomes truly sentient and plays the game for me. And even then, if I wanted to watch gaming footage I could go to YouTube!
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Spec           | GeForce RTX 4080             | Radeon RX 7900 ?            | GeForce RTX 4090             | Radeon RX 7900XT ?
SRP            | $1,199                       | $999 ?                      | $1,599                       | $1,299 ?
CUDA / SP      | 9728                         | 10240 - 10752 ?             | 16384                        | 10752 - 12288 ?
Boost Clock    | 2.51 GHz                     | ?                           | 2.52 GHz                     | ?
Memory         | 256-bit 16GB GDDR6X, 23 Gbps | 320-bit 20GB GDDR6, 20 Gbps | 384-bit 24GB GDDR6X, 21 Gbps | 384-bit 24GB GDDR6X, 20 Gbps
Memory BW      | 736 GB/sec                   | 800 GB/sec                  | 1008 GB/sec                  | 960 GB/sec
Power          | 320 W                        | ?                           | 450 W                        | ?
RT Cores       | 76                           | ?                           | 128                          | ?
Tensor Cores   | 304                          | ?                           | 512                          | ?
Cache          | 64 MB                        | 80 MB                       | 72 MB                        | 96 MB
Extra Features | DLSS3                        | FSR3 ?                      | DLSS3                        | FSR3 ?
PCIe Interface | Gen 4 - 31.5 GB/s            | Gen 5 - 63 GB/s             | Gen 4 - 31.5 GB/s            | Gen 5 - 63 GB/s

It would be odd for two cards with the same GPU to have different memory bus widths (320-bit vs 384-bit).

And if the XT is using GDDR6X, shouldn't it run faster than 20 Gbps? Otherwise, why waste the price/power on 6X if it has the same data rate as the standard GDDR6 on the non-XT?
 

Timorous

Golden Member
Oct 27, 2008
1,748
3,240
136
It would be odd for two cards with the same GPU to have different memory bus widths (320-bit vs 384-bit).

And if the XT is using GDDR6X, shouldn't it run faster than 20 Gbps? Otherwise, why waste the price/power on 6X if it has the same data rate as the standard GDDR6 on the non-XT?

For N31 it makes sense. 5x 64-bit MCDs is 320 and 6x 64-bit MCDs is 384. That is an advantage of the chiplet design: if you are going to cut the memory bus, you use less silicon.

I presume the GDDR6X is just a typo.
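
The MCD arithmetic can be sketched like this. The 64 bits per MCD is from the post above; the 4 GB and 16 MB per MCD are my own inference from dividing the quoted chart's 20 GB and 80 MB by 5 MCDs, so treat them as assumptions about a speculative chart rather than confirmed specs. `MemoryConfig` is a made-up name for illustration.

```python
from dataclasses import dataclass

BUS_BITS_PER_MCD = 64   # from the post above
VRAM_GB_PER_MCD = 4     # assumption: inferred as 20 GB / 5 MCDs from the chart
CACHE_MB_PER_MCD = 16   # assumption: inferred as 80 MB / 5 MCDs from the chart


@dataclass(frozen=True)
class MemoryConfig:
    """Memory subsystem totals that scale with the number of active MCDs."""
    mcds: int

    @property
    def bus_width_bits(self) -> int:
        return self.mcds * BUS_BITS_PER_MCD

    @property
    def vram_gb(self) -> int:
        return self.mcds * VRAM_GB_PER_MCD

    @property
    def cache_mb(self) -> int:
        return self.mcds * CACHE_MB_PER_MCD


if __name__ == "__main__":
    for mcds in (5, 6):
        cfg = MemoryConfig(mcds)
        print(f"{mcds} MCDs: {cfg.bus_width_bits}-bit, "
              f"{cfg.vram_gb} GB, {cfg.cache_mb} MB cache")
```

With those per-MCD figures, dropping one MCD reproduces the chart's 320-bit / 20 GB / 80 MB column from the 384-bit / 24 GB / 96 MB one, which fits the idea that the cut-down card simply ships with one fewer MCD.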