Question Speculation: RDNA3 + CDNA2 Architectures Thread

Page 20 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

uzzi38

Platinum Member
Oct 16, 2019
2,625
5,895
146

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Obviously a CG representation, but looks like two dies, each having four HBM memory stacks. Not quite the chiplet design I had imagined. Then a question is, does each module show up as a single GPU?
 

DisEnchantment

Golden Member
Mar 3, 2017
1,601
5,780
136
Stuka87 said:
Obviously a CG representation, but looks like two dies, each having four HBM memory stacks. Not quite the chiplet design I had imagined. Then a question is, does each module show up as a single GPU?
This one is interposer based only; it is like a high-bandwidth version of EPYC Naples chiplets. The full 3D stacked SoIC versions will come with RDNA3 and MI300.
 

leoneazzurro

Senior member
Jul 26, 2016
924
1,451
136

Better pictures.
 

Krteq

Senior member
May 22, 2015
991
671
136
MI200 has 95TF of fp32 performance. 🤯
I think it's TF32, not FP32

obrzek_2021-11-08_190p6ke5.png
 

Saylick

Diamond Member
Sep 10, 2012
3,127
6,304
136
Seems like it appears as 2 GPUs and isn't all that connected really. For sure wouldn't work for gaming.
Which is to be expected. It's clearly a compute-only part.
I believe RDNA3 is going to use an embedded bridge to connect the GCDs, rather than regular IF links like MI200 does. The silicon bridges in MI200 are between the GCD and HBM modules, which makes sense given the bandwidth required there. Same goes for RDNA3 between the GCDs.
 

Mopetar

Diamond Member
Jan 31, 2011
7,835
5,982
136
Some of the numbers are getting to the point of downright nutty. If you look at the BF/FP16 matrix numbers we're getting to the point where it's only a few more generations before we start having to measure the performance numbers in PFLOPs.
 

gdansk

Platinum Member
Feb 8, 2011
2,079
2,560
136
Mopetar said:
Some of the numbers are getting to the point of downright nutty. If you look at the BF/FP16 matrix numbers we're getting to the point where it's only a few more generations before we start having to measure the performance numbers in PFLOPs.
Another downright nutty figure is the TDP. In a few generations (maybe just one?) it'll be measured in kilowatts.
 

Saylick

Diamond Member
Sep 10, 2012
3,127
6,304
136
gdansk said:
Another downright nutty figure is the TDP. In a few generations (maybe just one?) it'll be measured in kilowatts.
An unavoidable side effect of packing more and more silicon in the same package. It used to be impossible to jam this much silicon onto the package due to reticle limits, but MCM and other advanced packaging techniques eliminate that constraint. As long as perf/W and perf/socket increase, increasing package power is of little consequence.
 

gdansk

Platinum Member
Feb 8, 2011
2,079
2,560
136
Meh, it is on an older node, and with 2 GPUs no less. My RTX 3090 peaks at around 420W, and can’t come close to these numbers, though the Instinct doesn’t use CUDA, so many would opt for NVIDIA anyway.
Isn't Ponte Vecchio actually rumored to be close to that?
I'm not saying it's AMD specific. It's an industry wide trend. Intel, Nvidia, AMD are all doing it in response to some demand.