Question Speculation: RDNA2 + CDNA Architectures thread


uzzi38

Platinum Member
Oct 16, 2019
2,634
5,961
146
All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have since backed up. Even so, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
Well, with Zen 2 you got Epyc 2 chips containing 8 chiplets and thus up to 64 cores. Imagine that kind of scalability for GPU CUs without having to use a single ~500mm² monolith.
It's not as easy as with CPUs. The terabytes of data crossing between chiplets would require some "big" xGMI controllers, and 8 compute GPU chiplets would probably hit diminishing returns.

An MCM Navi10 of only 2 chips, though, would have both at ~150mm2 (I added some redundancy)...

Even for big Navi, the compute die would be under 400mm2, while still allowing aggressive die harvesting if needed...
 

Mopetar

Diamond Member
Jan 31, 2011
7,837
5,992
136
Using a chiplet design isn't just about scaling the number of cores to the point where even a monolithic die will struggle to utilize them, but about making it more cost effective to offer those products.

Imagine AMD figures out a chiplet approach and makes 16 CU chiplets. Instead of needing three different dies for the Navi stack they can just release 1, 2, etc. chiplet products. It also makes the top end products easier to manufacture in a way that's in line with demand. If your monolithic die has 80 CUs then you need a perfect one for each of those cards you want to sell. With chiplets you just need 5 of them that don't have any defects out of the several hundred you get per wafer.
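To put rough numbers on that yield argument, here is a minimal sketch using the simple Poisson yield model (fraction of defect-free dies = exp(-defect density x area)). The defect density is a made-up illustrative value, the 505mm^2 figure is the rumored Navi21 size from earlier in the thread, and the chiplet area simply divides that by five, ignoring any interconnect overhead.

```python
import math

def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson model."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-defects_per_cm2 * area_cm2)

# Assumed values, for illustration only.
DEFECT_DENSITY = 0.1        # defects per cm^2 (hypothetical)
MONOLITHIC_AREA = 505.0     # mm^2, the Navi21 size rumored in this thread
CHIPLETS_PER_GPU = 5        # 5 x 16 CU = 80 CU
CHIPLET_AREA = MONOLITHIC_AREA / CHIPLETS_PER_GPU  # ignores interconnect overhead

mono_yield = poisson_yield(MONOLITHIC_AREA, DEFECT_DENSITY)
chiplet_yield = poisson_yield(CHIPLET_AREA, DEFECT_DENSITY)

print(f"perfect 80 CU monolithic dies: {mono_yield:.1%}")
print(f"perfect 16 CU chiplets:        {chiplet_yield:.1%}")
```

With these assumed inputs roughly 60% of the big dies come out perfect versus roughly 90% of the small ones, so far more of the wafer ends up usable in a full 80 CU product. Real yields depend on the process and on how much die harvesting is allowed, so treat this as a sketch only.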

You can also bin chiplets individually to combine the best to hit higher clock speeds overall. A perfect monolithic die might have a few CUs that can't be pushed quite as far which limits the end result.
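The binning point can be sketched the same way. The clock distribution below is invented purely for illustration; the only thing the sketch relies on is that a die can clock no higher than its slowest CU. The "monolithic" case fuses chiplet-sized blocks in fixed groups of five, while the chiplet case gets to sort first and pick the best five.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

CUS_PER_CHIPLET = 16
CHIPLETS_PER_GPU = 5
NUM_GPUS = 4  # hypothetical sample: 4 monolithic dies' worth of silicon

def chiplet_clock() -> float:
    """A chiplet is only as fast as its slowest CU (GHz, made-up distribution)."""
    return min(random.gauss(2.2, 0.05) for _ in range(CUS_PER_CHIPLET))

chiplets = [chiplet_clock() for _ in range(NUM_GPUS * CHIPLETS_PER_GPU)]

# Monolithic: blocks are fused in fixed groups, so each die is limited
# by the slowest block (and therefore the slowest CU) anywhere on it.
monolithic = [
    min(chiplets[i:i + CHIPLETS_PER_GPU])
    for i in range(0, len(chiplets), CHIPLETS_PER_GPU)
]

# Chiplet-based: sort first, then build the top product from the best 5.
best_five = sorted(chiplets, reverse=True)[:CHIPLETS_PER_GPU]

best_monolithic = max(monolithic)
best_chiplet_gpu = min(best_five)

print(f"best monolithic die:   {best_monolithic:.3f} GHz")
print(f"best binned 5-chiplet: {best_chiplet_gpu:.3f} GHz")
```

Whatever the random draw, the binned product can never come out slower than the best monolithic die here, because any fixed group of five has a slowest member no faster than the fifth-best chiplet overall.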

It's obvious there are obstacles to get to that point, but it's where AMD likely wants to go. It isn't just about being able to make some kind of monster GPU, but about having a lot more flexibility in what you produce so that channel inventory can be managed more effectively.
 

Grooveriding

Diamond Member
Dec 25, 2008
9,108
1,260
126
If what they showed was their fastest new GPU, going out on a firm limb and guessing they're not as fast as the 3080, and obviously the 3090. If that was the smaller chip, then we're probably going to see some real competition again once they release their new cards.
 

YBS1

Golden Member
May 14, 2000
1,945
129
106
If what they showed was their fastest new GPU, going out on a firm limb and guessing they're not as fast as the 3080, and obviously the 3090. If that was the smaller chip, then we're probably going to see some real competition again once they release their new cards.
I'm thinking it's the smaller chip for two reasons.
1) The majority of the presentation was spent talking about the 5900X only to then drop the 5950X bomb at the very end. Now, we all knew for certain there was going to be a 5950X, but I'll bet all of us thought it was going to be staggered into the lineup later just as the 3950X was.
2) Lisa Su continuously played the "pronoun game" every single time she had to call the card something. Big Navi, Radeon 6000 series, etc. Never once did she say 6800, 6900, XT etc.
Oh, and
3) nVidia's 3080 pricing indicates they are... concerned.
 

Krteq

Senior member
May 22, 2015
991
671
136

Glo.

Diamond Member
Apr 25, 2015
5,711
4,556
136
RTX 2080 Ti - 50% faster than RX 5700 XT at 4K, mainly due to VRAM buffer limit on RX 5700 XT.

Navi 21 has 80 CUs, 256 bit memory bus, 2.2 GHz clock speeds, at the very least

2.2 GHz is 16% above what RX 5700 XT clocked. And RDNA2 GPU are supposed to have higher IPC, than RDNA1.

Solely on CU counts, and VRAM size buffer, Navi 21 should achieve 100% performance above RX 5700 XT.

And that is excluding IPC and clock speed differences.

And yet, AMD demoed a GPU that is 70% faster at 4K than RX 5700 XT.

Something does not add up.
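For reference, the back-of-the-envelope arithmetic in this post can be written out as below. The 80 CU count, the 16% clock uplift and the 70% demo result are the figures quoted in the post; the 40 CU baseline for the RX 5700 XT is not stated there and is filled in here. The "naive ceiling" assumes perfectly linear scaling with CU count and clock and no IPC change, which is exactly the assumption under dispute.

```python
# Inputs taken from the post above, except where noted.
BASELINE_CUS = 40        # RX 5700 XT CU count (not stated in the post)
NAVI21_CUS = 80          # rumored
CLOCK_UPLIFT = 0.16      # 2.2 GHz vs RX 5700 XT, per the post
OBSERVED_UPLIFT = 0.70   # AMD's 4K demo vs RX 5700 XT, per the post

cu_ratio = NAVI21_CUS / BASELINE_CUS
naive_speedup = cu_ratio * (1 + CLOCK_UPLIFT)  # perfect scaling, no IPC gain
observed_speedup = 1 + OBSERVED_UPLIFT
shortfall = observed_speedup / naive_speedup

print(f"naive ceiling:    {naive_speedup:.2f}x")
print(f"observed:         {observed_speedup:.2f}x")
print(f"observed / naive: {shortfall:.0%}")
```

Under those assumptions the demoed card lands at roughly three quarters of the naive ceiling, which is the gap being pointed at.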
 

Zstream

Diamond Member
Oct 24, 2005
3,396
277
136
I think you’re wrong on the chiplet design. Imagine
RTX 2080 Ti - 50% faster than RX 5700 XT at 4K, mainly due to VRAM buffer limit on RX 5700 XT.

Navi 21 has 80 CUs, 256 bit memory bus, 2.2 GHz clock speeds, at the very least

2.2 GHz is 16% above what RX 5700 XT clocked. And RDNA2 GPU are supposed to have higher IPC, than RDNA1.

Solely on CU counts, and VRAM size buffer, Navi 21 should achieve 100% performance above RX 5700 XT.

And that is excluding IPC and clock speed differences.

And yet, AMD demoed a GPU that is 70% faster at 4K than RX 5700 XT.

Something does not add up.

Because the IPC increase isn't there.
 

moinmoin

Diamond Member
Jun 1, 2017
4,952
7,663
136
[Image: mememe_bae21ddc512c7496c84137d7780530ea-1.jpg]
 

Zstream

Diamond Member
Oct 24, 2005
3,396
277
136
I think you’re wrong on the chiplet design. An SoC small enough that it has two GPUs, sharing the cache via an Infinity Cache mechanism? Seems pretty logical to me.
 

Panino Manino

Senior member
Jan 28, 2017
821
1,022
136
I was thinking: everyone is "concerned" about how big Big Navi is, whether there's a Bigger Navi, and the existence of that massive cache, because of how much this will influence AMD's capability to compete with Nvidia on RT tasks, but... does this even matter?

I remember that AMD had said they would only deliver RT when they could deliver it from top to bottom for everyone. With RDNA 2 AMD succeeds in realizing this goal, right? No matter how small the die, every RDNA 2 GPU will be able to do hardware RT, something not guaranteed with Nvidia. And if the PS5 (and the SeX) is indicative enough, even without the Schrodinger cache RT performance is good enough. Isn't this what matters? That AMD will make the whole market RT-ready?
It'll take time for games to start using RT heavily and for everything, so the requirements will stay limited, and for better or worse AMD has no need to compete with Nvidia to win.
 

Konan

Senior member
Jul 28, 2017
360
291
106
Thinking...
  • "Limited Edition" and "XTX version" would be AMD-only cards
  • The "Reference Edition" is what the AIBs have been talking about to folks like Coreteks and a few others. Comes with the info of +15% above a 2080Ti, which kind of fits with what we just saw...
  • The next level up from what we saw will be a short run card

Kitty Korgi | VideoCardz
XTX Liquid Edition <Maybe> | Navi21 XTX 16GB (AMD exclusive?)
XT Limited Edition |
XT Reference Edition | Navi21 XT 16GB (AIB - faster than 3070)
XL 16GB | Navi21 XL 16GB (unknown)
| Navi21 XE ?GB (2021)

note (on Navi21 XTX): "NAVI 21 XTX - AMD's biggest secret, AIBs were not officially told about it yet, performance is unknown. Might launch as a reference-only model first. It has more CUs than Navi 21 XT."
 

Zstream

Diamond Member
Oct 24, 2005
3,396
277
136
I don't understand. If an integrated die (SOC) has a GPU, why split it into 2?
This would solve many of the multi-gpu problems. SFR is much easier to implement, but had huge issues with duplication. The unified cache solution would solve the duplication.

I also wouldn’t call it a single soc, but more of a mcm type with the IF as a crossbar if you will.

This is what Nvidia has researched but seemed not to have the unified cache. https://research.nvidia.com/sites/default/files/publications/ISCA_2017_MCMGPU.pdf
 

GodisanAtheist

Diamond Member
Nov 16, 2006
6,815
7,172
136
Maybe the presentation on the 28th goes something like this:

Wang/Su: "OK, we gave you a sneak peek of 'Big Navi' with 60CU and 192 bit memory..."

Everyone at AT: ???? Wat

Wang/Su: "Oh, but one more thing before we go. Say hello to 'Bigger Navi' with 80CU and 256 bit bus..."

Everyone at AT: !!!! Oh those sneaky guys !!!!
 

Guru

Senior member
May 5, 2017
830
361
106
GPUs are already very scalable; each CU is essentially a CCX or a processor in itself, whatever you want to call it. The issue comes down to feeding all the CUs and keeping them optimally used.

There is no need for chiplet design for GPU, at least not in the desktop market and similar. Super computers and such are a different story though.
 

Guru

Senior member
May 5, 2017
830
361
106
RTX 2080 Ti - 50% faster than RX 5700 XT at 4K, mainly due to VRAM buffer limit on RX 5700 XT.

Navi 21 has 80 CUs, 256 bit memory bus, 2.2 GHz clock speeds, at the very least

2.2 GHz is 16% above what RX 5700 XT clocked. And RDNA2 GPU are supposed to have higher IPC, than RDNA1.

Solely on CU counts, and VRAM size buffer, Navi 21 should achieve 100% performance above RX 5700 XT.

And that is excluding IPC and clock speed differences.

And yet, AMD demoed a GPU that is 70% faster at 4K than RX 5700 XT.

Something does not add up.
There is no way CUs would scale linearly. Doubling the CUs would require a major redesign of the core to optimize feeding all the CUs and shaders, which is a very intricate and tough thing to do. Considering the RTX 3080 is about 80% faster than the RX 5700 XT, it's pretty clear that AMD have achieved similar performance to Nvidia and about 80% scaling with double the CUs, which is quite a reasonable number.
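One way to read "about 80% scaling" is as the performance gained divided by the resources added. A minimal sketch of that definition follows; the function name and the definition itself are illustrative choices, not an established metric, and the inputs are the rough figures quoted in this thread.

```python
def scaling_efficiency(speedup: float, resource_ratio: float) -> float:
    """Share of the added resources that shows up as added performance.

    speedup:        new performance / old performance (1.8 means 80% faster)
    resource_ratio: new CU count / old CU count (2.0 means doubled)
    """
    if resource_ratio <= 1.0:
        raise ValueError("resource_ratio must be greater than 1")
    return (speedup - 1.0) / (resource_ratio - 1.0)

# Figures quoted in the thread: ~80% faster (3080-class) or ~70% faster
# (AMD's demo) than the RX 5700 XT, with double the CUs.
for label, speedup in [("80% faster", 1.8), ("70% faster", 1.7)]:
    eff = scaling_efficiency(speedup, resource_ratio=2.0)
    print(f"{label} with 2x CUs -> {eff:.0%} scaling")
```

This ignores clock speed and IPC changes entirely, so it only captures the CU-count part of the argument.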