Speculation: RDNA2 + CDNA Architectures thread


uzzi38

Platinum Member
Oct 16, 2019
All die sizes are within 5 mm². The poster here has been right about some things in the past AFAIK, and to his credit was the first to say 505 mm² for Navi 21, which other people have backed up. Even still, take the following with a pinch of salt.

Navi 21 - 505 mm²

Navi 22 - 340 mm²

Navi 23 - 240 mm²

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

TESKATLIPOKA

Platinum Member
May 1, 2020
130W TBP? The reference card + cooler is pretty small, so it's not out of the question.
I have to wonder about performance.
The RX 5500 XT Strix 8GB has 22 CUs, 32 ROPs, 128-bit GDDR6 and 1834 MHz on average, and that's 5.2 TFLOPs.
Navi 23 should have 32 CUs, 32-64 ROPs?, 64 MB IC?, 128-bit GDDR6, and at 2200 MHz on average that would mean 9 TFLOPs, or 73% higher.
I think it should perform somewhere between the RX 5600 XT and RX 5700.
With a reasonable price ($199-239) it could be a very good card for 1080p.
On the other hand, if this chip is really ~240 mm² even with higher density, while Navi 14 is 158 mm², then I must say I am not impressed, and once more I have to question whether Infinity Cache is really worth the size it occupies on a GPU.
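As a rough sanity check of those TFLOPs figures (just a sketch using the standard RDNA formula of 64 shaders per CU doing 2 FLOPs per clock; the Navi 23 CU count and clock are the rumoured values above, not confirmed specs):

```python
# FP32 throughput in TFLOPs = CUs * 64 shaders * 2 FLOPs/clock * clock (GHz) / 1000
def tflops(cus: int, clock_mhz: float) -> float:
    return cus * 64 * 2 * (clock_mhz / 1000) / 1000

rx5500xt = tflops(22, 1834)  # ~5.2 TFLOPs, the Strix figure quoted above
navi23 = tflops(32, 2200)    # ~9.0 TFLOPs for the rumoured 32 CU @ 2200 MHz

print(f"RX 5500 XT: {rx5500xt:.1f} TFLOPs")
print(f"Navi 23:    {navi23:.1f} TFLOPs ({navi23 / rx5500xt - 1:+.0%})")
```

That works out to roughly +74%, in line with the ~73% figure above once rounding is accounted for.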
 

Glo.

Diamond Member
Apr 25, 2015
TESKATLIPOKA said:
130W TBP? The reference card + cooler is pretty small, so it's not out of the question.
I have to wonder about performance.
The RX 5500 XT Strix 8GB has 22 CUs, 32 ROPs, 128-bit GDDR6 and 1834 MHz on average, and that's 5.2 TFLOPs.
Navi 23 should have 32 CUs, 32-64 ROPs?, 64 MB IC?, 128-bit GDDR6, and at 2200 MHz on average that would mean 9 TFLOPs, or 73% higher.
I think it should perform somewhere between the RX 5600 XT and RX 5700.
With a reasonable price ($199-239) it could be a very good card for 1080p.
On the other hand, if this chip is really ~240 mm² even with higher density, while Navi 14 is 158 mm², then I must say I am not impressed, and once more I have to question whether Infinity Cache is really worth the size it occupies on a GPU.

Actually, target it a little higher than just Navi 10 ;).

It should be between the RTX 2070 and the 2070 Super.

You understand why this is by far the most exciting GPU of them all, and why AMD engineers called this the "Nvidia Killer"? ;)
 

TESKATLIPOKA

Platinum Member
May 1, 2020
Faster than the RX 5700 XT with only 130W TBP? I am skeptical; ~2.3-2.4 GHz would be needed for that performance, if not more, and that's too high a clock for only 130W TBP in my opinion, when the full N21 is rated at 300W.
If it's really true, then N22 would be only 20% faster at best while being a lot bigger and having a much higher TBP; that doesn't look good for N22.
BTW, when will they finally present N22 and N23 to the public?
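For what it's worth, here is the arithmetic behind that ~2.3-2.4 GHz estimate (a sketch that assumes performance scales with raw FP32 throughput, which RDNA2's IPC gains and Infinity Cache may well invalidate):

```python
# RX 5700 XT: 40 CUs * 64 shaders * 2 FLOPs/clock at its 1905 MHz boost clock
rx5700xt_tflops = 40 * 64 * 2 * 1.905 / 1000  # ~9.75 TFLOPs

# Clock a 32 CU Navi 23 would need to match that on paper
needed_ghz = rx5700xt_tflops / (32 * 64 * 2 / 1000)
print(f"32 CUs would need ~{needed_ghz:.2f} GHz to match the 5700 XT's FP32 throughput")
```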
 

Glo.

Diamond Member
Apr 25, 2015
As I have said, the 40 CU die is 10-15% faster than the RTX 2080 Super.

When will they show those GPUs? I don't know, but I wouldn't expect anything before the end of Q1. That's just my opinion.
 

Gideon

Golden Member
Nov 27, 2007
A few games use it very well, but others don't. I played through Control and it makes a big difference there. DLSS is needed with RT, and even with a 3090 I had to play at 1080p to get 80+fps consistently, with 2560x1440 running at 45-60fps. The image looks soft and has sampling noise as mentioned earlier, but much nicer than playing at 4K without RT. I didn't care about RT/DLSS when buying the card but now think it's an important feature even today.
Control is the game I really wish would implement AMD's analog to DLSS (and I hope they do, as a next-gen console version with RT support is incoming):

Currently on my 6800 I get around 30 FPS at 1440p with RT. I'd consider ~50 FPS totally playable (and I reach that with 1080p upscaled to 1440p), but while the game engine's upscaling is semi-decent, it's nowhere near DLSS level.
 

biostud

Lifer
Feb 27, 2003
Why don't they keep the 256-bit bus coupled with 8 GB of memory for the midrange boards? Wouldn't that give better performance for 1080p cards in most situations than 192-bit + 12 GB?
 

uzzi38

Platinum Member
Oct 16, 2019
biostud said:
Why don't they keep the 256-bit bus coupled with 8 GB of memory for the midrange boards? Wouldn't that give better performance for 1080p cards in most situations than 192-bit + 12 GB?

Well, that's assuming N22 requires additional memory bandwidth. Given that it has half the CU count but three quarters of the GDDR6 memory bandwidth (the IC amount is still not confirmed), I would hazard a guess and say it probably doesn't.
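In rough numbers (my own back-of-the-envelope comparison, assuming the rumoured 192-bit/16 Gbps configuration for N22 and ignoring Infinity Cache entirely):

```python
# GDDR6 bandwidth in GB/s = bus width (bits) / 8 * data rate (Gbps)
def bandwidth(bus_bits: int, gbps: float) -> float:
    return bus_bits / 8 * gbps

n21 = bandwidth(256, 16)  # 512 GB/s feeding 80 CUs
n22 = bandwidth(192, 16)  # 384 GB/s feeding 40 CUs (rumoured)

print(f"N21: {n21:.0f} GB/s -> {n21 / 80:.1f} GB/s per CU")
print(f"N22: {n22:.0f} GB/s -> {n22 / 40:.1f} GB/s per CU")
```

Per CU, N22 would actually have half again as much raw GDDR6 bandwidth as N21, which is why a wider bus looks unnecessary.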
 

Gideon

Golden Member
Nov 27, 2007
biostud said:
Why don't they keep the 256-bit bus coupled with 8 GB of memory for the midrange boards? Wouldn't that give better performance for 1080p cards in most situations than 192-bit + 12 GB?

They probably want 12 GB of memory, and to offset the die size increase caused by Infinity Cache with a simpler memory controller.
 

Krteq

Senior member
May 22, 2015
An interesting dual-GPU entry appeared in the AotS benchmark DB.

 

Mopetar

Diamond Member
Jan 31, 2011
biostud said:
Why don't they keep the 256-bit bus coupled with 8 GB of memory for the midrange boards? Wouldn't that give better performance for 1080p cards in most situations than 192-bit + 12 GB?

Cost and product differentiation. Additional MCs and cache take up more die space, which increases production costs. The extra VRAM capacity (16 vs. 12 GB) won't matter at 1080p and (most likely) 1440p, and it seems doubtful that bandwidth or cache size would be a bottleneck at 1080p either.

I'm not even sure that these are really intended to target the 1080p market, since they'll still likely be overkill there. Instead, I view these as 1440p cards that will give acceptable performance, or as cards for someone who wants to run at 1080p with frame rates that push up against or even exceed the limits of their monitor. Of course big Navi does that even better, but the cost and availability may put some people off.

Navi 23 will likely be the card aimed at 1080p gamers who want good performance at an entry-level price. It will have up to 80% of the CUs of Navi 22, and it's expected to have 66% of the Infinity Cache, so the performance drop shouldn't be too drastic.
 

uzzi38

Platinum Member
Oct 16, 2019
Krteq said:
An interesting dual-GPU entry appeared in the AotS benchmark DB.

I won't say any more than this: it's not dual-GPU, nor is it anything RDNA3-related.

EDIT: I'm an idiot; I thought MCM when I saw dual-GPU because of what VCZ wrote about what wjm said.

What I mean to say is that wjm is wrong on this.
 

biostud

Lifer
Feb 27, 2003
Mopetar said:
Cost and product differentiation. Additional MCs and cache take up more die space, which increases production costs. The extra VRAM capacity (16 vs. 12 GB) won't matter at 1080p and (most likely) 1440p, and it seems doubtful that bandwidth or cache size would be a bottleneck at 1080p either.

I suggested they cut the memory to 8 GB, saving 4 GB to reduce cost, but then at a higher bandwidth.
 

Mopetar

Diamond Member
Jan 31, 2011
biostud said:
I suggested they cut the memory to 8 GB, saving 4 GB to reduce cost, but then at a higher bandwidth.

The only two ways to increase bandwidth in a traditional sense are to run a wider memory bus, which increases die size (and cost), or to use memory chips that clock higher, which also cost more.

Where do the cost savings come from here? The 8 GB cards will have a smaller bus, but would need a 50% clock boost just to have the same bandwidth as the cards with the wider bus and 12 GB of VRAM. There isn't any VRAM with that much headroom to tap into. Even the GDDR6X that Nvidia is using in their top-end cards isn't enough of a speed boost to make that feasible.

The only way your solution works out is if the card actually increases the size of the bus but uses lower-capacity (1 GB vs. 2 GB) memory chips. But once again, that increases the die size, and everyone would wonder why the card is only being sold with 8 GB of memory instead of 16 GB.
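To put numbers on that 50% figure (a sketch that assumes the 8 GB card drops to a 128-bit bus with 2 GB chips, which is one way to read the suggestion):

```python
def bandwidth(bus_bits: int, gbps: float) -> float:
    """GDDR6 bandwidth in GB/s."""
    return bus_bits / 8 * gbps

bw_192 = bandwidth(192, 16)  # 384 GB/s with 6x 2 GB chips (12 GB)
bw_128 = bandwidth(128, 16)  # 256 GB/s with 4x 2 GB chips (8 GB)

# Data rate the 128-bit card would need just to match the 192-bit card
needed_gbps = bw_192 / (128 / 8)
print(f"192-bit @ 16 Gbps: {bw_192:.0f} GB/s")
print(f"128-bit @ 16 Gbps: {bw_128:.0f} GB/s")
print(f"128-bit needs {needed_gbps:.0f} Gbps (+{needed_gbps / 16 - 1:.0%}) to catch up")
```

24 Gbps GDDR6 doesn't exist; even the GDDR6X on Nvidia's top cards runs at roughly 19-19.5 Gbps.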
 

biostud

Lifer
Feb 27, 2003
Mopetar said:
The only two ways to increase bandwidth in a traditional sense are to run a wider memory bus, which increases die size (and cost), or to use memory chips that clock higher, which also cost more.

Where do the cost savings come from here? The 8 GB cards will have a smaller bus, but would need a 50% clock boost just to have the same bandwidth as the cards with the wider bus and 12 GB of VRAM. There isn't any VRAM with that much headroom to tap into. Even the GDDR6X that Nvidia is using in their top-end cards isn't enough of a speed boost to make that feasible.

The only way your solution works out is if the card actually increases the size of the bus but uses lower-capacity (1 GB vs. 2 GB) memory chips. But once again, that increases the die size, and everyone would wonder why the card is only being sold with 8 GB of memory instead of 16 GB.

The 8 GB could run on a 256-bit bus just like the 6800 cards, and if they wanted to save a little more, they could opt for somewhat slower GDDR6. No one would wonder why the midrange cards have less memory than the high end; it's obviously there to differentiate the tiers and to cut costs.

I'm just wondering whether 12 GB/192-bit or 8 GB/256-bit would give the most performance/$ in the midrange.
 

Mopetar

Diamond Member
Jan 31, 2011
The midrange cards have half the CUs, so they don't need the same bandwidth, considering they still have a lot of Infinity Cache to go along with the 192-bit bus. I don't really know to what degree the amount of Infinity Cache is tied to the number of memory controllers, but if it is, then they'd wind up with 128 MB of that as well. The extra memory controllers and cache are going to cost a lot more than you'd save by using 8x 1 GB memory chips instead of 6x 2 GB chips.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
For 1080p even 32 MB IC would have >50% hit rate, just a few % less than what N21 has with 128 MB at 4K.
I personally think N23 will have 64 MB IC, which would provide ~70% hit rate at 1080p.
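To show why hit rate matters so much, here is a very crude effective-bandwidth model (entirely my own sketch: it assumes cache hits cost no GDDR6 bandwidth and that the cache itself is never the bottleneck, so treat the outputs as illustrative only):

```python
# If a fraction `hit_rate` of requests is served from Infinity Cache, only the misses
# touch GDDR6, so the same DRAM can sustain roughly 1 / (1 - hit_rate) as much traffic.
def effective_bandwidth(dram_gbs: float, hit_rate: float) -> float:
    return dram_gbs / (1.0 - hit_rate)

n23_dram = 128 / 8 * 16  # 256 GB/s from the rumoured 128-bit 16 Gbps GDDR6
for hit_rate in (0.5, 0.7):
    print(f"{hit_rate:.0%} hit rate -> ~{effective_bandwidth(n23_dram, hit_rate):.0f} GB/s effective")
```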
 

biostud

Lifer
Feb 27, 2003
Mopetar said:
The midrange cards have half the CUs, so they don't need the same bandwidth, considering they still have a lot of Infinity Cache to go along with the 192-bit bus. I don't really know to what degree the amount of Infinity Cache is tied to the number of memory controllers, but if it is, then they'd wind up with 128 MB of that as well. The extra memory controllers and cache are going to cost a lot more than you'd save by using 8x 1 GB memory chips instead of 6x 2 GB chips.

I don't think the cache size is tied to the memory controllers. I think they could reduce it to 64 MB and still have a 256-bit memory controller. But the engineers at AMD probably have a better understanding of what will give the best results than my speculations do. :p
 

TESKATLIPOKA

Platinum Member
May 1, 2020
biostud said:
I don't think the cache size is tied to the memory controllers. I think they could reduce it to 64 MB and still have a 256-bit memory controller. But the engineers at AMD probably have a better understanding of what will give the best results than my speculations do. :p

If N21 with 80 CUs has only 128 MB of IC and a 256-bit memory controller, then N22 with only 40 CUs and 64 MB of IC doesn't need 256-bit; even 128-bit should be good enough.
 

Mopetar

Diamond Member
Jan 31, 2011
AMD did show off information about the hit rate of infinity cache at various resolutions, so we can at least use what they showed us as the basis for an argument even if there are some cases that fall outside of the typical results.

[graph.png: AMD's Infinity Cache hit-rate curves vs. cache size at 1080p, 1440p and 4K]

I added a few lines just to make it a little easier to compare the resolutions. The bottom line is the intersection of the 4K curve and 128 MB of infinity cache. The top line is the intersection of the 1440p curve and 96 MB of infinity cache.

96 MB of Infinity Cache has a better hit rate at 1440p than 128 MB does at 4K, so even with 40 CUs, the Navi 22 cards should perform reasonably well at 1440p. TPU puts the 5700 XT at an 87 FPS average across 22 games in their 1440p tests, so that's a good baseline for where the 6700 XT should be. Even without the Infinity Cache, just using faster GDDR6 memory like they have with Navi 21 would alone give it about 85% of Navi 10's bandwidth despite only having 75% of the bus width. The 64 MB of Infinity Cache in Navi 23 should have a better hit rate than the bigger cards do at their respective resolutions, but the graph does show reasonable growth up to about 64 MB, which is where it starts to taper off.

I don't know if that necessarily makes the setup they've used overkill for Navi 22, though. Navi 10 did have a 256-bit memory bus, so it's obvious that AMD needs enough Infinity Cache to compensate for that. If they wanted to do it through memory clock speed alone, they'd need VRAM clocked 33% faster than what the 5700 XT uses. Navi 21 is using faster memory, but it's only about 15% faster, so not enough to close that gap alone. AMD could also be stuck using the older, slower VRAM that Navi 10 used simply due to supply constraints, but in either case they need something to pick up a little bit of the slack.

The additional capacity is likely a result of consoles moving to 16 GB of available memory. Obviously they split that between the CPU and GPU, but 10-12 GB is going to become the new norm over time. If someone buys one of these cards with the intention of holding on to it for five years, I suspect that's when we'll see a lot of titles where 8 GB isn't good enough, particularly at resolutions above 1080p. If you think of Navi 22 as a 1080p card, then yes, the extra 32 MB of Infinity Cache doesn't get you much compared to what you get with only 64 MB. But these are going to be positioned as 1440p cards, and I think that if the clock speeds wind up being as good as they were with Navi 21, they could also serve as entry-level 4K cards in much the same way that the 3060 Ti can pull an acceptable average frame rate in many titles at that resolution.
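The bandwidth ratios above check out; a quick sketch (assuming Navi 10's 14 Gbps GDDR6 and that Navi 22 gets the same 16 Gbps chips as Navi 21):

```python
navi10_bw = 256 / 8 * 14  # 448 GB/s on the RX 5700 XT
navi22_bw = 192 / 8 * 16  # 384 GB/s on the rumoured 192-bit @ 16 Gbps

print(f"Navi 22 vs Navi 10 bandwidth: {navi22_bw / navi10_bw:.0%}")  # ~86%, the '85%' above
print(f"Bus-width ratio:             {192 / 256:.0%}")               # 75%

# Data rate a 192-bit bus would need to match Navi 10 without any cache help
needed = navi10_bw / (192 / 8)
print(f"192-bit needs {needed:.1f} Gbps (+{needed / 14 - 1:.0%} over 14 Gbps) to match Navi 10")
print(f"16 vs 14 Gbps is only +{16 / 14 - 1:.0%}")
```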
 

TESKATLIPOKA

Platinum Member
May 1, 2020
Mopetar said:
I don't know if that necessarily makes the setup they've used overkill for Navi 22, though. Navi 10 did have a 256-bit memory bus, so it's obvious that AMD needs enough Infinity Cache to compensate for that. If they wanted to do it through memory clock speed alone, they'd need VRAM clocked 33% faster than what the 5700 XT uses. Navi 21 is using faster memory, but it's only about 15% faster, so not enough to close that gap alone. AMD could also be stuck using the older, slower VRAM that Navi 10 used simply due to supply constraints, but in either case they need something to pick up a little bit of the slack.

The additional capacity is likely a result of consoles moving to 16 GB of available memory. Obviously they split that between the CPU and GPU, but 10-12 GB is going to become the new norm over time. If someone buys one of these cards with the intention of holding on to it for five years, I suspect that's when we'll see a lot of titles where 8 GB isn't good enough, particularly at resolutions above 1080p. If you think of Navi 22 as a 1080p card, then yes, the extra 32 MB of Infinity Cache doesn't get you much compared to what you get with only 64 MB. But these are going to be positioned as 1440p cards, and I think that if the clock speeds wind up being as good as they were with Navi 21, they could also serve as entry-level 4K cards in much the same way that the 3060 Ti can pull an acceptable average frame rate in many titles at that resolution.

I don't understand why you compare Navi 22 against Navi 10 when we have Navi 21.
Full Navi 21 has 2x as many CUs at higher clock speeds than Navi 10, yet it has the same 256-bit bus width; the only differences are faster 16 Gbps GDDR6 instead of 14 Gbps, and 128 MB of IC.
On the other hand, Navi 22 has 1/2 the CUs and 3/4 of the bandwidth (192-bit, 16 Gbps) and IC (96 MB).
So either N22 has an overkill setup even if they use only 14 Gbps GDDR6, or 256-bit + 128 MB is not enough for N21, which is unlikely considering N21 shows its biggest lead over Navi 10 at 4K.
It will be interesting to compare N23 (32 CU, 64 ROPs, 64 MB IC, 128-bit GDDR6) against N22 (40 CU, 96 ROPs, 96 MB IC, 192-bit GDDR6).
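Laying that scaling argument out numerically (my own summary, using the rumoured configurations discussed above; none of the N22/N23 figures were confirmed at the time):

```python
# name: (CUs, Infinity Cache in MB, bus width in bits, data rate in Gbps)
chips = {
    "Navi 21": (80, 128, 256, 16),
    "Navi 22": (40, 96, 192, 16),
    "Navi 23": (32, 64, 128, 16),
}

for name, (cus, ic_mb, bus_bits, gbps) in chips.items():
    bw = bus_bits / 8 * gbps
    print(f"{name}: {bw:.0f} GB/s ({bw / cus:.1f} GB/s per CU), "
          f"{ic_mb} MB IC ({ic_mb / cus:.1f} MB per CU)")
```

Per CU, both N22 and N23 end up with more raw bandwidth and more cache than N21, which is exactly the "overkill" point.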
 

menhera

Junior Member
Dec 10, 2020
TESKATLIPOKA said:
I don't understand why you compare Navi 22 against Navi 10 when we have Navi 21.
Full Navi 21 has 2x as many CUs at higher clock speeds than Navi 10, yet it has the same 256-bit bus width; the only differences are faster 16 Gbps GDDR6 instead of 14 Gbps, and 128 MB of IC.
On the other hand, Navi 22 has 1/2 the CUs and 3/4 of the bandwidth (192-bit, 16 Gbps) and IC (96 MB).
So either N22 has an overkill setup even if they use only 14 Gbps GDDR6, or 256-bit + 128 MB is not enough for N21, which is unlikely considering N21 shows its biggest lead over Navi 10 at 4K.
It will be interesting to compare N23 (32 CU, 64 ROPs, 64 MB IC, 128-bit GDDR6) against N22 (40 CU, 96 ROPs, 96 MB IC, 192-bit GDDR6).

The 6700 XT is expected to boost higher than Navi 21. Even an AMD slide shows performance per clock starts to fall off above 2200 MHz. IMO the 6700 XT absolutely needs the 192-bit bus + 96 MB L3 cache if its boost clock is really 2500 MHz. It'll only be about as good as the 3060 Ti at best, though.
 

Attachment: 61-1080.1e39297f.png