Question Speculation: RDNA2 + CDNA Architectures thread


uzzi38

Platinum Member
Oct 16, 2019
2,705
6,427
146
All die sizes are within 5mm^2. The poster here has been right about some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have since backed up. Even so, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
RX 6900XT seems like a beast: 80 CUs, 96 ROPs, 2.2GHz boost clock, 16GB of 16Gbps GDDR6 memory on a 256-bit interface.

If it's 70% faster than the RX 5700 XT, that would put it at RTX 3080 levels of performance while consuming less power and being built on a smaller node, with AMD able to price it very competitively.

Though if it's as fast as the RTX 3080, I expect it to be in the same price range.

Navi 21 - 4SE, 8SA, 40 WGP / 80 CU, 128 ROPs, 256 bit GDDR6 or 2048 bit HBM2E, 16 GB. My guess is 2048 bit HBM2E and performance on par with RTX 3090 at 275w.
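For rough context on those two configs, here's a quick back-of-the-envelope peak-bandwidth comparison; the 16 Gbps GDDR6 figure comes from the rumor above, while the HBM2E per-pin rates are assumptions (production stacks span roughly 2.4 to 3.2 Gbps/pin):

```python
# Rough peak-bandwidth math for the two rumoured Navi 21 memory configs.
# The 16 Gbps GDDR6 figure is from the rumour above; the HBM2E pin speeds
# are assumptions (stacks ship anywhere from ~2.4 to ~3.2 Gbps/pin).

def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s = (bus width / 8 bits per byte) * per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps_per_pin

gddr6      = peak_bandwidth_gbs(256, 16.0)   # 256-bit GDDR6 @ 16 Gbps
hbm2e_low  = peak_bandwidth_gbs(2048, 2.4)   # 2048-bit HBM2E @ 2.4 Gbps/pin (assumed)
hbm2e_high = peak_bandwidth_gbs(2048, 3.2)   # 2048-bit HBM2E @ 3.2 Gbps/pin (assumed)

print(f"256-bit GDDR6 @ 16 Gbps     : {gddr6:.0f} GB/s")       # 512 GB/s
print(f"2048-bit HBM2E @ 2.4 Gbps   : {hbm2e_low:.0f} GB/s")   # ~614 GB/s
print(f"2048-bit HBM2E @ 3.2 Gbps   : {hbm2e_high:.0f} GB/s")  # ~819 GB/s
```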
 

swilli89

Golden Member
Mar 23, 2010
1,558
1,181
136
I know it's just a dumb render in Fortnite, but it seems like everyone forgot that the rear heatsink anchor layout was highly suggestive of HBM.
I had no idea this was even a thing! I know it's senseless to even worry about 256-bit whatever. This is AMD; they've been using HBM for over half a decade and I'm sure they know how to engineer a card with the requisite memory subsystem.

I suppose the real fear surrounding the 256-bit thing would be that AMD is not launching a halo card and is instead targeting the upper mid-range and below, à la the 5700 XT. Fingers crossed this isn't the case.
 

PhoBoChai

Member
Oct 10, 2017
119
389
106
I think all this concern about bus width and bandwidth is premature. AMD hasn't really ever released a GPU that is that badly bandwidth bottlenecked, outside of market segmentation reasons (5600XT).
 

uzzi38

Platinum Member
Oct 16, 2019
2,705
6,427
146
Navi 21 - 4SE, 8SA, 40 WGP / 80 CU, 128 ROPs, 256 bit GDDR6 or 2048 bit HBM2E, 16 GB. My guess is 2048 bit HBM2E and performance on par with RTX 3090 at 275w.

Easily, if they use HBM2E. I'm still quite worried about that, though; everyone's super inconsistent on it :/

Part of me worries that instead of aiming for the crown, they might settle for around 3080 performance with a 256-bit bus. Which is still plenty fine and all, but considering this is the best shot they've had at the crown in a long time, I was really hoping they'd take it and go all the way.

I think all this concern about bus width and bandwidth is premature. AMD hasn't really ever released a GPU that is that badly bandwidth bottlenecked, outside of market segmentation reasons (5600XT).

Going to be honest, that was my biggest thought when I saw RDNA2 clocks. There was no way they'd be clocking the GPUs that high if they were seriously bandwidth bound - it simply didn't make sense. But god is the wait for more info killing me here.
 

Mopetar

Diamond Member
Jan 31, 2011
8,114
6,770
136
I think all this concern about bus width and bandwidth is premature. AMD hasn't really ever released a GPU that is that badly bandwidth bottlenecked, outside of market segmentation reasons (5600XT).

The Polaris cards generally saw better FPS gains from a memory OC than from anything else. Part of that was that a significant core overclock wasn't possible for most people, but the fact that the better gains (which tended to scale roughly linearly within what most cards could manage) came from a memory overclock suggests there was more of a bottleneck there than in other parts of the card - though that could also have been a result of the core clocks already being pushed as far as they could go.
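To illustrate the reasoning, a tiny sketch of the scaling check being described; the clocks and FPS numbers are made up for illustration, not actual Polaris measurements:

```python
# Toy illustration of the "linear FPS gains from a memory OC" observation above.
# If a workload is fully bandwidth-bound, FPS should scale ~1:1 with memory clock.
# All numbers here are hypothetical, not measured Polaris data.

stock_mem_mhz = 2000   # hypothetical baseline memory clock
oc_mem_mhz    = 2250   # hypothetical +12.5% memory overclock
stock_fps     = 60.0

bandwidth_gain   = oc_mem_mhz / stock_mem_mhz - 1     # +12.5%
fps_if_bw_bound  = stock_fps * (1 + bandwidth_gain)   # ~67.5 FPS if purely bandwidth-limited

print(f"Memory OC: +{bandwidth_gain:.1%}")
print(f"Expected FPS if bandwidth-bound: {fps_if_bw_bound:.1f}")
# If measured FPS gains track this line closely, bandwidth is the likely bottleneck;
# if they fall well short, the card is limited elsewhere (core clock, geometry, etc.).
```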
 

HurleyBird

Platinum Member
Apr 22, 2003
2,760
1,455
136
If they had churned out FEAR RTX I would have been a bit more impressed.

OMG yes. I have no idea how Monolith went backwards, not just in terms of gameplay but arguably along various graphical dimensions, with each FEAR game. And the original FEAR runs like total *** on modern systems too.
 

BlitzWulf

Member
Mar 3, 2016
165
73
101
I know this guy isn't anyone's favorite "source", but it's been a slow couple of days, so why not?



He seems to be doubling down on the on-die cache rumor and claims HBM for consumer cards is a no-go.

Also claims that AMD put fake info in the Mac and Linux firmware dumps to throw people off. o_O

[Attached slides: m1.png – m5.png]
 

TitusTroy

Senior member
Dec 17, 2005
335
40
91
will Big Navi be more gaming focused compared to Ampere?...and will it be better for 1440p compared to Ampere?...Nvidia seemed to overload their new cards with compute performance vs strictly gaming and it seems best served at 4K resolution...

I would love to go with AMD this generation but I bought a 1440p 144hz G-Sync monitor 2 years ago and I don't see myself upgrading anytime soon...
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
I know this guy isn't anyone's favorite "source", but it's been a slow couple of days, so why not?



He seems to be doubling down on the on-die cache rumor and claims HBM for consumer cards is a no-go.

Also claims that AMD put fake info in the Mac and Linux firmware dumps to throw people off. o_O

What a load of trash. Why even post it here?
Linux 5.9 has already sailed. If AMD put fake stuff in there, then none of the RDNA2 hardware is going to work. What would even be the point of issuing a drm fix for 5.9? Why bother upstreaming anything at all?
Yet he has heard "rumors" about 32 CU and 40 CU parts; wonder where those came from.
Different CU design for N22 and N23? Sure.
 

PhoBoChai

Member
Oct 10, 2017
119
389
106
will Big Navi be more gaming focused compared to Ampere?...and will it be better for 1440p compared to Ampere?...Nvidia seemed to overload their new cards with compute performance vs strictly gaming and it seems best served at 4K resolution...

I would love to go with AMD this generation but I bought a 1440p 144hz G-Sync monitor 2 years ago and I don't see myself upgrading anytime soon...

GA102 is really good for workstation apps and rendering, where the 2x FP32 shines and, without graphics bottlenecks, even the RT cores can flex their power - as seen in Chaos V-Ray, where the 3080 with 68 SMs is 2x faster than the 2080 Ti with the same SM count.

So yeah, it definitely seems like NV went ballz deep on datacenter & workstation design for Ampere.
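For a rough sense of where that 2x comes from on paper, a small sketch using reference boost clocks and the commonly cited per-SM FP32 widths (128 lanes for Ampere, 64 for Turing); real cards boost higher, so treat this as a ballpark:

```python
# Why a 68-SM GA102 part can roughly double a 68-SM TU102 part in compute-bound
# renderers: Ampere SMs can issue up to 128 FP32 FMAs/clock vs 64 on Turing.
# Boost clocks below are the official reference values.

def fp32_tflops(sms: int, fp32_lanes_per_sm: int, boost_ghz: float) -> float:
    # 2 FLOPs per FMA (multiply + add)
    return sms * fp32_lanes_per_sm * 2 * boost_ghz / 1000

rtx_3080   = fp32_tflops(68, 128, 1.710)  # ~29.8 TFLOPS
rtx_2080ti = fp32_tflops(68, 64, 1.545)   # ~13.4 TFLOPS

print(f"RTX 3080   : {rtx_3080:.1f} TFLOPS FP32")
print(f"RTX 2080 Ti: {rtx_2080ti:.1f} TFLOPS FP32")
print(f"Ratio      : {rtx_3080 / rtx_2080ti:.2f}x")  # ~2.2x on paper
```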
 

TitusTroy

Senior member
Dec 17, 2005
335
40
91
GA102 is really good for workstation apps and rendering, where the 2x FP32 shines and, without graphics bottlenecks, even the RT cores can flex their power - as seen in Chaos V-Ray, where the 3080 with 68 SMs is 2x faster than the 2080 Ti with the same SM count.

So yeah, it definitely seems like NV went ballz deep on datacenter & workstation design for Ampere.

so even the upcoming RTX 3070 won't change anything?...it'll still be heavily focused on compute performance vs gaming?
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
so even the upcoming RTX 3070 won't change anything?...it'll still be heavily focused on compute performance vs gaming?

We'll see soon enough :) Same architecture of course, but it might balance out a bit differently. They'll certainly have to tame the power draw much more aggressively in the mid-range chips.

Given how large those various markets are for them now, it's easy to see why they might have built the top gaming chip, at least, with a considerable focus on compute.
 

soresu

Diamond Member
Dec 19, 2014
3,230
2,515
136
OMG yes. I have no idea how Monolith went backwards, not just in terms of gameplay but arguably along various graphical dimensions, with each FEAR game. And the original FEAR runs like total *** on modern systems too.
An effort has started to fix up a source code release of the LithTech Jupiter engine, but sadly I wouldn't expect it to be stable and feature-complete anytime soon given the rate of improvement thus far.
 

PhoBoChai

Member
Oct 10, 2017
119
389
106
so even the upcoming RTX 3070 won't change anything?...it'll still be heavily focused on compute performance vs gaming?

I think the 3070 will be more efficient for graphics because it's going to have the 6 GPC layout, which means 6 graphics pipelines where it counts; and whether or not its SMs keep the 2x FP32 potential, it at least has basically the same graphics front-end as the 3080 (also 6 GPCs), just with fewer SMs.
 

beginner99

Diamond Member
Jun 2, 2009
5,233
1,610
136
Linux 5.9 already sailed. If AMD put fake stuffs there then none of the RDNA2 HW is going to work. What is the point of issuing a drm fix for 5.9 even. Why even bother upstreaming anything.

Since I have no clue how this works: can't AMD just fix it later on? I mean, what distros actually ship with a 5.9 kernel? They must be able to fix it, else anyone on, say, Ubuntu 20.04 (kernel 5.4) couldn't use a new AMD GPU. So there must be a mechanism to fix this later on?
 

andermans

Member
Sep 11, 2020
151
153
76
The new cache is a given at this point, with the unified L1$ design in RDNA 1 opening possibilities for the changes that the dynamic CU cache patent and research paper showed. I don't think it will be a large 128 MB ESRAM as rumored. Just a larger L1$ and a larger L2$ will do the job fine.


I don't think most of the research papers and patents are too relevant for memory bandwidth though. They're mostly about the L0<->L1<->L2 path while for memory bandwidth the main mitigation would be increasing the effective capacity of the last level cache, either by making L2 larger or introducing a new L3. Don't get me wrong, L0<->L1<->L2 improvements could be interesting for performance (especially for raytracing which is pretty hard on the cache hierarchy), but they don't solve the problem everyone seems to be worried about.
 

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
Since I have no clue how this works: can't AMD just fix it later on? I mean, what distros actually ship with a 5.9 kernel? They must be able to fix it, else anyone on, say, Ubuntu 20.04 (kernel 5.4) couldn't use a new AMD GPU. So there must be a mechanism to fix this later on?
Ubuntu NEVER updates their ISOs. If you have an ISO with kernel 5.4, that is what you will get.

You have to wait for the next version of the OS to get the desired update, and even then only if that Ubuntu release ships with the latest kernel. Ubuntu 20.10 will have "only" the 5.8 kernel.

Pop!_OS takes a different approach, despite being based on the same OS (Ubuntu). Pop always updates the ISO on their website with the latest hardware support. For example, for a while you couldn't run an RX 590 on the latest Ubuntu, but with the latest Pop!_OS ISO it worked out of the box.
 

andermans

Member
Sep 11, 2020
151
153
76
Since I have no clue how this works: can't AMD just fix it later on? I mean, what distros actually ship with a 5.9 kernel? They must be able to fix it, else anyone on, say, Ubuntu 20.04 (kernel 5.4) couldn't use a new AMD GPU. So there must be a mechanism to fix this later on?

For small fixes you can backport to older kernels, but generally Ubuntu updates some components of LTS releases with an HWE package, which pulls in some driver stuff from a newer regular release. IIRC a newer kernel is also included in this. Though I guess Ubuntu 20.10 might be getting 5.8?

(As per the poster above, I'm not sure those updates get included in the ISO, but they should be available as online updates. Worst comes to worst, you'll get basic vesa? support until you update your packages.)
 

PhoBoChai

Member
Oct 10, 2017
119
389
106
I don't think most of the research papers and patents are too relevant for memory bandwidth though. They're mostly about the L0<->L1<->L2 path while for memory bandwidth the main mitigation would be increasing the effective capacity of the last level cache, either by making L2 larger or introducing a new L3. Don't get me wrong, L0<->L1<->L2 improvements could be interesting for performance (especially for raytracing which is pretty hard on the cache hierarchy), but they don't solve the problem everyone seems to be worried about.


It's all related.

When the L1 spills over into the L2 less often, the L2 is more effective as a data and texture cache, so you have to go out to memory less often, which takes the pressure off memory bandwidth.

Making the L1 and L2 work better increases their hit rates and reduces duplication. A cache that hits more often and holds more useful data reduces the demand on off-chip memory, so you don't need as much bandwidth for the same performance - aka better bandwidth efficiency.

Classic example of a cache reducing memory bandwidth requirements: the Xbox One, where just 32 MB of ESRAM did wonders alongside plain DDR3 system RAM.
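As a minimal sketch of that bandwidth-efficiency argument, here's how a last-level cache's hit rate translates into off-chip bandwidth demand; the traffic figure and hit rates are illustrative guesses, not numbers for any real GPU:

```python
# Minimal sketch of the "hit rate -> bandwidth efficiency" argument above.
# If a last-level cache services a fraction `llc_hit_rate` of memory requests,
# only the misses go out to DRAM. All numbers here are illustrative guesses.

def dram_traffic(total_traffic_gbs: float, llc_hit_rate: float) -> float:
    """Off-chip traffic remaining after the last-level cache filters requests."""
    return total_traffic_gbs * (1.0 - llc_hit_rate)

demand = 800.0  # GB/s of raw request traffic from the shader/ROP side (made up)
for hit_rate in (0.0, 0.3, 0.5, 0.7):
    print(f"LLC hit rate {hit_rate:.0%}: needs ~{dram_traffic(demand, hit_rate):.0f} GB/s of DRAM bandwidth")

# At a 50% hit rate, a 512 GB/s bus behaves like a ~1 TB/s one from the GPU's
# point of view -- which is the whole premise of the big on-die cache rumour.
```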
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,340
5,464
136
I know this guy isn't anyone's favorite "source", but it's been a slow couple of days, so why not?


He seems to be doubling down on the on-die cache rumor and claims HBM for consumer cards is a no-go.

Also claims that AMD put fake info in the Mac and Linux firmware dumps to throw people off. o_O

[Attachments 30722, 30723, 30724, 30726, 30727]

Hate these clickbait channels and refuse to watch them, but you put up the slides so I don't have to. So kudos.

Basically I don't see anything unique. He seems to just be re-spinning the current rumor set. But clickbait's gotta generate the clicks...