Speculation: RDNA2 + CDNA Architectures thread


uzzi38

Platinum Member
Oct 16, 2019
2,635
5,983
146
All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have backed up. Even still, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

Krteq

Senior member
May 22, 2015
991
671
136
RedGamingTech is peddling the idea that AMD is using some sort of "breakthrough memory cache system", a.k.a. 'Infinity Cache', that does not require more VRAM bandwidth. I.e., RDNA2 uses a huge cache, called Infinity Cache, that could be 128 MB (or 64 MB) in size.

Others speculate that this larger cache could make sense because it would be a set up for RDNA3 MCM as well as lessen inter-core performance hits.
With many people expecting RDNA2 to have a large increase in compute power you would think that a wider memory system is needed in order to provide more data to the shaders.

Current Navi has a 256-bit bus, so max memory bandwidth doesn't look like it's increasing much, if at all. Without some other piece of information or technology, it looks like the additional shaders will be underfed.

The larger cache, mixed with some pixie dust :innocent:, is speculated to overcome the limitations of a 256-bit bus, all wrapped up nicely in one package.
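
For context on that 256-bit point, here is a minimal back-of-the-envelope sketch. The per-pin speed matches Navi 10's 14 Gbps GDDR6, but the cache hit rate and on-die cache bandwidth below are purely illustrative assumptions, not leaked figures:

```python
# Rough back-of-the-envelope for the 256-bit bus point. Per-pin speed matches
# Navi 10's 14 Gbps GDDR6; the hit rate and on-die bandwidth are guesses.

def bus_bandwidth_gb_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak DRAM bandwidth in GB/s for a given bus width and per-pin data rate."""
    return bus_width_bits / 8 * gbps_per_pin

dram_bw = bus_bandwidth_gb_s(256, 14.0)   # 448 GB/s, same as the 5700 XT
print(f"256-bit @ 14 Gbps: {dram_bw:.0f} GB/s")

# If a large on-die cache serviced some fraction of requests at much higher
# bandwidth, the average bandwidth seen by the shaders would rise:
hit_rate = 0.5       # assumed for illustration
cache_bw = 1600.0    # GB/s, assumed on-die figure
effective_bw = hit_rate * cache_bw + (1 - hit_rate) * dram_bw
print(f"Effective bandwidth at {hit_rate:.0%} hit rate: {effective_bw:.0f} GB/s")
```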

Hmm, is he referring to this patent?

ADAPTIVE CACHE RECONFIGURATION VIA CLUSTERING

Abstract
A method of dynamic cache configuration includes determining, for a first clustering configuration, whether a current cache miss rate exceeds a miss rate threshold. The first clustering configuration includes a plurality of graphics processing unit (GPU) compute units clustered into a first plurality of compute unit clusters. The method further includes clustering, based on the current cache miss rate exceeding the miss rate threshold, the plurality of GPU compute units into a second clustering configuration having a second plurality of compute unit clusters fewer than the first plurality of compute unit clusters.
That's quite a big change to the cache subsystem.
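
Reading just the abstract, the mechanism sounds like a straightforward feedback loop: if too many misses occur under the current clustering, merge compute units into fewer, larger clusters that share cache. A rough Python sketch of that reading (the CU count, threshold and halving policy are my own illustrative guesses, not from the patent):

```python
# Minimal sketch of the control loop described in the abstract: CU clusters
# share cache slices, and if the observed miss rate under the current
# clustering is too high, the CUs are re-clustered into fewer, larger clusters.

from dataclasses import dataclass

@dataclass
class ClusterConfig:
    num_clusters: int     # how many compute-unit clusters share cache slices
    total_cus: int = 80   # assumed CU count, purely for illustration

    def cus_per_cluster(self) -> int:
        return self.total_cus // self.num_clusters

def maybe_reconfigure(cfg: ClusterConfig, miss_rate: float,
                      miss_threshold: float = 0.30) -> ClusterConfig:
    """Re-cluster into fewer clusters when the miss rate exceeds the threshold."""
    if miss_rate > miss_threshold and cfg.num_clusters > 1:
        return ClusterConfig(num_clusters=cfg.num_clusters // 2,
                             total_cus=cfg.total_cus)
    return cfg

# Example: start with 8 clusters of 10 CUs; a 40% miss rate triggers a merge.
cfg = ClusterConfig(num_clusters=8)
cfg = maybe_reconfigure(cfg, miss_rate=0.40)
print(cfg.num_clusters, cfg.cus_per_cluster())   # -> 4 20
```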
 

Glo.

Diamond Member
Apr 25, 2015
5,711
4,559
136
Others speculate that this larger cache could make sense because it would be a set up for RDNA3 MCM as well as lessen inter-core performance hits.
With many people expecting RDNA2 to have a large increase in compute power you would think that a wider memory system is needed in order to provide more data to the shaders.
Where has this been discussed before?
 

DJinPrime

Member
Sep 9, 2020
87
89
51
Wait, so you are trying to say that the Kepler -> Maxwell transition on the same process (TSMC 28nm) was some kind of miracle?

In the early RDNA 2 design stages, Suzanne Plummer's "Zen team" was invited to RTG and got involved in optimization work on future uarchs. You can see their results in the semi-custom XSX/PS5 SoCs and in the Renoir Vega GPU power optimizations.

Are you really trying to tell us that AMD is not able to create/design anything new? Well, try checking the patents submitted by AMD over the last 4 years ;)
Lol, I didn't say anything like what you're accusing me of saying. I don't know how you got that from anything I said...
 

CastleBravo

Member
Dec 6, 2019
119
271
96
I guess you all are still missing my point. If Navi scales really well, then AMD had no reason to just release the 5700 and 5700 XT over a year ago. In development, the engineers should know how well their design scales. Since the 5700 was such a small chip, and if scaling was not an issue, then just making a bigger chip would not be that much effort. Why would you limit yourself to 2070-level performance? Do you not want to be known as a high-performance company and sell chips? Why wait 1.5 years to make money when you can make money now?

This is just a guess, but I see two potential factors for that decision:

1. They were making huge margins on each wafer of Zen2 dies, and RDNA1 was made on the same process with potentially limited capacity, so building a Big Navi wasn't a sound business decision. If Zen3 is being made on N7+, and RDNA2 on N7P, this might not be the case for the RX 6000 series.

2. NV's (non-super) RTX 2070 ended up being weaker than the smallish Navi 10 die, which may have originally been designed for the 5600 XT, so AMD decided to crank the clocks up and make lots of margin selling them for $400+ as the 5700 XT.
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,249
136
I guess you all are still missing my point. If Navi scales really well, then AMD had no reason to just release the 5700 and 5700 XT over a year ago.

You're the one that seems confused...

Unless your intent is to argue it's best to just take the wait-and-see approach. Trying to convince others that you know how to run AMD better than Lisa Su is... laughable, to say the least.
 

Timorous

Golden Member
Oct 27, 2008
1,616
2,780
136
I guess you all are still missing my point. If Navi scales really well, then AMD had no reason to just release the 5700 and 5700 XT over a year ago. In development, the engineers should know how well their design scales. Since the 5700 was such a small chip, and if scaling was not an issue, then just making a bigger chip would not be that much effort. Why would you limit yourself to 2070-level performance? Do you not want to be known as a high-performance company and sell chips? Why wait 1.5 years to make money when you can make money now? Things change, like the competition releasing a $700 card that is faster than the current $1200 card. You never sandbag yourself.
I'm a software developer, so I can only explain from that point of view. I have an app that processes multiple requests, and things seem to work really well when the number of concurrent requests is under 10. Performance-wise, it seems to scale really well. But as soon as I am in a scenario where there are 11 concurrent requests, I start seeing issues: locking timeouts, data being overwritten by parallel processes, memory heap issues, performance issues. So my program is only scalable up to a certain point. It's a leap to assume big Navi scales the same way as the 5500 to the 5700.
If Big Navi does end up performing better than the 3080, it's because the engineers had to do a lot of awesome work to get around the problems of RDNA1, just like how I would have to redesign my program to be able to scale above 10.

It is really easy, actually. To compete with the 2080 Ti, AMD would have needed to produce a 500mm^2 RDNA GPU. This would not have had ray tracing or DLSS but would have competed at native resolution. The max they could sell such a card for would be about $1000, and they would have to sell the chips for far less than that to board partners so those partners could make a profit after adding in the PCB and memory.

OTOH, in 500mm^2 of N7 wafer space you can fit 6 Zen 2 chiplets with space to spare, and with better yields, because 6x 75mm^2 yields far better than 1x 500mm^2. Those 6 chiplets make 3 3950Xs, which they can sell for $750 a pop, i.e. $2250. Even with the extra cost of packaging the IO die, $2250 is far more profit than a single 'Big Navi' RDNA card.
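
A quick sketch of that yield argument using the simple Poisson yield model Y = exp(-A * D0); the defect density is an assumed round number and edge losses are ignored, so treat the outputs as illustrative only:

```python
import math

# Sketch of the chiplet-vs-monolithic yield argument with a Poisson yield model.
# Defect density and the "no edge loss" wafer area are illustrative assumptions.

WAFER_AREA_MM2 = math.pi * (300 / 2) ** 2   # ~70,686 mm^2 for a 300mm wafer
D0_PER_MM2 = 0.001                          # ~0.1 defects/cm^2, assumed

def good_dies_per_wafer(die_area_mm2: float) -> float:
    candidates = WAFER_AREA_MM2 / die_area_mm2            # crude, ignores edges
    return candidates * math.exp(-die_area_mm2 * D0_PER_MM2)

print(f"~{good_dies_per_wafer(500):.0f} good 500mm^2 GPUs per wafer")
print(f"~{good_dies_per_wafer(75):.0f} good 75mm^2 chiplets per wafer "
      f"(~{good_dies_per_wafer(75) / 2:.0f} two-chiplet 3950X-class CPUs)")
```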

This also applies somewhat to RDNA2, which is why I think going denser, like Renoir or GA100, makes sense: a 21B-transistor chip would be 350mm^2, which is far more palatable if you can compete with the 3080 and sell it for ~$700. If AMD went bigger, that would mean using more wafers to have enough supply, reducing the number of Zen 3 chips they can produce.
 

Glo.

Diamond Member
Apr 25, 2015
5,711
4,559
136
I think you diehards on the red side are setting yourselves up for disappointment again. A couple of things don't make sense to me about expecting Navi to scale linearly.
1) 5700 was a relatively small chip and performed well and scaled compared to the lower chip. So, why didn't AMD release something bigger?
2) They're already on the 7nm, so it's not even like they're moving up a node and have to do everything new.
3) Does AMD hate money? Why sit on producing a 2080ti competitor all these months (year?) when the price was so high? If they can make a 5800xt that's around the 2080ti, do they think there will be no demand if it's $1000 a year ago? 6 months ago?
4) New versions of Ryzen CPUs seem to come out all the time, so it's not like the company has a problem with customers doing frequent upgrades.
The answer to ALL of that above is very simple.

AMD is on a yearly cadence. Designing a big GPU with an architecture that is going to be retired just 12 months later is uneconomical from a design-cost perspective. If you are going to retire your architecture just 12 months later, it's better to milk the cheapest possible designs as much as possible.

Which means: smallest possible designs.

I hope this ends this ridiculous discussion.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,228
5,228
136
It is really easy, actually. To compete with the 2080 Ti, AMD would have needed to produce a 500mm^2 RDNA GPU. This would not have had ray tracing or DLSS but would have competed at native resolution. The max they could sell such a card for would be about $1000, and they would have to sell the chips for far less than that to board partners so those partners could make a profit after adding in the PCB and memory.

OTOH, in 500mm^2 of N7 wafer space you can fit 6 Zen 2 chiplets with space to spare, and with better yields, because 6x 75mm^2 yields far better than 1x 500mm^2. Those 6 chiplets make 3 3950Xs, which they can sell for $750 a pop, i.e. $2250. Even with the extra cost of packaging the IO die, $2250 is far more profit than a single 'Big Navi' RDNA card.

This also applies somewhat to RDNA2, which is why I think going denser, like Renoir or GA100, makes sense: a 21B-transistor chip would be 350mm^2, which is far more palatable if you can compete with the 3080 and sell it for ~$700. If AMD went bigger, that would mean using more wafers to have enough supply, reducing the number of Zen 3 chips they can produce.

As before, I don't think any of this is correct, because it is only worse today: they can probably only get $700 max for the Big Navi GPUs, and new Zen 3 parts are soon going to come online, making CPUs again much more profitable per wafer.

The real reason is monster chips are low volume and need long runs to recoup costs. A monster chip without Ray Tracing would not have a long run with any kind of premium.

Big Navi had to wait for ray tracing and other advanced features to be ready, so the chip can have a longer selling life at premium pricing to recoup costs.
 

leoneazzurro

Senior member
Jul 26, 2016
930
1,465
136
Quite simply, RDNA1 was limited by thermals. 1.8GHz at 220W means that to have a card that could beat the 2080 or 2080 Super, they would have had to go over 300W, and it would anyway only have traded blows with the 2080 Ti, which offered better features at lower power consumption. So they would have sold few, and lost money developing a big chip that they could not even have used for the professional market, as they had already decided to use CDNA for that purpose. So having a "Big Navi" on RDNA1 would have been a big mistake. Now they are claiming +50% perf/W, the competition went for power-hungry cards, and they added ray tracing, so they can build a chip that in theory can compete.
 

Timorous

Golden Member
Oct 27, 2008
1,616
2,780
136
As before, I don't think any of this is correct, because it is only worse today: they can probably only get $700 max for the Big Navi GPUs, and new Zen 3 parts are soon going to come online, making CPUs again much more profitable per wafer.

The real reason is monster chips are low volume and need long runs to recoup costs. A monster chip without Ray Tracing would not have a long run with any kind of premium.

Big Navi had to wait for ray tracing and other advanced features to be ready, so the chip can have a longer selling life at premium pricing to recoup costs.

With Renoir and GA100 at >60M xtors/mm^2 while also offering good clocks and good power profiles, I do not see why an 80CU RDNA 2 part has to be 500mm^2 anymore. If it is 500mm^2 and only competes with the 3080, that is a total failure when there were other options available to AMD to be far more wafer-efficient with their product stack.
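
The arithmetic behind those die-size estimates is simple enough to show directly. The 21B-transistor figure is the assumption from the earlier post, and the ~42M xtors/mm^2 value is the XSX density that comes up later in the thread:

```python
# Die-size arithmetic behind the estimates in the thread: an assumed
# 21B-transistor RDNA2 design at Renoir/GA100-like density versus the
# XSX-like density raised later in the discussion.
transistors = 21e9
for label, xtors_per_mm2 in [("~60M xtors/mm^2 (Renoir/GA100-like)", 60e6),
                             ("~42M xtors/mm^2 (XSX-like)", 42e6)]:
    print(f"{label}: {transistors / xtors_per_mm2:.0f} mm^2")
# -> 350 mm^2 and 500 mm^2 respectively
```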
 

uzzi38

Platinum Member
Oct 16, 2019
2,635
5,983
146
$$$ is the reason. The 5700 is a small chip, so just add 50% more cores if things are scaling so wonderfully. It's pretty efficient if the claim about scaling is true. You're making way more money with a bigger chip. And more importantly, you gain mind share and increase the stock price. So, again, $$$$$$$$$$$$.

I take it you don't know a huge amount about the time, engineering effort and cash required to make a change like that? Sorry, but this is a really poor argument. It's not at all as simple as "just add more cores". You're effectively asking why AMD didn't tape out a larger die on N7, which is no small feat.

Who's talking about uArch improvement? Since everyone seems to shit on RT, what new tech did AMD need to add? Just make it faster, since you have great scale. Just add CUs.
?

1. I've never shit on RTRT itself. I think it's the future. But the key word is future. We're definitely a while away from the tech becoming pervasive. Also, completely unrelated to the conversation.

2. See above on "Just add CUs"

Super was not a waste of time, I'm sure most 2000 series sold are the supers.

My point there was regarding the horrible idea of a SUPER refresh for Ampere but on TSMC N7 instead. I thought that was pretty clear.

You don't think having a 2080 Super competitor is valuable? People keep bringing up how well Navi scales, so why was it hard to release a bigger Navi? Because it doesn't scale as well as you think, or management is really stupid.

Given the time, cost and engineering effort required, no. They made the better choice by focusing more time on RDNA2 instead.

What?? We've gone through Ryzen, Ryzen+, Ryzen 2, and now Ryzen 3 in less than 4 years. Are you kidding? They're not the same die. Each release has some really nice IPC improvements.

And we've also gone through the second generation of Polaris, Vega (12nm Polaris and 7nm Vega also technically slot in here, but w/e), and then RDNA1 in that same timeframe. Each of them made notable improvements in their own way. Polaris was a notable efficiency bump (2.7x perf/W memes aside), Vega scaled up to 64 CUs and brought HBM2, Vega on 7nm was AMD's first die on the node and made decent enough perf/W improvements (about the same as Ampere did over Turing anyway lmao, ever so slightly less), and RDNA1 was a huge change to the uArch as a whole.

Exactly, that means that, technology-wise, it's not as simple as many of the claims here make it out to be. If it were that simple and things scaled so wonderfully, then the only logical conclusion is that their marketing and management are ultra stupid.

I never said it was simple. I said it was happening. :)

The only "ultra stupid" thing is the fact that I even bothered to reply to a comment like this.
 

Mopetar

Diamond Member
Jan 31, 2011
7,842
5,997
136
16 and 12... 24GB out the window on N21

Edit:
N21 w/80CU on 256bus and rumor of large cache with 16GB and rumor of $549 / $599 pricing (think this tells the performance level if true)
N22 possibly w/60CU and 12GB

Pricing rumors are the ones I trust least of all because, unlike almost every other decision about a product, pricing is the easiest thing to change at the last possible moment.

Anyone on a design team knows how many shaders a die will have as soon as the design is sent off to the fab. They might not know exactly how a specific product will be cut down until later, and the final price need not be shared with engineers at all.

I'd like to think most engineers are smart enough to realize that any information they get about pricing could be made up in such a way to try to find leaks as well.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,228
5,228
136
With Renoir and GA100 at > 60M xtors / mm^2 and also offering good clocks and good power profiles I do not see why an 80CU RDNA 2 part has to be 500mm^2 anymore. If is 500mm^2 and only competes with the 3080 that is a total failure when there were other options available to AMD to be far more wafer efficient with their product stack.

If you want to believe rumors, then it's a ~500mm2 part.

Also density is greatly affected by what you are designing.

The closest thing to a current AMD GPU design is the new XB Series X, at 42Mt/mm^2, which also leads to a ~500mm^2 part.

I suppose you can cherry pick the option that best fits your narrative, but no one knows how this will turn out.

I would say we don't have strong evidence either way, but if you want an answer right now, I would say the evidence, sketchy as it is, leans more towards ~500mm^2.

Regardless of the exact size, it will still be a big, expensive chip, it has still had its price ceiling lowered by Ampere's release, and it's still going to be more profitable to use wafers for Zen 3. So the exact size doesn't change my argument.

I hold to my original reason: Big Navi needed Ray Tracing and other advanced features.
 
  • Like
Reactions: Konan

Timorous

Golden Member
Oct 27, 2008
1,616
2,780
136
If you want to believe rumors, then it's a ~500mm2 part.

Also density is greatly affected by what you are designing.

The closest thing to a current AMD GPU design is the new XB Series X, at 42Mt/mm^2, which also leads to a ~500mm^2 part.

I suppose you can cherry pick the option that best fits your narrative, but no one knows how this will turn out.

I would say we don't have strong evidence either way, but if you want an answer right now, I would say the evidence, sketchy as it is, leans more towards ~500mm^2.

Regardless of the exact size, it will still be a big, expensive chip, it has still had its price ceiling lowered by Ampere's release, and it's still going to be more profitable to use wafers for Zen 3. So the exact size doesn't change my argument.

I hold to my original reason: Big Navi needed Ray Tracing and other advanced features.

It is not about cherry picking or rumours or leaks. It is about what is a smart business decision and designing a 3080 competitor using a 350mm^2 die seems far more intelligent and sensible than designing a 3080 competitor around a 500mm^2 die.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,228
5,228
136
It is not about cherry picking or rumours or leaks. It is about what is a smart business decision and designing a 3080 competitor using a 350mm^2 die seems far more intelligent and sensible than designing a 3080 competitor around a 500mm^2 die.

There is no indication they can actually do that, unless you cherry pick and avoid the most obvious and applicable comparison (XB Series X).
 

Konan

Senior member
Jul 28, 2017
360
291
106
Pricing rumors are the ones I trust least of all because, unlike almost every other decision about a product, pricing is the easiest thing to change at the last possible moment.

Anyone on a design team knows how many shaders a die will have as soon as the design is sent off to the fab. They might not know exactly how a specific product will be cut down until later, and the final price need not be shared with engineers at all.

I'd like to think most engineers are smart enough to realize that any information they get about pricing could be made up in such a way to try to find leaks as well.

In this case the $$ quotes originated from Coreteks' Tier 1 partners, so a level up from engineering guesses :)
See the video I posted a couple of pages back; he details where he gets his info from.

 
  • Like
Reactions: Mopetar

Konan

Senior member
Jul 28, 2017
360
291
106
Coreteks being the guy who said there was a "Traversal Co-processor" on the back of Ampere cards.
He did clearly state that was a speculation piece on his part and not tied to any of his sources. It's a little bit like us debating the patent at the top of this page. In this particular case, the post above clearly says where he got his pricing information.
 

Timorous

Golden Member
Oct 27, 2008
1,616
2,780
136
There is no indication they can actually do that, unless you cherry pick and avoid the most obvious and applicable comparison (XB Series X).

GA100 and Renoir both show TSMC N7 can increase transistor density over what AMD have used so far.

Renoir's CPU clocks just as well as Matisse, and its GPU clocks higher than any other GPU made by AMD.

Also, we do not know the density of the PS5; for all we know, it could be similar to Renoir.
 
  • Like
Reactions: Tlh97

DJinPrime

Member
Sep 9, 2020
87
89
51
The claim I was questioning was that, based on the CU count of the Xbox and on how well RDNA1 scaled from the 5500 to the 5700, PC big Navi will have no problem going beyond 60 CUs and performance will scale great at 80 CUs. Some of you guys are even expecting 90%-100% scaling.

What I wanted to point out was that AMD did not go bigger than the 5700, and I can't believe it's a marketing decision, because that's just too stupid. Being successful is all about being agile. If it is true that AMD could have scaled up something bigger than the 5700 and decided not to because they're making great margins on the 5700, or it's not in their release cycle, then no wonder they're behind NV. NV sure has no problem pumping out Super/Ti cards anytime they deem it necessary. Do you want to keep doing the same thing, or try to be more like the competition that has a big lead on you? If true, that turned out to be a really bad bet, because now 2080 Ti-level performance only commands a $500 price. And as for moving the goalposts by saying, well, it's going to take a lot of resources to design a bigger Navi 10: if that's the case, then your architecture wasn't all that scalable. Just like my software example, I can't call my application scalable if I need tons of redesign and rework to get things working at higher volume.

As an engineer myself who works in the financial industry, I just can't believe that AMD was holding back. I'm not saying big Navi can't perform great. But I don't think it's logical to assume things will scale well based on RDNA1, because they wouldn't have stopped at the 5700. If you think that's a good business decision, then we'll just have to agree to disagree. If you think Big Navi will be better than the 3080 just because you think so, well, I can't argue there. Hope you're right.
 

Timorous

Golden Member
Oct 27, 2008
1,616
2,780
136
The claim I was questioning was that, based on the CU count of the Xbox and on how well RDNA1 scaled from the 5500 to the 5700, PC big Navi will have no problem going beyond 60 CUs and performance will scale great at 80 CUs. Some of you guys are even expecting 90%-100% scaling.

What I wanted to point out was that AMD did not go bigger than the 5700, and I can't believe it's a marketing decision, because that's just too stupid. Being successful is all about being agile. If it is true that AMD could have scaled up something bigger than the 5700 and decided not to because they're making great margins on the 5700, or it's not in their release cycle, then no wonder they're behind NV. NV sure has no problem pumping out Super/Ti cards anytime they deem it necessary. Do you want to keep doing the same thing, or try to be more like the competition that has a big lead on you? If true, that turned out to be a really bad bet, because now 2080 Ti-level performance only commands a $500 price. And as for moving the goalposts by saying, well, it's going to take a lot of resources to design a bigger Navi 10: if that's the case, then your architecture wasn't all that scalable. Just like my software example, I can't call my application scalable if I need tons of redesign and rework to get things working at higher volume.

As an engineer myself who works in the financial industry, I just can't believe that AMD was holding back. I'm not saying big Navi can't perform great. But I don't think it's logical to assume things will scale well based on RDNA1, because they wouldn't have stopped at the 5700. If you think that's a good business decision, then we'll just have to agree to disagree. If you think Big Navi will be better than the 3080 just because you think so, well, I can't argue there. Hope you're right.

AMD is only going to sell RDNA dGPUs to consumers. OTOH, Renoir and Zen 2 are sold to OEMs and are used in server deployments.

Proving they can supply the OEMs with enough chips to make producing AMD product lines worthwhile is far more profitable and important long term than competing with the 2080Ti.

Now that they are gaining trust with OEMs and have funding to buy more wafers off of TSMC they are in a position to address the halo tier of the GPU market without impacting the CPU side of the business as much.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
Totally agree.

Dr. Su has Zen 3 coming out first to show how far Ryzen has come. This also buys time for big Navi to get some polish. She is building out the stack. Let Nvidia have its day in the sun. I suspect Big Navi will beat the RTX 3070 but fall a bit short of the RTX 3080 (which, as I write this, is sold out EVERYWHERE).

I'm looking to stay with AMD to upgrade my Radeon VII on my 3900X rig.

I'll wait till RDNA2 drops and see how much difference there is in game play.
Not really. All leaks point to them beating the 3080 easily.
Don't know if this was mentioned, but I saw that the PS5's power supply is rated 350 W, 340 W for the Digital Edition. Series X is 310 W I think.

The PSU for the Xbox Series X has been confirmed to be the same as the Xbox One.

I guess you all are still missing my point. If Navi scales really well, then AMD had no reason to just release the 5700 and 5700 XT over a year ago. In development, the engineers should know how well their design scales. Since the 5700 was such a small chip, and if scaling was not an issue, then just making a bigger chip would not be that much effort. Why would you limit yourself to 2070-level performance? Do you not want to be known as a high-performance company and sell chips? Why wait 1.5 years to make money when you can make money now? Things change, like the competition releasing a $700 card that is faster than the current $1200 card. You never sandbag yourself.
I'm a software developer, so I can only explain from that point of view. I have an app that processes multiple requests, and things seem to work really well when the number of concurrent requests is under 10. Performance-wise, it seems to scale really well. But as soon as I am in a scenario where there are 11 concurrent requests, I start seeing issues: locking timeouts, data being overwritten by parallel processes, memory heap issues, performance issues. So my program is only scalable up to a certain point. It's a leap to assume big Navi scales the same way as the 5500 to the 5700.
If Big Navi does end up performing better than the 3080, it's because the engineers had to do a lot of awesome work to get around the problems of RDNA1, just like how I would have to redesign my program to be able to scale above 10.

AMD didn’t release a bigger GPU for 3 reasons:

1) RTG was broke.
2) The RX 5700 XT already had a high TDP and AMD is done playing that game.
3) Raja was given the boot and RTG was reorganized.

Navi 2X is the first real attempt at a first-class gaming product in years.
 

SpaceBeer

Senior member
Apr 2, 2016
307
100
116
From Slimbook

Due to the large worldwide success of the AMD 4000 series processors, the manufacturer is unable to keep up with the current demand. Learn more here.

If you buy this product, you will have to wait until the new batch of units gets shipped to us in the 4th quarter of this year. Chances are that it will be October or November at the latest, but we will update this information with more accurate dates once they assign us a batch of AMD processors.

So I would say there was/is not enough capacity to make Navi 2x. I think we'll see a paper launch on the 28th.
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,249
136
What I wanted to point out was that AMD did not go bigger than the 5700, and I can't believe it's a marketing decision, because that's just too stupid.

Navi was by design targeting the mid-range performance level. Simple economics, as greater sales are to be had in the lower to mid-range vs. the higher-end market. Putting Big Navi on the back burner doesn't mean they couldn't have made it earlier if it had made sense at the time to do so. It was in AMD's best interest to use their resources and wafer supply for their other offerings and commitments at that time.