Question Speculation: RDNA2 + CDNA Architectures thread

Page 36

uzzi38

Platinum Member
Oct 16, 2019
2,690
6,345
146
All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have backed up. Even so, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

Konan

Senior member
Jul 28, 2017
360
291
106
Nobody's doubting that SS is cheaper, but they are questioning your 30% cheaper and wondering why you would even make up a number if you don't know.

You are right, I do not know. It is speculation I have read; I've even seen claims of up to half the cost. For clarification, that is why I said "apparently", though I could have added that these are unproven assumptions.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
The way I see it, in the vast majority of games we are only going to see RT that is portable to the consoles.

The Series X and PS5 will set the performance threshold for turning the RT toggle on: 4K at 30 fps minimum.

As long as the midrange RDNA2 PC GPUs have the same or greater ability to accelerate RT, I can't see owners of those GPUs really missing out on too much?

BUT :) Out of curiosity, what does your math work out to and how did you plug in the numbers?
2-4 RT ops per CU per cycle, according to the Microsoft picture. The low count for the 80 CU Big Navi is then 80 x 2 = 160 ops per clock cycle, times ~2 GHz, which would give us 320 gigarays per second. Nvidia yesterday quoted their RT perf as 56 TF of RT performance, almost double the CUDA perf of 30 TF. Hope AMD's RT cores have some really magic fairy dust in them..
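A quick sketch of the arithmetic above. Note that the 80 CU count, the 2-4 ops per CU per cycle, and the ~2 GHz clock are all the post's speculated figures, not confirmed specs:

```python
# Back-of-envelope peak intersection throughput, using the post's
# assumed figures (80 CUs, 2-4 RT ops/CU/cycle, ~2 GHz clock).
def rt_ops_per_second(cus, ops_per_cu_per_cycle, clock_hz):
    """Peak throughput = CUs * ops per CU per cycle * clock rate."""
    return cus * ops_per_cu_per_cycle * clock_hz

low  = rt_ops_per_second(80, 2, 2e9)   # 320e9 ops/s, the "320 gigarays/s"
high = rt_ops_per_second(80, 4, 2e9)   # 640e9 ops/s upper bound
```

Whether one of these "ops" is comparable to one of Nvidia's quoted "RT TFLOPs" is exactly the open question in the posts that follow.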
 

MrTeal

Diamond Member
Dec 7, 2003
3,584
1,743
136
I'm going with it's all part of Lisa's plan.

Allow Microsoft and Sony to tease console features, but not pricing and availability.

Let Nvidia show its cards and pricing with availability.

Allow Sony and Microsoft to reveal pricing and start pre-orders before Nvidia's products are available.
While I am by no means privy to the negotiations between AMD and MS/Sony, that seems to be giving a huge amount of clout to Lisa Su that I am not sure she has. Sony and MS's lack of pricing announcements is almost certainly about their competition with each other as opposed to them helping their chip supplier compete better in the desktop GPU market.
 

eek2121

Diamond Member
Aug 2, 2005
3,027
4,213
136
Exactly.

Why has there been no console pricing revealed?
Why has there been no console pre-orders?
Why no leaks about Big Navi?
Why no leaks about Zen 3?

I'm going with it's all part of Lisa's plan.

Allow Microsoft and Sony to tease console features, but not pricing and availability.

Let Nvidia show its cards and pricing with availability.

Allow Sony and Microsoft to reveal pricing and start pre-orders before Nvidia's products are available.

Unleash some sponsored Big Navi rumors with teased performance just as Nvidia's products start to become available.

Announce Zen 3 with more Big Navi performance teasing.

Announce Big Navi pricing and availability w/bundle deals when purchasing a Zen 3 processor.

Seems like a good plan, at least as money's gonna be tight for a lot of people this holiday season.

Sounds like a sane approach to at least make a dent and stay in the internet chatter zone for a lot longer.

If mining turns out to be good on Ampere, I think I'd do some pump-and-dumping of coins to entice the miners into buying Nvidia's cards... Gotta give those silly guys who want AMD to compete for better deals on Nvidia products the finger somehow!
AMD has very little control over the pricing and availability of consoles.
Nobody's doubting that SS is cheaper, but they are questioning your 30% cheaper and wondering why you would even make up a number if you don't know.
I am doubting it. AMD likely gets a significant discount from TSMC.
The way I see it, in the vast majority of games we are only going to see RT that is portable to the consoles.

The Series X and PS5 will set the performance threshold for turning the RT toggle on: 4K at 30 fps minimum.

As long as the midrange RDNA2 PC GPUs have the same or greater ability to accelerate RT, I can't see owners of those GPUs really missing out on too much?

BUT :) Out of curiosity, what does your math work out to and how did you plug in the numbers?
He guessed, and he is wrong.

Need I remind everyone that RDNA2 will be on a different 7nm process from the consoles?
 

eek2121

Diamond Member
Aug 2, 2005
3,027
4,213
136
2-4 RT ops per CU per cycle, according to the Microsoft picture. The low count for the 80 CU Big Navi is then 80 x 2 = 160 ops per clock cycle, times ~2 GHz, which would give us 320 gigarays per second. Nvidia yesterday quoted their RT perf as 56 TF of RT performance, almost double the CUDA perf of 30 TF. Hope AMD's RT cores have some really magic fairy dust in them..

Also can’t be compared to consoles. They could double up on RT performance since they have a much higher TDP to work with (not to mention memory bandwidth).
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
Also can’t be compared to consoles. They could double up on RT performance since they have a much higher TDP to work with (not to mention memory bandwidth).
Double up? 640 gigarays? Still far from 56 tera rays... Yes, something must be off in the picture. Hope Lisa comes with a tease soon.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Double up? 640 gigarays? Still far from 56 tera rays... Yes, something must be off in the picture. Hope Lisa comes with a tease soon.

This has been gone through before, back when Turing came out. The way that nVidia measures a ray for that calculation has no basis in reality or in how RT is used in games. So you are best off ignoring it and just going off in-game performance.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,688
1,222
136
Isn't it:
Pascal: ~1 gigaray/s
Turing: ~10 gigaray/s
Ampere: ~20 gigaray/s

52 CUs of RDNA2(Anaconda/Arden(17h 80h-8Fh APU)) => 380 Gigaray-box/s or 95 Gigaray-triangle/s

With Xbox Series S(Lockhart/Mero(17h 98h-9Fh APU)) => 124 Gigaray-box/s or 31 Gigaray-triangle/s
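The quoted box/triangle rates are consistent with a per-CU-per-cycle test rate times the console clock. A rough reconstruction, assuming 4 box tests or 1 triangle test per CU per cycle and the publicly stated console clocks (1.825 GHz for Series X, 1.565 GHz for Series S); the small mismatch versus the quoted 124 is rounding:

```python
# Reconstructing the gigaray-box / gigaray-triangle figures above,
# assuming 4 box tests or 1 triangle test per CU per cycle.
def ray_rate(cus, clock_hz, tests_per_cu_per_cycle):
    """Peak intersection tests per second for a given test type."""
    return cus * clock_hz * tests_per_cu_per_cycle

# Xbox Series X: 52 CUs @ 1.825 GHz
xsx_box = ray_rate(52, 1.825e9, 4)   # ~380e9 box tests/s
xsx_tri = ray_rate(52, 1.825e9, 1)   # ~95e9 triangle tests/s

# Xbox Series S: 20 CUs @ 1.565 GHz
xss_box = ray_rate(20, 1.565e9, 4)   # ~125e9 box tests/s (quoted as 124)
xss_tri = ray_rate(20, 1.565e9, 1)   # ~31e9 triangle tests/s
```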
 

eek2121

Diamond Member
Aug 2, 2005
3,027
4,213
136
The two architectures are not simply comparable. And tera rays? Nvidia would be really happy if their architecture were capable of tera rays; they only said that "they have way more teraflops dedicated to ray tracing".

This. Also, based on what I have seen thus far, RT performance is around 70% faster than Turing.

The only actual negative AMD is likely to have is the lack of something similar to DLSS, although they may have a solution that hasn’t been shown yet.
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,249
136
While I am by no means privy to the negotiations between AMD and MS/Sony, that seems to be giving a huge amount of clout to Lisa Su that I am not sure she has. Sony and MS's lack of pricing announcements is almost certainly about their competition with each other as opposed to them helping their chip supplier compete better in the desktop GPU market.

I was poking fun at all the silly guys/gals spreading doom and gloom at AMD. They know who they are!

Nobody knows what's up currently, so my speculation is just as good as the next guy's/gal's... and the bots'!

I guess we'll see how it pans out soon enough.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
Isn't it:
Pascal: ~1 gigaray/s
Turing: ~10 gigaray/s
Ampere: ~20 gigaray/s

52 CUs of RDNA2(Anaconda/Arden(17h 80h-8Fh APU)) => 380 Gigaray-box/s or 95 Gigaray-triangle/s

With Xbox Series S(Lockhart/Mero(17h 98h-9Fh APU)) => 124 Gigaray-box/s or 31 Gigaray-triangle/s
Was just comparing to this:

[Image: Screenshot_20200902-221823.png]
 

Hitman928

Diamond Member
Apr 15, 2012
5,526
8,579
136
Was just comparing to this:

View attachment 29100

If you notice, the same row has a question mark for the 2080 Ti. So how are Ray Performance FLOPs measured or calculated? I honestly have no idea, and it seems Ryan doesn't know either. It is certainly not rays/s, and it seems to be some Nvidia-unique calculation that they invented as a way to have a quantitative yet ambiguous way of comparing Ampere chips.
 

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
View attachment 29097
2-4 RT ops per CU x ~2 GHz is far from Nvidia's claimed 50+ TF in RT performance (320 GF); it's off by an order of magnitude. How are RDNA2 and Big Navi going to have a chance here? Starting to get a bit worried; hopefully I have missed something in my calculations.
You're correct.

320 GigaFists (320 Billion fists - probably for hitting someone) is indeed a higher number than 2-4 sleeping pills, however, both can get you to sleep.

How do you know anything about how AMD's hardware unit compares to whatever number NVIDIA decides to conjure up? I mean, if you know from somewhere that it's a magnitude lower/weaker, I am genuinely curious. If you could provide a link, or just the part of your thought process where you were able to actually quantify both vendors' metrics, that'd be appreciated.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
I'm confused: how do they translate rays to flops?
Ok, "rays" is my expression; floating point operations per second is quite universal. If AMD's RT capability is 2-4 RT ops per CU, then we have a theoretical number in flops: 320-640 billion floating point operations per second, RT-wise. Nvidia claims to have twice as many RT resources as CUDA cores, just going by the official information.

But looking at that picture again made me wonder. Bear with me, guys! :D

It also tells us the Xbox Series X does 256 32-bit FP ops per CU per cycle, which for an 80 CU part translates to 256 x 80 x 2 (GHz) = 40.9 TF in pure shader perf. Following that lead, we can see it issues 7 instructions per clock per CU. The RT unit mentioned was doing up to 4, but we need some texturing as well, aye?! So let's say 2 to count conservatively. That leaves us with ~0.4x shader performance. Oooor, ~16 TF of RT performance.

Now as you have pointed out, there is no way we can compare apples to apples here, but the funny thing is I'm starting to see why Nvidia is going all out with everything this gen. :)
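For what it's worth, the arithmetic in the post above works out as stated. Here is the same back-of-envelope calculation, using the poster's assumed figures (256 FP32 ops per CU per cycle, 80 CUs, 2 GHz, and the ~0.4x RT ratio), none of which are confirmed specs:

```python
# Reproducing the poster's arithmetic exactly as stated; the per-CU
# FP32 figure and the 0.4x RT ratio are their readings of the slide,
# not confirmed specs.
flops_per_cu_per_cycle = 256   # assumed FP32 ops per CU per cycle
cus = 80                       # speculated Big Navi CU count
clock_hz = 2e9                 # assumed ~2 GHz clock

shader_tf = flops_per_cu_per_cycle * cus * clock_hz / 1e12  # ~41.0 TF
rt_tf = shader_tf * 0.4                                     # ~16.4 TF
```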
 
Last edited:

uzzi38

Platinum Member
Oct 16, 2019
2,690
6,345
146
Of course Samsung is cheaper per wafer, but have you taken into account yields, die sizes, and additional costs in terms of memory selection and board complexity?

Comparing TSMC and SS for Nvidia's use case is much simpler. The first two, alongside price per wafer, are all that matter.

Comparing costs for AMD vs Nvidia? Whole different story.

Without knowing several things, such as the margins both are making, the difference in deals they're getting, 8LPP yields with such large dies, you name it, there's not a chance in hell we can make a decision on which is cheaper to produce.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
You're correct.

320 GigaFists (320 Billion fists - probably for hitting someone) is indeed a higher number than 2-4 sleeping pills, however, both can get you to sleep.

How do you know anything about how AMD's hardware unit compares to whatever number NVIDIA decides to conjure up? I mean, if you know from somewhere that it's a magnitude lower/weaker, I am genuinely curious. If you could provide a link, or just the part of your thought process where you were able to actually quantify both vendors' metrics, that'd be appreciated.
I'm just doing my best to interpret the official documents listing official specs. That's all. A floating point operation is a floating point operation is a... With that said, we could have great differences in efficiency. An Nvidia flop could easily translate into 0.7 AMD flops, or vice versa, in real gaming scenarios due to a lot of factors. Remember the release of RDNA1? The TF count was actually lower than the previous-gen architecture, GCN, yet it still performed much better in gaming. RDNA is a highly developed platform for image rendering, which is a good thing.
I'm just a layman AnandTech member pondering in the absence of real leaks. But honestly, anyone with further thoughts? : )
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,848
3,387
136
Ok, "rays" is my expression; floating point operations per second is quite universal. If AMD's RT capability is 2-4 RT ops per CU, then we have a theoretical number in flops: 320-640 billion floating point operations per second, RT-wise. Nvidia claims to have twice as many RT resources as CUDA cores, just going by the official information.

But looking at that picture again made me wonder. Bear with me, guys! :D

It also tells us the Xbox Series X does 256 32-bit FP ops per CU per cycle, which for an 80 CU part translates to 256 x 80 x 2 (GHz) = 40.9 TF in pure shader perf. Following that lead, we can see it issues 7 instructions per clock per CU. The RT unit mentioned was doing up to 4, but we need some texturing as well, aye?! So let's say 2 to count conservatively. That leaves us with ~0.4x shader performance. Oooor, ~16 TF of RT performance.

Now as you have pointed out, there is no way we can compare apples to apples here, but the funny thing is I'm starting to see why Nvidia is going all out with everything this gen. :)

If the ray tracing unit only did 1 simple FP op a cycle there would be no point in it; you would just do those calculations on an ALU. So 1 RT op dispatched from the scheduler will have X amount of flops, and X is probably variable depending on complexity and test type.
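To illustrate that point: even a single ray-box intersection ("one RT op") internally costs a dozen or so FLOPs. Below is a generic textbook slab test, not either vendor's actual hardware pipeline, just to show how one "op" decomposes into many FLOPs:

```python
# A textbook ray-AABB slab test. Per axis: 2 subtractions and
# 2 multiplications, plus min/max compares -- so one "box test"
# is already ~12 FLOPs of arithmetic before counting the compares.
def ray_aabb_hit(origin, inv_dir, box_min, box_max):
    """Return True if the ray hits the axis-aligned box in front of it."""
    tmin, tmax = -float("inf"), float("inf")
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t1 = (lo - o) * inv   # entry distance along this axis
        t2 = (hi - o) * inv   # exit distance along this axis
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmax >= max(tmin, 0.0)

# A ray from the origin toward (1,1,1) hits the box [1,2]^3:
hit = ray_aabb_hit((0, 0, 0), (1, 1, 1), (1, 1, 1), (2, 2, 2))
```

So a unit doing "4 box tests per cycle" is doing far more than 4 FLOPs per cycle, which is why raw ops/s and RT TFLOPS are not interchangeable units.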