Question Speculation: RDNA2 + CDNA Architectures thread

Page 36

uzzi38

Platinum Member
Oct 16, 2019
2,690
6,345
146
All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have backed up. Even so, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

Konan

Senior member
Jul 28, 2017
360
291
106
Nobody's doubting that SS is cheaper, but they are questioning your 30% cheaper and wondering why you would even make up a number if you don't know.

You are right, I do not know. It is speculation I have read; I've even seen claims of up to half the cost. For clarification, that is why I said "apparently", though I could have added that these are unproven assumptions.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
The way I see it, in the vast majority of games we are only going to see RT that is portable to the consoles.

The Series X and PS5 will set the performance threshold for turning the RT toggle on: 4K at 30 fps minimum.

As long as the midrange RDNA2 PC GPUs have the same or greater ability to accelerate RT, I can't see owners of those GPUs really missing out on too much?

BUT :) Out of curiosity, what does your math work out to and how did you plug in the numbers?
2-4 RT ops per CU per cycle, according to the Microsoft picture. The low count for the 80 CU Big Navi is then 80 x 2 = 160 ops per clock cycle, times ~2 GHz, which would give us 320 gigarays per second. Nvidia yesterday quoted their RT perf as 56 TF of RT performance, almost double the CUDA perf of 30 TF. Hope AMD's RT cores have some really magic fairy dust in them..
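A quick sketch of the arithmetic above. Note that the 80 CU count, the 2-4 ops per CU per cycle, and the ~2 GHz clock are all the post's speculated figures, not confirmed specs:

```python
# Back-of-envelope peak intersection throughput, using the post's
# assumed figures (80 CUs, 2-4 RT ops/CU/cycle, ~2 GHz clock).
def rt_ops_per_second(cus, ops_per_cu_per_cycle, clock_hz):
    """Peak throughput = CUs * ops per CU per cycle * clock rate."""
    return cus * ops_per_cu_per_cycle * clock_hz

low  = rt_ops_per_second(80, 2, 2e9)   # 320e9 ops/s, the "320 gigarays/s"
high = rt_ops_per_second(80, 4, 2e9)   # 640e9 ops/s upper bound
```

Whether one of these "ops" is comparable to one of Nvidia's quoted "RT TFLOPs" is exactly the open question in the posts that follow.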
 

MrTeal

Diamond Member
Dec 7, 2003
3,584
1,743
136
I'm going with it's all part of Lisa's plan.

Allow Microsoft and Sony to tease console features, but not pricing and availability.

Let Nvidia show its cards and pricing with availability.

Allow Sony and Microsoft to reveal pricing and start pre-orders before Nvidia's products are available.
While I am by no means privy to the negotiations between AMD and MS/Sony, that seems to be giving a huge amount of clout to Lisa Su that I am not sure she has. Sony and MS's lack of pricing announcements is almost certainly about their competition with each other as opposed to them helping their chip supplier compete better in the desktop GPU market.
 

eek2121

Diamond Member
Aug 2, 2005
3,027
4,213
136
Exactly.

Why has there been no console pricing revealed?
Why has there been no console pre-orders?
Why no leaks about Big Navi?
Why no leaks about Zen 3?

I'm going with it's all part of Lisa's plan.

Allow Microsoft and Sony to tease console features, but not pricing and availability.

Let Nvidia show its cards and pricing with availability.

Allow Sony and Microsoft to reveal pricing and start pre-orders before Nvidia's products are available.

Unleash some sponsored Big Navi rumors with teased performance just as Nvidia's products start to become available.

Announce Zen 3 with more Big Navi performance teasing.

Announce Big Navi pricing and availability w/bundle deals when purchasing a Zen 3 processor.

Seems like a good plan, at least as money's gonna be tight for a lot of people this holiday season.

Sounds like a sane approach to at least make a dent and stay in the internet chatter zone for a lot longer.

If mining turns out to be good on Ampere, I think I'd do some pump-and-dumping of coins to entice the miners into buying Nvidia's cards... Gotta give those silly guys who want AMD to compete for better deals on Nvidia products the finger somehow!
AMD has very little control over the pricing and availability of consoles.
Nobody's doubting that SS is cheaper, but they are questioning your 30% cheaper and wondering why you would even make up a number if you don't know.
I am doubting it. AMD likely gets a significant discount from TSMC.
The way I see it, in the vast majority of games we are only going to see RT that is portable to the consoles.

The Series X and PS5 will set the performance threshold for turning the RT toggle on: 4K at 30 fps minimum.

As long as the midrange RDNA2 PC GPUs have the same or greater ability to accelerate RT, I can't see owners of those GPUs really missing out on too much?

BUT :) Out of curiosity, what does your math work out to and how did you plug in the numbers?
He guessed, and he is wrong.

Need I remind everyone that RDNA2 will be on a different 7nm process from the consoles?
 

eek2121

Diamond Member
Aug 2, 2005
3,027
4,213
136
2-4 RT ops per CU per cycle, according to the Microsoft picture. The low count for the 80 CU Big Navi is then 80 x 2 = 160 ops per clock cycle, times ~2 GHz, which would give us 320 gigarays per second. Nvidia yesterday quoted their RT perf as 56 TF of RT performance, almost double the CUDA perf of 30 TF. Hope AMD's RT cores have some really magic fairy dust in them..

Also can’t be compared to consoles. They could double up on RT performance since they have a much higher TDP to work with (not to mention memory bandwidth).
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
Also can’t be compared to consoles. They could double up on RT performance since they have a much higher TDP to work with (not to mention memory bandwidth).
Double up? 640 gigarays? Still far from 56 tera rays... Yes, something must be off in the picture. Hope Lisa comes with a tease soon.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Double up? 640 gigarays? Still far from 56 tera rays... Yes, something must be off in the picture. Hope Lisa comes with a tease soon.

This has been gone through before, back when Turing came out. The way that nVidia measures a ray for that calculation has no basis in reality or in how RT is used in games. So you are best off ignoring it and just going off in-game performance.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,688
1,222
136
Isn't it:
Pascal: ~1 gigaray/s
Turing: ~10 gigaray/s
Ampere: ~20 gigaray/s

52 CUs of RDNA2(Anaconda/Arden(17h 80h-8Fh APU)) => 380 Gigaray-box/s or 95 Gigaray-triangle/s

With Xbox Series S(Lockhart/Mero(17h 98h-9Fh APU)) => 124 Gigaray-box/s or 31 Gigaray-triangle/s
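The quoted box/triangle rates are consistent with a per-CU-per-cycle test rate times the console clock. A rough reconstruction, assuming 4 box tests or 1 triangle test per CU per cycle and the publicly stated console clocks (1.825 GHz for Series X, 1.565 GHz for Series S); the small mismatch versus the quoted 124 is rounding:

```python
# Reconstructing the gigaray-box / gigaray-triangle figures above,
# assuming 4 box tests or 1 triangle test per CU per cycle.
def ray_rate(cus, clock_hz, tests_per_cu_per_cycle):
    """Peak intersection tests per second for a given test type."""
    return cus * clock_hz * tests_per_cu_per_cycle

# Xbox Series X: 52 CUs @ 1.825 GHz
xsx_box = ray_rate(52, 1.825e9, 4)   # ~380e9 box tests/s
xsx_tri = ray_rate(52, 1.825e9, 1)   # ~95e9 triangle tests/s

# Xbox Series S: 20 CUs @ 1.565 GHz
xss_box = ray_rate(20, 1.565e9, 4)   # ~125e9 box tests/s (quoted as 124)
xss_tri = ray_rate(20, 1.565e9, 1)   # ~31e9 triangle tests/s
```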
 

eek2121

Diamond Member
Aug 2, 2005
3,027
4,213
136
The two architectures are not simply comparable. And tera rays? Nvidia would be really happy if their architecture were capable of tera rays; they only said that "they have way more teraflops dedicated to ray tracing".

This. Also, based on what I have seen thus far, RT performance is around 70% faster than Turing.

The only actual negative AMD is likely to have is the lack of something similar to DLSS, although they may have a solution that hasn’t been shown yet.
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,249
136
While I am by no means privy to the negotiations between AMD and MS/Sony, that seems to be giving a huge amount of clout to Lisa Su that I am not sure she has. Sony and MS's lack of pricing announcements is almost certainly about their competition with each other as opposed to them helping their chip supplier compete better in the desktop GPU market.

I was poking fun at all the silly guys/gals spreading doom and gloom at AMD. They know who they are!

Nobody knows what's up currently, so my speculation is just as good as the next guy's/gal's... and the bots'!

I guess we'll see how it pans out soon enough.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
Isn't it:
Pascal: ~1 gigaray/s
Turing: ~10 gigaray/s
Ampere: ~20 gigaray/s

52 CUs of RDNA2(Anaconda/Arden(17h 80h-8Fh APU)) => 380 Gigaray-box/s or 95 Gigaray-triangle/s

With Xbox Series S(Lockhart/Mero(17h 98h-9Fh APU)) => 124 Gigaray-box/s or 31 Gigaray-triangle/s
Was just comparing to this:

[Image: Screenshot_20200902-221823.png]
 

Hitman928

Diamond Member
Apr 15, 2012
5,526
8,579
136
Was just comparing to this:

View attachment 29100

If you notice, the same row has a question mark for the 2080 Ti. So how are Ray Performance FLOPs measured or calculated? I honestly have no idea, and it seems Ryan doesn't know either. It is certainly not rays/s, and it seems to be some Nvidia-unique calculation that they invented as a way to have a quantitative yet ambiguous way of comparing Ampere chips.
 

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
View attachment 29097
2-4 RT ops per CU x ~2 GHz is far from Nvidia's claimed 50+ TF in RT performance (320 GF); it's off by an order of magnitude. How are RDNA2 and Big Navi going to have a chance here? Starting to get a bit worried; hopefully I have missed something in my calculations.
You're correct.

320 GigaFists (320 Billion fists - probably for hitting someone) is indeed a higher number than 2-4 sleeping pills, however, both can get you to sleep.

How do you know anything about how AMD's hardware unit compares to whatever number NVIDIA decides to conjure up? I mean, if you know from somewhere that it's a magnitude lower/weaker, I am genuinely curious. If you could provide a link, or just the part of your thought process where you were able to actually quantify both vendors' metrics, that'd be appreciated.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
I'm confused: how do they translate rays to flops?
Ok, "rays" is my expression; floating point operations per second is quite universal. If AMD's RT capability is 2-4 RT ops per CU, then we have a theoretical number in flops: 320-640 billion floating point operations per second, RT-wise. Nvidia claims to have twice as many RT resources as CUDA cores, just going by the official information.

But looking at that picture again made me wonder. Bear with me, guys! :D

It also tells us the Xbox Series X does 256 32-bit FP ops per CU per cycle, which for an 80 CU part translates to 256 x 80 x 2 (GHz) = 40.9 TF in pure shader perf. Following that lead, we can see it issues 7 instructions per clock per CU. The RT unit mentioned was doing up to 4, but we need some texturing as well, aye?! So let's say 2 to count conservatively. That leaves us with ~0.4x shader performance. Oooor, ~16 TF of RT performance.

Now as you have pointed out, there is no way we can compare apples to apples here, but the funny thing is I'm starting to see why Nvidia is going all out with everything this gen. :)
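For what it's worth, the arithmetic in the post above works out as stated. Here is the same back-of-envelope calculation, using the poster's assumed figures (256 FP32 ops per CU per cycle, 80 CUs, 2 GHz, and the ~0.4x RT ratio), none of which are confirmed specs:

```python
# Reproducing the poster's arithmetic exactly as stated; the per-CU
# FP32 figure and the 0.4x RT ratio are their readings of the slide,
# not confirmed specs.
flops_per_cu_per_cycle = 256   # assumed FP32 ops per CU per cycle
cus = 80                       # speculated Big Navi CU count
clock_hz = 2e9                 # assumed ~2 GHz clock

shader_tf = flops_per_cu_per_cycle * cus * clock_hz / 1e12  # ~41.0 TF
rt_tf = shader_tf * 0.4                                     # ~16.4 TF
```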
 
Last edited:

uzzi38

Platinum Member
Oct 16, 2019
2,690
6,345
146
Of course Samsung is cheaper per wafer, but have you taken into account yields, die sizes, and additional costs in terms of memory selection and board complexity?

Comparing TSMC and SS for Nvidia's use case is much simpler. The first two, alongside price per wafer, are all that matter.

Comparing costs for AMD vs Nvidia? Whole different story.

Without knowing several things, such as the margins both are making, the difference in deals they're getting, 8LPP yields with such large dies, you name it, there's not a chance in hell we can make a decision on which is cheaper to produce.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
You're correct.

320 GigaFists (320 Billion fists - probably for hitting someone) is indeed a higher number than 2-4 sleeping pills, however, both can get you to sleep.

How do you know anything about how AMD's hardware unit compares to whatever number NVIDIA decides to conjure up? I mean, if you know from somewhere that it's a magnitude lower/weaker, I am genuinely curious. If you could provide a link, or just the part of your thought process where you were able to actually quantify both vendors' metrics, that'd be appreciated.
I'm just doing my best to interpret the official documents listing official specs. That's all. A floating point operation is a floating point operation is a... With that said, we could have great differences in efficiency. An Nvidia flop could easily translate into 0.7 AMD flops, or vice versa, in real gaming scenarios due to a lot of factors. Remember the release of RDNA1? The TF count was actually lower than the previous-gen architecture, GCN, yet it still performed much better in gaming. RDNA is a highly developed platform for image rendering, which is a good thing.
I'm just a layman AnandTech member pondering in the absence of real leaks. But honestly, anyone with further thoughts? : )
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,848
3,387
136
Ok, "rays" is my expression; floating point operations per second is quite universal. If AMD's RT capability is 2-4 RT ops per CU, then we have a theoretical number in flops: 320-640 billion floating point operations per second, RT-wise. Nvidia claims to have twice as many RT resources as CUDA cores, just going by the official information.

But looking at that picture again made me wonder. Bear with me, guys! :D

It also tells us the Xbox Series X does 256 32-bit FP ops per CU per cycle, which for an 80 CU part translates to 256 x 80 x 2 (GHz) = 40.9 TF in pure shader perf. Following that lead, we can see it issues 7 instructions per clock per CU. The RT unit mentioned was doing up to 4, but we need some texturing as well, aye?! So let's say 2 to count conservatively. That leaves us with ~0.4x shader performance. Oooor, ~16 TF of RT performance.

Now as you have pointed out, there is no way we can compare apples to apples here, but the funny thing is I'm starting to see why Nvidia is going all out with everything this gen. :)

If the ray tracing unit only did 1 simple FP op a cycle there would be no point in it; you would just do those calculations on an ALU. So 1 RT op dispatched from the scheduler will have X amount of flops, and X is probably variable depending on complexity and test type.
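To illustrate that point: even a single ray-box intersection ("one RT op") internally costs a dozen or so FLOPs. Below is a generic textbook slab test, not either vendor's actual hardware pipeline, just to show how one "op" decomposes into many FLOPs:

```python
# A textbook ray-AABB slab test. Per axis: 2 subtractions and
# 2 multiplications, plus min/max compares -- so one "box test"
# is already ~12 FLOPs of arithmetic before counting the compares.
def ray_aabb_hit(origin, inv_dir, box_min, box_max):
    """Return True if the ray hits the axis-aligned box in front of it."""
    tmin, tmax = -float("inf"), float("inf")
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t1 = (lo - o) * inv   # entry distance along this axis
        t2 = (hi - o) * inv   # exit distance along this axis
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmax >= max(tmin, 0.0)

# A ray from the origin toward (1,1,1) hits the box [1,2]^3:
hit = ray_aabb_hit((0, 0, 0), (1, 1, 1), (1, 1, 1), (2, 2, 2))
```

So a unit doing "4 box tests per cycle" is doing far more than 4 FLOPs per cycle, which is why raw ops/s and RT TFLOPS are not interchangeable units.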