
Question Speculation: RDNA2 + CDNA Architectures thread

Page 178
Did anybody notice that Big Navi has a transistor density of over 50 million transistors/mm²?

🙂

A 25% increase, at the least, over Navi 10.

And at that transistor density, its insane clock frequency...

What's the density of the infinity cache and how much of the die does it take up? I have a feeling that accounts for most of the gain. AMD is using the same node and the clocks are a lot higher. Normally increasing the density requires lowering clock speeds to offset the additional heat.
 
What's the density of the infinity cache and how much of the die does it take up? I have a feeling that accounts for most of the gain. AMD is using the same node and the clocks are a lot higher. Normally increasing the density requires lowering clock speeds to offset the additional heat.

It didn't for Renoir. Clocks as high as Matisse's, and the Vega cores clock far higher than the Radeon VII's.
 
You spend half a grand or more on a graphics card every two years, and people have the nerve to complain about getting a motherboard? You're just like the rest of the crowd, wanting AMD to put pressure on Nvidia to lower prices while having no intention of buying AMD graphics.

Yeah, I tend to trigger fanboys a lot, to the point that they tend to reply to me with something that has nothing to do with what I said and is intended only to attack me. That's not what I'm looking for, so that's not my fault.

 
Did anybody notice that Big Navi has a transistor density of over 50 million transistors/mm²?

🙂

A 25% increase, at the least, over Navi 10.

And at that transistor density, its insane clock frequency...
There is no official info about the die size of Big Navi from AMD, as far as I know. Once the info is out, we will know the actual transistor density.
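For what it's worth, the arithmetic is easy to sketch once a die size is assumed. The figures below (Navi 21: 26.8 billion transistors on a ~520 mm² die; Navi 10: 10.3 billion on 251 mm²) are the ones that were later confirmed:

```python
# Transistor-density comparison, Navi 21 vs Navi 10.
def density_mtx_per_mm2(transistors, die_area_mm2):
    """Transistor density in millions of transistors per mm^2."""
    return transistors / die_area_mm2 / 1e6

navi21 = density_mtx_per_mm2(26.8e9, 520)   # ~51.5 Mtx/mm^2
navi10 = density_mtx_per_mm2(10.3e9, 251)   # ~41.0 Mtx/mm^2
increase = (navi21 / navi10 - 1) * 100      # ~25.6 %

print(f"Navi 21: {navi21:.1f} M/mm^2, Navi 10: {navi10:.1f} M/mm^2, +{increase:.0f}%")
```

So the "25% at the least" estimate holds up against the confirmed numbers.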
 
Yeah, I tend to trigger fanboys a lot, to the point that they tend to reply to me with something that has nothing to do with what I said and is intended only to attack me. That's not what I'm looking for, so that's not my fault.

You clearly don’t know my posting history, or what I have in my rig. I had two 2060s, and I almost had a 3070 in hand. It has nothing to do with loyalty.

As you suggest in your post, you’re not even halfway into the target market for this stuff, yet you’re complaining about motherboard cost, which is a small fraction of the cost of PC ownership. It’s not my fault people spent $350+ on some X470 board when a $175 board would have given them the same specs and performance as they’re getting now.
 
PCIe 4.0 would be required for SAM, yeah? Trying to plan a build; I'm considering a sub-10L mini-ITX, but the case I'm looking at doesn't have a 4.0 riser.
 
As I wrote earlier, "My favorite leaker is David Wang, but he is mostly ignored by everyone". In his presentation from March 2020, we can see this slide. When the rumors about some weird 128 MB Infinity Cache started, most people had completely forgotten about this "green nonsense". Can we logically connect this green stuff with the early Infinity Cache rumors? Hm, judge for yourself.


[Attachment: slide from David Wang's March 2020 presentation]

RDNA2 vs Ampere cache configuration; hm, the difference is quite obvious.


"RDNA introduced the "workgroup processor" ("WGP"). The WGP replaces the compute unit as the basic unit of shader computation hardware/computing. One WGP encompasses 2 CUs"

[Attachments: cache configuration comparison slides]
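The quoted WGP-to-CU relationship is simple to express. A quick sketch (CU counts are from public spec sheets: Navi 21 has 80 CUs, Navi 10 has 40):

```python
# RDNA groups shader hardware into Workgroup Processors (WGPs),
# each encompassing 2 Compute Units (CUs).
CUS_PER_WGP = 2

def wgp_count(cu_count):
    """Number of WGPs for a given CU count (RDNA: 2 CUs per WGP)."""
    assert cu_count % CUS_PER_WGP == 0, "CU count must be even on RDNA"
    return cu_count // CUS_PER_WGP

print(wgp_count(80))  # Navi 21: 80 CUs -> 40 WGPs
print(wgp_count(40))  # Navi 10: 40 CUs -> 20 WGPs
```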
 
PCIe 4.0 would be required for SAM, yeah? Trying to plan a build; I'm considering a sub-10L mini-ITX, but the case I'm looking at doesn't have a 4.0 riser.

Now I don't know. People have said, "For sure, absolutely," and also, people are now saying that SAM ain't no new thing, anyone can do it and has been doing it, and it's just a simple software thing.

...I don't think both can be true, right? (Meaning, PCIe 4 is way too new and limited for the second argument to hold in a world that never really had this before.)
 
RDNA2 vs Ampere cache configuration; hm, the difference is quite obvious.


"RDNA introduced the "workgroup processor" ("WGP"). The WGP replaces the compute unit as the basic unit of shader computation hardware/computing. One WGP encompasses 2 CUs"




Those specs are misleading, because the RTX Ampere SM L1 cache isn't a full 128KB.

It depends on the workload. In CUDA, it's 96KB shared memory + 32KB L1 data.

In graphics, it's 48KB shared memory + 16KB for 2x FP32 control + 32KB L1 data + 32KB texture.

It's actually pretty small even compared to RDNA 1, because RDNA has additional cache not listed there.

For example, the TMUs in a CU have their own pool of cache. The CUs have their own shared memory, which AMD calls the Local Data Share (LDS), which IIRC is 128KB. The scalar ALUs have their own cache.

On NV, everything is crammed into the shared memory or L1, and in gaming that shared memory is tiny.

Edit: That is why some devs explained that RTX Ampere RT performance is great in apps while being similar to Turing, relatively, in gaming. Shared memory goes from 96KB for raw compute to only 48KB for graphics. And obviously we now understand that RT is very cache intensive, and NV couldn't cram more cache onto gaming Ampere due to the worse Samsung 8N. GA100 on 7N has a huge 192KB L1 and a roughly 10x bigger L2. NerdTechGasm posted a vid a while back, before the launch and reviews of the RTX 3080 series, that explains this and said not to expect a major uplift in RT gaming perf, only in workstation apps.
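The workload-dependent carve-up described above can be tabulated in a quick sketch (the per-partition figures are taken from this post itself, not from official NVIDIA documentation):

```python
# Per-SM unified L1/shared-memory carve-ups on gaming Ampere,
# as described in the post: one 128 KB pool, partitioned by workload.
AMPERE_SM_POOL_KB = 128

compute_split  = {"shared_mem": 96, "l1_data": 32}
graphics_split = {"shared_mem": 48, "fp32_ctrl": 16, "l1_data": 32, "texture": 32}

for name, split in [("compute", compute_split), ("graphics", graphics_split)]:
    total = sum(split.values())
    assert total == AMPERE_SM_POOL_KB, f"{name} split must fill the 128 KB pool"
    print(f"{name}: {split} -> shared memory {split['shared_mem']} KB")
```

The point of the sketch is the last column: shared memory drops from 96 KB in compute mode to 48 KB in graphics mode, which is the gap the post attributes the RT gaming results to.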

 
AMD fine wine at launch already this time? (Let's hope it doesn't turn into vinegar later, then.)

Which Radeon chip was the ultimate fine-wine example? Hawaii. Why? Because it shared its architecture with the PS4 and entered the market at exactly the same time (Q4 2013). A 290 was essentially a 2x PS4 GPU.
The 6800 XT shares its architecture with the PS5 and enters the market at exactly the same time (Q4 2020). The 6800 XT is also exactly a 2x PS5 configuration. I expect this card to be still relevant 5 years from now.
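In CU-count terms (per public spec sheets: PS4 = 18 CUs, R9 290 = 40, PS5 = 36, RX 6800 XT = 72), the "2x console" comparison looks like this:

```python
# "2x console GPU" comparisons from the post, expressed in CU counts.
configs = {
    "PS4": 18, "R9 290": 40,      # "essentially" 2x (~2.2x)
    "PS5": 36, "RX 6800 XT": 72,  # exactly 2x
}

print(configs["R9 290"] / configs["PS4"])      # ~2.22
print(configs["RX 6800 XT"] / configs["PS5"])  # 2.0
```

So the Hawaii comparison is approximate, while the 6800 XT really is an exact 2x PS5 in CU count (clocks and memory differ, of course).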
 
Which Radeon chip was the ultimate fine-wine example? Hawaii. Why? Because it shared its architecture with the PS4 and entered the market at exactly the same time (Q4 2013). A 290 was essentially a 2x PS4 GPU.
The 6800 XT shares its architecture with the PS5 and enters the market at exactly the same time (Q4 2020). The 6800 XT is also exactly a 2x PS5 configuration. I expect this card to be still relevant 5 years from now.
It's going to stay relevant as long as the PS5/XBSX do, that's for sure.
 
Which Radeon chip was the ultimate fine-wine example? Hawaii. Why? Because it shared its architecture with the PS4 and entered the market at exactly the same time (Q4 2013). A 290 was essentially a 2x PS4 GPU.
The 6800 XT shares its architecture with the PS5 and enters the market at exactly the same time (Q4 2020). The 6800 XT is also exactly a 2x PS5 configuration. I expect this card to be still relevant 5 years from now.

I've seen some ridiculous reach or cope lately; people actually believe that RDNA2 being shared with consoles won't gain AMD any advantage because GCN being shared with older consoles didn't... Except it did, has, and still continues to benefit PC GCN.

Tremendously so.

The fact that GCN (compute + datacenter architecture) is able to stay somewhat competitive to NV's tech is purely due to console optimizations.

Example: when Resident Evil 3, or Forza (or many other console ports), came to PC and all the AMD GPUs were punching above their class, what do you think was the cause? AMD's amazing driver team, or Capcom (and other studios) optimizing the crap out of GCN for the PS4 base spec to get good FPS?

AMD has basically gained FREE game engine optimizations in many AAA titles just from being the baseline for devs to target. NV has had to rely on sponsorship programs for the PC port to get great performance from their own GPUs.
 
Now I don't know. People have said, "For sure, absolutely," and also, people are now saying that SAM ain't no new thing, anyone can do it and has been doing it, and it's just a simple software thing.

...I don't think both can be true, right? (Meaning, PCIe 4 is way too new and limited for the second argument to hold in a world that never really had this before.)

Resizable BAR is enabled in both GPU firmware and GPU drivers, so there are plenty of opportunities to put artificial blocks in place, but in no way does it depend on PCIe 4: in Windows it is a WDDM 2 driver-model feature, and WDDM 2.0 is as old as Windows 10 and DX12. We are talking about a five-year-old thing here.

But I'm not sure why everyone assumes it is just the BAR thing; it could be something else.
 
I know this question will sound incredibly stupid, but what exactly is "BAR" short for?

I was also very confused. I'm not sure how to read the first part of that sentence grammatically, and then the rest of it. This is mostly my ignorance, I think, because I just don't speak this (silicon) language very well.
 
I've seen some ridiculous reach or cope lately; people actually believe that RDNA2 being shared with consoles won't gain AMD any advantage because GCN being shared with older consoles didn't... Except it did, has, and still continues to benefit PC GCN.

Tremendously so.

The fact that GCN (compute + datacenter architecture) is able to stay somewhat competitive to NV's tech is purely due to console optimizations.

Example: when Resident Evil 3, or Forza (or many other console ports), came to PC and all the AMD GPUs were punching above their class, what do you think was the cause? AMD's amazing driver team, or Capcom (and other studios) optimizing the crap out of GCN for the PS4 base spec to get good FPS?

AMD has basically gained FREE game engine optimizations in many AAA titles just from being the baseline for devs to target. NV has had to rely on sponsorship programs for the PC port to get great performance from their own GPUs.

Yeah, exactly this. This is also why AMD's RT and "DLSS" tech will be far more popular than Nvidia's. There is no way around that.
 
It's going to stay relevant as long as the PS5/XBSX do, that's for sure.

Even the PS4 and first-gen Xbox One are still relevant now. I was watching the new Watch Dogs: Legion analysis from DF, and they showcase impressive results from the base consoles for one of the most complex open-world games.
 
I know this question will sound incredibly stupid, but what exactly is "BAR" short for?

It is a PCI thing; this is a very, very old thing that goes back to the PCI (not PCIe, PCI) design. Remember the "Graphics Aperture Size" setting in the BIOS on old motherboards? That's the BAR size for the GPU; it was present on AGP and went up to 256MB.

PCI devices map their registers and internal memory into the IO address space of the CPU, and this is used for the CPU and the device driver to actually communicate. That is called the BAR; I just don't know exactly what BAR stands for.

As for resizable BAR, Windows added support for that in W10 with WDDM 2.0, and Nvidia has supported it since Kepler, I believe, but they only allow it on Teslas, Titans, and Quadros, as it is used to boost CUDA performance. It is blocked on consumer GPUs.

It would be dumb for AMD to support this only on RDNA2, PCIe 4, and Ryzen 5000, because Nvidia could add support for every CPU and GPU out there, although the GPU may need a firmware update.

As for games taking advantage of this... I don't know. It is an advantage for CUDA; game engines would need to be modified to take advantage of it, and it may be useful for RT/DLSS types of applications. Let's remember that every PCIe device has DMA to the entire physical memory, but for the CPU, driver, and software to read the memory of the GPU (or any other PCI device), they can only do it through the mapped IO addresses of the device (the BAR). So we are talking about things that would be faster if we could actually read the GPU memory directly via the BAR instead of copying it to RAM, so you can see it is mostly a compute thing. I could see this making the video encode/decode process faster.
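As a concrete illustration of BAR sizes, on Linux the BAR ranges of a PCI device can be read from sysfs: each line of a device's `resource` file gives the start address, end address, and flags of one BAR in hex. This is a sketch; the device address in the comment is a hypothetical example, and here it parses a canned sample instead of a real device:

```python
# Parse Linux sysfs PCI `resource` contents into BAR sizes.
# Each line is: <start> <end> <flags> (hex), one line per BAR;
# size = end - start + 1 for a populated BAR.
def bar_sizes(resource_text):
    """Return a list of BAR sizes in bytes (0 for unused BARs)."""
    sizes = []
    for line in resource_text.splitlines():
        start, end, _flags = (int(tok, 16) for tok in line.split())
        sizes.append(end - start + 1 if end else 0)
    return sizes

# On a real system you would read something like (address is hypothetical):
#   /sys/bus/pci/devices/0000:01:00.0/resource
sample = ("0x00000000f6000000 0x00000000f6ffffff 0x0000000000040200\n"
          "0x0000000000000000 0x0000000000000000 0x0000000000000000")
print(bar_sizes(sample))  # [16777216, 0] -> one 16 MB BAR, one empty slot
```

A small BAR like the 16 MB one above is exactly why copies through RAM are needed today; resizable BAR lets the driver grow that window to cover the whole VRAM.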
 
It is a PCI thing; this is a very, very old thing that goes back to the PCI (not PCIe, PCI) design. Remember the "Graphics Aperture Size" setting in the BIOS on old motherboards? That's the BAR size for the GPU; it was present on AGP and went up to 256MB.

PCI devices map their registers and internal memory into the IO address space of the CPU, and this is used for the CPU and the device driver to actually communicate. That is called the BAR; I just don't know exactly what BAR stands for.

As for resizable BAR, Windows added support for that in W10 with WDDM 2.0, and Nvidia has supported it since Kepler, I believe, but they only allow it on Teslas, Titans, and Quadros, as it is used to boost CUDA performance. It is blocked on consumer GPUs.

It would be dumb for AMD to support this only on RDNA2, PCIe 4, and Ryzen 5000, because Nvidia could add support for every CPU and GPU out there, although the GPU may need a firmware update.

As for games taking advantage of this... I don't know. It is an advantage for CUDA; game engines would need to be modified to take advantage of it, and it may be useful for RT/DLSS types of applications.
You still haven't answered my question 🙂.

What is "BAR" short for?

Big Army Robot? Being Almost Right?

Break Amy's Room? 😛
 