
Question Speculation: RDNA2 + CDNA Architectures thread

Page 178
Did anybody notice that Big Navi has a transistor density of over 50 million transistors/mm²?

🙂

A 25% increase, at the least, over Navi 10.

And at that transistor density, its insane clock frequency...

What's the density of the infinity cache and how much of the die does it take up? I have a feeling that accounts for most of the gain. AMD is using the same node and the clocks are a lot higher. Normally increasing the density requires lowering clock speeds to offset the additional heat.
 
What's the density of the infinity cache and how much of the die does it take up? I have a feeling that accounts for most of the gain. AMD is using the same node and the clocks are a lot higher. Normally increasing the density requires lowering clock speeds to offset the additional heat.

It didn't for Renoir. Clocks as high as Matisse's, and the Vega cores clock far higher than the Radeon VII's.
 
You spend half a grand or more on a graphics card every two years, and people have the nerve to complain about getting a motherboard? You're just like the rest of the crowd, wanting AMD to put pressure on Nvidia to lower prices while having no intention of buying AMD graphics.

Yeah, I tend to trigger fanboys a lot, to the point that they tend to reply to me with something that has nothing to do with what I said and is intended only to attack me. That's not what I'm looking for, so that's not my fault.

 
Did anybody notice that Big Navi has a transistor density of over 50 million transistors/mm²?

🙂

A 25% increase, at the least, over Navi 10.

And at that transistor density, its insane clock frequency...
There is no official info about the die size of Big Navi from AMD, as far as I know. Once the info is out, we will know the actual transistor density.
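For what it's worth, the arithmetic is easy to sketch once a die size is assumed. The figures below (Navi 21: 26.8 billion transistors on a ~520 mm² die; Navi 10: 10.3 billion on 251 mm²) are the ones that were later confirmed:

```python
# Transistor-density comparison, Navi 21 vs Navi 10.
def density_mtx_per_mm2(transistors, die_area_mm2):
    """Transistor density in millions of transistors per mm^2."""
    return transistors / die_area_mm2 / 1e6

navi21 = density_mtx_per_mm2(26.8e9, 520)   # ~51.5 Mtx/mm^2
navi10 = density_mtx_per_mm2(10.3e9, 251)   # ~41.0 Mtx/mm^2
increase = (navi21 / navi10 - 1) * 100      # ~25.6 %

print(f"Navi 21: {navi21:.1f} M/mm^2, Navi 10: {navi10:.1f} M/mm^2, +{increase:.0f}%")
```

So the "25% at the least" estimate holds up against the confirmed numbers.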
 
Yeah, I tend to trigger fanboys a lot, to the point that they tend to reply to me with something that has nothing to do with what I said and is intended only to attack me. That's not what I'm looking for, so that's not my fault.

You clearly don’t know my posting history, or what I have in my rig. I had two 2060s, and I almost had a 3070 in hand. It has nothing to do with loyalty.

As you suggest in your post, you’re not even halfway into the target market for this stuff, yet you’re complaining about motherboard cost, which is a small fraction of the cost of PC ownership. It’s not my fault people spent $350+ on some X470 board when a $175 board would have given them the same specs and performance as they’re getting now.
 
PCIe 4.0 would be required for SAM, yeah? Trying to plan a build; I'm considering a sub-10L mini-ITX, but the case I'm looking at doesn't have a 4.0 riser.
 
As I wrote earlier, "My favorite leaker is David Wang, but he is mostly ignored by everyone". In his presentation from March 2020, we can see this slide. When the rumors about some weird 128 MB Infinity Cache started, most people had completely forgotten about this "green nonsense". Can we logically connect this green stuff with the early Infinity Cache rumors? Hm, judge for yourself.


[Attachment: slide from David Wang's March 2020 presentation]

RDNA2 vs Ampere cache configuration; hm, the difference is quite obvious.


"RDNA introduced the "workgroup processor" ("WGP"). The WGP replaces the compute unit as the basic unit of shader computation hardware/computing. One WGP encompasses 2 CUs"

[Attachments: cache configuration comparison slides]
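The quoted WGP-to-CU relationship is simple to express. A quick sketch (CU counts are from public spec sheets: Navi 21 has 80 CUs, Navi 10 has 40):

```python
# RDNA groups shader hardware into Workgroup Processors (WGPs),
# each encompassing 2 Compute Units (CUs).
CUS_PER_WGP = 2

def wgp_count(cu_count):
    """Number of WGPs for a given CU count (RDNA: 2 CUs per WGP)."""
    assert cu_count % CUS_PER_WGP == 0, "CU count must be even on RDNA"
    return cu_count // CUS_PER_WGP

print(wgp_count(80))  # Navi 21: 80 CUs -> 40 WGPs
print(wgp_count(40))  # Navi 10: 40 CUs -> 20 WGPs
```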
 
PCIe 4.0 would be required for SAM, yeah? Trying to plan a build; I'm considering a sub-10L mini-ITX, but the case I'm looking at doesn't have a 4.0 riser.

Now I don't know. People have said, "For sure, absolutely," and also, people are now saying that SAM ain't no new thing, anyone can do it and has been doing it, and it's just a simple software thing.

...I don't think both can be true, right? (Meaning, PCIe 4 is way too new and limited for the second argument to hold in a world that never really had this before.)
 
RDNA2 vs Ampere cache configuration; hm, the difference is quite obvious.


"RDNA introduced the "workgroup processor" ("WGP"). The WGP replaces the compute unit as the basic unit of shader computation hardware/computing. One WGP encompasses 2 CUs"




Those specs are misleading, because the RTX Ampere SM L1 cache isn't a full 128KB.

It depends on the workload. In CUDA, it's 96KB shared memory + 32KB L1 data.

In graphics, it's 48KB shared memory + 16KB for 2x FP32 control + 32KB L1 data + 32KB texture.

It's actually pretty small even compared to RDNA 1, because RDNA has additional cache not listed there.

For example, the TMUs in a CU have their own pool of cache. The CUs have their own shared memory, which AMD calls the Local Data Share (LDS), which IIRC is 128KB. The scalar ALUs have their own cache.

On NV, everything is crammed into the shared memory or L1, and in gaming that shared memory is tiny.

Edit: That is why some devs explained that RTX Ampere RT performance is great in apps while being similar to Turing, relatively, in gaming. Shared memory goes from 96KB for raw compute to only 48KB for graphics. And obviously we now understand that RT is very cache intensive, and NV couldn't cram more cache onto gaming Ampere due to the worse Samsung 8N. GA100 on 7N has a huge 192KB L1 and a roughly 10x bigger L2. NerdTechGasm posted a vid a while back, before the launch and reviews of the RTX 3080 series, that explains this and said not to expect a major uplift in RT gaming perf, only in workstation apps.
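The workload-dependent carve-up described above can be tabulated in a quick sketch (the per-partition figures are taken from this post itself, not from official NVIDIA documentation):

```python
# Per-SM unified L1/shared-memory carve-ups on gaming Ampere,
# as described in the post: one 128 KB pool, partitioned by workload.
AMPERE_SM_POOL_KB = 128

compute_split  = {"shared_mem": 96, "l1_data": 32}
graphics_split = {"shared_mem": 48, "fp32_ctrl": 16, "l1_data": 32, "texture": 32}

for name, split in [("compute", compute_split), ("graphics", graphics_split)]:
    total = sum(split.values())
    assert total == AMPERE_SM_POOL_KB, f"{name} split must fill the 128 KB pool"
    print(f"{name}: {split} -> shared memory {split['shared_mem']} KB")
```

The point of the sketch is the last column: shared memory drops from 96 KB in compute mode to 48 KB in graphics mode, which is the gap the post attributes the RT gaming results to.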

 
AMD fine wine at launch already this time? (Let's hope it doesn't turn into vinegar later, then.)

Which Radeon chip was the ultimate fine-wine example? Hawaii. Why? Because it shared its architecture with the PS4 and entered the market at exactly the same time (Q4 2013). A 290 was essentially a 2x PS4 GPU.
The 6800 XT shares its architecture with the PS5 and enters the market at exactly the same time (Q4 2020). The 6800 XT is also exactly a 2x PS5 configuration. I expect this card to be still relevant 5 years from now.
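In CU-count terms (per public spec sheets: PS4 = 18 CUs, R9 290 = 40, PS5 = 36, RX 6800 XT = 72), the "2x console" comparison looks like this:

```python
# "2x console GPU" comparisons from the post, expressed in CU counts.
configs = {
    "PS4": 18, "R9 290": 40,      # "essentially" 2x (~2.2x)
    "PS5": 36, "RX 6800 XT": 72,  # exactly 2x
}

print(configs["R9 290"] / configs["PS4"])      # ~2.22
print(configs["RX 6800 XT"] / configs["PS5"])  # 2.0
```

So the Hawaii comparison is approximate, while the 6800 XT really is an exact 2x PS5 in CU count (clocks and memory differ, of course).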
 
Which Radeon chip was the ultimate fine-wine example? Hawaii. Why? Because it shared its architecture with the PS4 and entered the market at exactly the same time (Q4 2013). A 290 was essentially a 2x PS4 GPU.
The 6800 XT shares its architecture with the PS5 and enters the market at exactly the same time (Q4 2020). The 6800 XT is also exactly a 2x PS5 configuration. I expect this card to be still relevant 5 years from now.
It's going to stay relevant as long as the PS5/XBSX do, that's for sure.
 
Which Radeon chip was the ultimate fine-wine example? Hawaii. Why? Because it shared its architecture with the PS4 and entered the market at exactly the same time (Q4 2013). A 290 was essentially a 2x PS4 GPU.
The 6800 XT shares its architecture with the PS5 and enters the market at exactly the same time (Q4 2020). The 6800 XT is also exactly a 2x PS5 configuration. I expect this card to be still relevant 5 years from now.

I've seen some ridiculous reach or cope lately; people actually believe that RDNA2 being shared with consoles won't gain AMD any advantage because GCN being shared with older consoles didn't... Except it did, has, and still continues to benefit PC GCN.

Tremendously so.

The fact that GCN (compute + datacenter architecture) is able to stay somewhat competitive to NV's tech is purely due to console optimizations.

Example: when Resident Evil 3, or Forza (or many other console ports), came to PC and all the AMD GPUs were punching above their class, what do you think was the cause? AMD's amazing driver team, or Capcom (and other studios) optimizing the crap out of GCN for the PS4 base spec to get good FPS?

AMD has basically gained FREE game engine optimizations in many AAA titles just from being the baseline for devs to target. NV has had to rely on sponsorship programs for the PC port to get great performance from their own GPUs.
 
Now I don't know. People have said, "For sure, absolutely," and also, people are now saying that SAM ain't no new thing, anyone can do it and has been doing it, and it's just a simple software thing.

...I don't think both can be true, right? (Meaning, PCIe 4 is way too new and limited for the second argument to hold in a world that never really had this before.)

Resizable BAR is enabled in both GPU firmware and GPU drivers, so there are plenty of opportunities to put artificial blocks in place, but in no way does it depend on PCIe 4: in Windows it is a WDDM 2 driver-model feature, and WDDM 2.0 is as old as Windows 10 and DX12. We are talking about a five-year-old thing here.

But I'm not sure why everyone assumes it is just the BAR thing; it could be something else.
 
I know this question will sound incredibly stupid, but what exactly is "BAR" short for?

I was also very confused. I'm not sure how to read the first part of that sentence grammatically, and then the rest of it. This is mostly my ignorance, I think, because I just don't speak this (silicon) language very well.
 
I've seen some ridiculous reach or cope lately; people actually believe that RDNA2 being shared with consoles won't gain AMD any advantage because GCN being shared with older consoles didn't... Except it did, has, and still continues to benefit PC GCN.

Tremendously so.

The fact that GCN (compute + datacenter architecture) is able to stay somewhat competitive to NV's tech is purely due to console optimizations.

Example: when Resident Evil 3, or Forza (or many other console ports), came to PC and all the AMD GPUs were punching above their class, what do you think was the cause? AMD's amazing driver team, or Capcom (and other studios) optimizing the crap out of GCN for the PS4 base spec to get good FPS?

AMD has basically gained FREE game engine optimizations in many AAA titles just from being the baseline for devs to target. NV has had to rely on sponsorship programs for the PC port to get great performance from their own GPUs.

Yeah, exactly this. This is also why AMD's RT and "DLSS" tech will be far more popular than Nvidia's. There is no way around that.
 
It's going to stay relevant as long as the PS5/XBSX do, that's for sure.

Even the PS4 and first-gen Xbox One are still relevant now. I was watching the new Watch Dogs: Legion analysis from DF, and they showcase impressive results from the base consoles for one of the most complex open-world games.
 
I know this question will sound incredibly stupid, but what exactly is "BAR" short for?

It is a PCI thing; this is a very, very old thing that goes back to the PCI (not PCIe, PCI) design. Remember the "Graphics Aperture Size" setting in the BIOS on old motherboards? That's the BAR size for the GPU; it was present on AGP and went up to 256MB.

PCI devices map their registers and internal memory into the IO address space of the CPU, and this is used for the CPU and the device driver to actually communicate. That is called the BAR; I just don't know exactly what BAR stands for.

As for resizable BAR, Windows added support for that in W10 with WDDM 2.0, and Nvidia has supported it since Kepler, I believe, but they only allow it on Teslas, Titans, and Quadros, as it is used to boost CUDA performance. It is blocked on consumer GPUs.

It would be dumb for AMD to support this only on RDNA2, PCIe 4, and Ryzen 5000, because Nvidia could add support for every CPU and GPU out there, although the GPU may need a firmware update.

As for games taking advantage of this... I don't know. It is an advantage for CUDA; game engines would need to be modified to take advantage of it, and it may be useful for RT/DLSS types of applications. Let's remember that every PCIe device has DMA to the entire physical memory, but for the CPU, driver, and software to read the memory of the GPU (or any other PCI device), they can only do it through the mapped IO addresses of the device (the BAR). So we are talking about things that would be faster if we could actually read the GPU memory directly via the BAR instead of copying it to RAM, so you can see it is mostly a compute thing. I could see this making the video encode/decode process faster.
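As a concrete illustration of BAR sizes, on Linux the BAR ranges of a PCI device can be read from sysfs: each line of a device's `resource` file gives the start address, end address, and flags of one BAR in hex. This is a sketch; the device address in the comment is a hypothetical example, and here it parses a canned sample instead of a real device:

```python
# Parse Linux sysfs PCI `resource` contents into BAR sizes.
# Each line is: <start> <end> <flags> (hex), one line per BAR;
# size = end - start + 1 for a populated BAR.
def bar_sizes(resource_text):
    """Return a list of BAR sizes in bytes (0 for unused BARs)."""
    sizes = []
    for line in resource_text.splitlines():
        start, end, _flags = (int(tok, 16) for tok in line.split())
        sizes.append(end - start + 1 if end else 0)
    return sizes

# On a real system you would read something like (address is hypothetical):
#   /sys/bus/pci/devices/0000:01:00.0/resource
sample = ("0x00000000f6000000 0x00000000f6ffffff 0x0000000000040200\n"
          "0x0000000000000000 0x0000000000000000 0x0000000000000000")
print(bar_sizes(sample))  # [16777216, 0] -> one 16 MB BAR, one empty slot
```

A small BAR like the 16 MB one above is exactly why copies through RAM are needed today; resizable BAR lets the driver grow that window to cover the whole VRAM.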
 
It is a PCI thing; this is a very, very old thing that goes back to the PCI (not PCIe, PCI) design. Remember the "Graphics Aperture Size" setting in the BIOS on old motherboards? That's the BAR size for the GPU; it was present on AGP and went up to 256MB.

PCI devices map their registers and internal memory into the IO address space of the CPU, and this is used for the CPU and the device driver to actually communicate. That is called the BAR; I just don't know exactly what BAR stands for.

As for resizable BAR, Windows added support for that in W10 with WDDM 2.0, and Nvidia has supported it since Kepler, I believe, but they only allow it on Teslas, Titans, and Quadros, as it is used to boost CUDA performance. It is blocked on consumer GPUs.

It would be dumb for AMD to support this only on RDNA2, PCIe 4, and Ryzen 5000, because Nvidia could add support for every CPU and GPU out there, although the GPU may need a firmware update.

As for games taking advantage of this... I don't know. It is an advantage for CUDA; game engines would need to be modified to take advantage of it, and it may be useful for RT/DLSS types of applications.
You still haven't answered my question 🙂.

What is "BAR" short for?

Big Army Robot? Being Almost Right?

Break Amy's Room? 😛
 