Discussion RDNA 5 / UDNA (CDNA Next) speculation


GodisanAtheist

Diamond Member
Nov 16, 2006
8,128
9,385
136
True, but they're almost certainly not abandoning IF caching schemes. GDDR7 alone cannot replace the bandwidth amplification of a large cache.

- New GDDR probably means they can shrink IC and claw back some die space though.

IC is a good crutch while AMD is on GDDR6, whereas NV went with more exotic RAM.

GDDR7 won't eliminate the need for IC, but will likely minimize it.
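
Rough bandwidth arithmetic, purely illustrative (the per-pin rates below are typical shipping speeds, not figures for any specific SKU):

```python
# Raw VRAM bandwidth (GB/s) = bus width (bits) / 8 * per-pin data rate (Gbps)
def raw_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits / 8 * data_rate_gbps

# Illustrative per-pin rates: GDDR6 ~20 Gbps, GDDR7 ~28-32 Gbps
print(raw_bandwidth_gbs(256, 20))  # 256-bit GDDR6 -> 640.0 GB/s
print(raw_bandwidth_gbs(256, 32))  # 256-bit GDDR7 -> 1024.0 GB/s (+60%)
```

~60% more raw bandwidth from the same bus width is exactly the kind of headroom that lets the cache shrink without disappearing.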
 
  • Like
Reactions: ToTTenTranz

maddie

Diamond Member
Jul 18, 2010
5,150
5,531
136
- New GDDR probably means they can shrink IC and claw back some die space though.

IC is a good crutch while AMD is on GDDR6, whereas NV went with more exotic RAM.

GDDR7 won't eliminate the need for IC, but will likely minimize it.
IC was used effectively even when NVIDIA was also on GDDR6.

"Magnus" could use 384bit + 48 MByte IF$ instead of 96 MByte. In the end a tradeoff between effective bandwidth and amount of VRAM.
I don't think cache can reduce the need for "amount of VRAM".

It amplifies bandwidth, so allows a smaller bus width. We should remember that the space needed for memory controllers is also reduced, so this needs to be factored into the total cache + memory-controller area. The penalty might be a lot smaller than we think as logic shrinks faster than analog. Reduced power is another benefit.
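
A minimal sketch of the amplification math, assuming the usual simplification that the VRAM bus only has to serve cache misses (the hit rates below are made up for illustration):

```python
# Simplified model: the VRAM bus only has to serve cache misses, so a hit
# rate h amplifies usable bandwidth by roughly 1 / (1 - h). Ignores write
# traffic, cache bandwidth limits and latency effects.
def effective_bandwidth_gbs(vram_bw_gbs: float, hit_rate: float) -> float:
    return vram_bw_gbs / (1.0 - hit_rate)

# Hypothetical: a 256-bit GDDR7 bus (~1024 GB/s raw) at assumed IF$ hit rates.
for hit_rate in (0.0, 0.3, 0.5, 0.6):
    eff = effective_bandwidth_gbs(1024, hit_rate)
    print(f"hit rate {hit_rate:.0%}: ~{eff:.0f} GB/s effective")
```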
 

reaperrr3

Member
May 31, 2024
107
330
96
IC is also more power efficient than VRAM access; this was one big factor in RDNA2's massive perf/W improvement.

48MB might be enough for 80 CUs running at more modest clocks in a console (remember, previous console SoCs had no GPU IC at all), but for a 96 CU desktop GPU with 20+% higher per-CU IPC and clocked to 3+ GHz, 96MB might be necessary to hit their perf targets at 4K and in RT/PT generally, despite GDDR7.
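
Back-of-envelope on that scenario (the console clock is my assumption; the other figures are the ones above):

```python
# Relative bandwidth demand ~ CUs * clock * IPC scaling (all figures illustrative).
def relative_demand(cus: int, clock_ghz: float, ipc_scale: float) -> float:
    return cus * clock_ghz * ipc_scale

console = relative_demand(80, 2.5, 1.0)   # 80 CUs at an assumed ~2.5 GHz console clock
desktop = relative_demand(96, 3.0, 1.2)   # 96 CUs, 3+ GHz, ~20% higher per-CU IPC
print(round(desktop / console, 2))        # ~1.73x the demand on the same memory subsystem
```

Feeding roughly 1.7x the demand from the same GDDR7 setup means either a wider bus or a higher hit rate, which is where the bigger IF$ comes in.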

FWIW, Nvidia didn't back down on their L2 sizes with Blackwell despite GDDR7 either, and NV is no less margin-oriented than AMD.
 

basix

Member
Oct 4, 2024
155
309
96
I don't think cache can reduce the need for "amount of VRAM".

It amplifies bandwidth, so allows a smaller bus width.
No, IF$ cannot reduce the amount of needed VRAM.
But if you want a certain amount of VRAM, you need to have a certain bus width. If you widen the bus, you can reduce the size of IF$ without running into bandwidth bottlenecks. That's why I was talking about a tradeoff in this regard.
You could go the other way round: 128bit and 512MByte IF$. But the narrow bus width then limits the maximum VRAM capacity.

Other factors like energy efficiency and effective memory latency get worse with a wider bus and less IF$. If those parameters are not your primary concern, a widened bus can be the right choice.
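
To put numbers on the capacity side of the tradeoff (GDDR7 devices are 32 bits wide and currently ship in 2 GB and 3 GB densities; clamshell doubles the device count):

```python
# Max VRAM capacity is set by bus width, not cache: GDDR7 devices are 32 bits
# wide, so capacity = (bus_width / 32) * density per device, doubled in
# clamshell mode. Current densities: 2 GB (16 Gbit) and 3 GB (24 Gbit).
def max_vram_gb(bus_width_bits: int, gb_per_device: int, clamshell: bool = False) -> int:
    devices = bus_width_bits // 32 * (2 if clamshell else 1)
    return devices * gb_per_device

print(max_vram_gb(128, 3))                  # 12 GB - the narrow-bus case above
print(max_vram_gb(128, 3, clamshell=True))  # 24 GB - only by going clamshell
print(max_vram_gb(384, 3))                  # 36 GB - a 384-bit "Magnus"-style config
```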
 
  • Like
Reactions: Mopetar

soresu

Diamond Member
Dec 19, 2014
3,899
3,331
136
IC is also more power efficient than VRAM access; this was one big factor in RDNA2's massive perf/W improvement.
Yes, but it's not a magic bullet, for the same reason people are bickering over the 8 GB vs 16 GB card issue at the moment.

More efficient software techniques (virtual texturing etc) and hardware µArch can mitigate this need to a limited extent, but it won't ever go away.

If they ever get around to stacking HBM directly on the GPU (or under) it will also mitigate some of the power efficiency and latency issues from VRAM access.
 

ToTTenTranz

Senior member
Feb 4, 2021
463
853
136
GDDR7 won't eliminate the need for IC, but will likely minimize it.
Truth be told, AMD has already scaled MALL-per-performance down a lot since RDNA2.
N21 had a whopping 128MB of IC, then N31 was ~50% faster with 96MB (albeit with wider IC and VRAM buses), and now N48 is ~50% faster than N21 with only 64MB and the same VRAM bus width (though clocked 25% faster).



The penalty might be a lot smaller than we think as logic shrinks faster than analog.
Cache area has also been practically stagnant across process nodes for a while, and that is only starting to change after 3nm.


FWIW, Nvidia didn't back down on their L2 sizes with Blackwell despite GDDR7 either, and NV is no less margin-oriented than AMD.
Most probably because Nvidia was also planning for Blackwell to clock a lot higher than it did in the end, resulting in overkill effective bandwidth.
It was their Vega/RDNA3 moment.
 
  • Like
Reactions: GodisanAtheist

Tuna-Fish

Golden Member
Mar 4, 2011
1,650
2,481
136
Truth be told, AMD has already scaled MALL-per-performance down a lot since RDNA2.
N21 had a whopping 128MB of IC, then N31 was ~50% faster with 96MB (albeit with wider IC and VRAM buses), and now N48 is ~50% faster than N21 with only 64MB and the same VRAM bus width (though clocked 25% faster).

That's because MALL-per-performance is a nonsense metric that doesn't matter!

You do not need a specific amount of MALL for a specific amount of performance; you need a specific amount of MALL for a given target render resolution. It doesn't matter how complex your scene is or how much time you spend in your shaders: the main bandwidth amplification comes from getting your render targets to fit in cache across frames, so they never need to hit RAM. The only meaningful measure is MALL per resolution. This is why MALL wasn't a thing before; you couldn't provide enough cache for the target resolution until cache got cheap enough.

MALL has shrunk slightly since its introduction because AMD has gotten better at optimizing it, both with better FB compression and by better excluding things from being cached in the MALL.
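
To put numbers on that: the working set you want resident across frames is basically the render targets, and their footprint scales with resolution, not scene complexity. The bytes-per-pixel figure below is an assumed, typical budget for G-buffer + color + depth, not any particular engine:

```python
# Render-target footprint scales with resolution, not scene complexity.
# 16 bytes/pixel is an assumed budget for G-buffer + color + depth targets.
def frame_buffers_mb(width: int, height: int, bytes_per_pixel: int) -> float:
    return width * height * bytes_per_pixel / 2**20

for name, (w, h) in {"1440p": (2560, 1440), "4K": (3840, 2160)}.items():
    print(name, round(frame_buffers_mb(w, h, 16)), "MB")
# 1440p -> ~56 MB, 4K -> ~127 MB: roughly the scale of RDNA2-RDNA4 IF$ sizes
```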
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,049
3,844
136
you can take that logic and shove it in the "8GB VRAM doesn't matter" thread; these products should be built to perform as well as possible, period.
what? this makes no sense no matter what direction I approach it from

edit: you've got 60 points to spend (in effect, cost); what is best?
[Attached image: 1753329896444.png]
 
Last edited: