Discussion RDNA 5 / UDNA (CDNA Next) speculation


MrMPFR

Member
Aug 9, 2025
139
278
96
"On-load" is a fallback mechanism for platforms which can't execute "on-sample" fast enough. But you still use NTC in your game and engine.

On-feedback

On-load is a last resort for platforms that lack sampler feedback: pre-Turing cards, pre-RDNA 2 cards, and the PS5. All newer cards can use the "on-feedback" mode to get "Much lower VRAM usage compared to fully mapped textures". No inference overhead at all.

Here's the GDC 2025 link again:

On feedback is the real fallback option, not on-load, and it'll work on anything from the Switch 2, RTX 2050 mobile, 1650, and RX 6600 to 9070 XT, 5090, and the B580.

[Image from the NTC GitHub page.]

Variables for the on-feedback math
If we take NVIDIA at their word, NTC is ~6.6X smaller than BCn. Figure from ^ and the Compusemble vid, for a multi-layered PBR material.
For SFS, IIRC Microsoft claimed a 2.5X VRAM multiplier; let's use that figure here as well.
For the texture portion of VRAM, let's say 50-70%.

On feedback math:
Let's start at 8 GB
50 to 70% textures gives us 4 to 5.6 GB of texture data

Sampler feedback
2.5X VRAM multiplier = 2.5X less resident BCn texture data.
4 to 5.6 GB / 2.5 = 1.6 to 2.24 GB of BCn textures,
leaving 2.4 to 3.36 GB held as NTC textures.

NTC impact
Applying 6.6X to the NTC-held textures:
2.4 to 3.36 GB / 6.6 = 0.36 to 0.51 GB

Subtracting from NTC total
2.4 GB - 0.36 GB = 2.04 GB saved
3.36 GB - 0.51 GB = 2.85 GB saved

Savings in percent
2.04 GB / 8 GB x 100 = 25.5% saved
2.85 GB / 8 GB x 100 = 35.6% saved

^Result: ~25.5 to 35.6% lower total VRAM usage or effective 1.34 to 1.55X VRAM multiplier.
Now 5 to 6 GB cards function like 8 GB ones, or 8 GB behaves like 10.72 to 12.4 GB.
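A quick sanity check in Python, reproducing the on-feedback numbers above (the 2.5X SFS and ~6.6X NTC figures are vendor claims, not measurements):

```python
# Back-of-the-envelope check of the on-feedback math above.
# Assumptions from the post: 8 GB card, 50-70% of VRAM is textures,
# 2.5X SFS residency multiplier, ~6.6X NTC vs BCn compression.
for tex_frac in (0.50, 0.70):
    total = 8.0                    # GB of VRAM
    tex = total * tex_frac         # texture allocation
    resident_bcn = tex / 2.5       # BCn kept resident via sampler feedback
    ntc_part = tex - resident_bcn  # remainder is held as NTC instead
    saved = ntc_part - ntc_part / 6.6
    print(f"{tex_frac:.0%} textures: {saved:.2f} GB saved "
          f"({saved / total:.1%}), effective {total / (total - saved):.2f}X")
```

This prints 2.04 GB (25.5%, 1.34X) and 2.85 GB (35.6%, 1.55X), matching the figures above.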

On-sample math

For on-sample we can apply the NTC saving to the entire 4 to 5.6 GB BCn texture allocation.
4 GB / 6.6 = 0.61 GB
5.6 GB / 6.6 = 0.85 GB
This is 0.61 to 0.85 GB of textures with NTC.

Subtracting from BCn

4 GB - 0.61 GB = 3.39 GB
5.6 GB - 0.85 GB = 4.75 GB

Savings in percent
3.39 GB / 8 GB x 100 = 42.4% saved
4.75 GB / 8 GB x 100 = 59.4% saved

^Result: ~42.4 to 59.4% lower total VRAM usage or effective 1.74 to 2.46X VRAM multiplier.
Now 3 to 5 GB cards function like 8 GB ones, or 8 GB behaves like 13.92 to 19.68 GB.
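Same sketch for on-sample, where the assumed ~6.6X figure applies to the whole texture pool:

```python
# On-sample: NTC replaces the entire BCn texture allocation, so the
# assumed ~6.6X ratio applies to all 4-5.6 GB of texture data.
for tex_frac in (0.50, 0.70):
    total = 8.0
    tex = total * tex_frac
    saved = tex - tex / 6.6
    print(f"{tex_frac:.0%} textures: {saved:.2f} GB saved "
          f"({saved / total:.1%}), effective {total / (total - saved):.2f}X")
```

This prints 3.39 GB (42.4%, 1.74X) and 4.75 GB (59.4%, 2.46X).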

Implications and conclusion

I know this is just a guesstimate, but it's still fun to speculate, and this was an eye-opener. NTC is going to be a big deal even during the PS5/PS6 crossgen era. Other IHVs will have their own SDKs.

With "on-feedback", gamers can enjoy some of the benefits of "on-sample" without the downsides. Since the PS5 is the baseline during crossgen, I expect future games that leverage NTC to benefit 8GB and 12GB cards greatly.

Inference on-sample is really meant for the post-crossgen era, permitting 8-12GB cards to use very VRAM-demanding settings. On powerful cards it makes little sense, as these usually have plenty of VRAM, and even when they don't (5070), on-feedback should be enough in most cases.
 

Darkmont

Member
Jul 7, 2023
98
301
106
GDDR has to take PHY design liberties that lower throughput-per-pin memories like DDR or LPDDR don't have to. The extremely high clock rates also mean the circuitry driving those signals has to be larger, reducing array efficiency (how much of the die space is taken up by cells vs logic and misc parts).
 

Win2012R2

Golden Member
Dec 5, 2024
1,236
1,276
96
On feedback is the real fallback option, not on-load, and it'll work on anything from the Switch 2, RTX 2050 mobile, 1650, and RX 6600 to 9070 XT, 5090, and the B580.
And that won't lead to an entire new class of stutterfests?

All this neural texture compression crap is a very slippery slope. It's bad enough already with upscaling, which at least can be turned off or kept at the "quality" level on high-end PCs.

Pretty surprised that the PS6 won't have separate banks of DDR5 (say 32 GB - dirt cheap) and GDDR7 (say 24 GB). Obviously devs hate this stuff, but it would be a lot more RAM overall, and plenty of VRAM for when it's needed. I guess it would have made the SoC more custom than Sony would like. Hope they'll at least get 32 GB, but given where RAM (and NAND) prices are going, this might all push the release to 2028.

Reducing the package from let's say 200GB to 50GB would be welcomed by many people.

They'll just add 10x more cosmetics for micro-transactions and it will be back up to 200 GB in no time. Plus, games are pretty expensive these days, so paying $80 and downloading a whopping 200 GB makes it feel reassuringly expensive. If it's 50 GB, then what: small game = rip-off?
 

tsamolotoff

Senior member
May 19, 2019
259
510
136
This is just like the proverbial Ship of Theseus, but the missing parts are replaced not with actual wooden planks, scraps of cloth, etc., but with AI-generated 'something' that quite obviously does not look or feel the same as the part it replaced, and you're getting gaslit that it's not only the same, but better /s
it's bad enough already with upscaling which at least can be turned off
How so? There have been maybe a dozen AAA games released since the RTRT push started that let you disable TAA and use either SMAA or no antialiasing at all (most of those are ports done by Nixxes, BF6, and a few odd UE5 titles like Cronos New Dawn). Everything else uses incoherent, smeary slop that burns my eyes in 20 seconds: reconstructed frames created by slapping together up to 16 frames. I think there should be a new metric like "clarity-adjusted FPS" that accounts for all the time-domain tricks that nVidia and the others substitute for real HW progress. Like this beauty:

 

ToTTenTranz

Senior member
Feb 4, 2021
797
1,292
136
On feedback is the real fallback option, not on-load, and it'll work on anything from the Switch 2, RTX 2050 mobile, 1650, and RX 6600 to 9070 XT, 5090, and the B580.

The Switch 2 doesn't even have enough tensor throughput for running DLSS in a bunch of games, let alone doing neural texture generation to save RAM.
Though it's not like the Switch 2 is missing RAM for its specs, to be honest.



It also doesn't look like NTC is coming to a commercial game in time to be useful for the current crop of >$200 8GB GPUs (and almost all laptop dGPUs).
 

dangerman1337

Senior member
Sep 16, 2010
411
57
91
Pretty surprised that the PS6 won't have separate banks of DDR5 (say 32 GB - dirt cheap) and GDDR7 (say 24 GB). Obviously devs hate this stuff, but it would be a lot more RAM overall, and plenty of VRAM for when it's needed. I guess it would have made the SoC more custom than Sony would like. Hope they'll at least get 32 GB, but given where RAM (and NAND) prices are going, this might all push the release to 2028.
AFAIK, according to MLID, the PS6 Home Console has a 160-bit bus, so it's either going to be 30GB clamshell (which Kepler L2 thinks) or 40GB clamshell. I mean, 40GB is a possibility, since Sony went from 4 to 8GB with the PS4 despite having to convert to a clamshell design. So changing the memory modules from 3GB to 4GB on the currently planned PS6 Home Console is probably quite easy in comparison.
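Where those capacities come from, assuming x32 GDDR7 devices doubled up in clamshell (the 160-bit figure is MLID's leak, not confirmed):

```python
# Clamshell capacity math for the rumored 160-bit PS6 bus (assumption).
# GDDR7 devices are x32; clamshell puts two devices on each 32-bit channel.
bus_width = 160
devices = (bus_width // 32) * 2       # 5 channels x 2 = 10 devices
for gb_per_device in (3, 4):          # 24 Gbit vs 32 Gbit parts
    print(f"{gb_per_device} GB devices -> {devices * gb_per_device} GB total")
```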
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,685
2,572
136
That entirely depends on whether the 32Gbit memory modules are available in sufficient volumes at that time. That's not up to Sony.
 

jpiniero

Lifer
Oct 1, 2010
16,985
7,386
136
AFAIK, according to MLID, the PS6 Home Console has a 160-bit bus, so it's either going to be 30GB clamshell (which Kepler L2 thinks) or 40GB clamshell. I mean, 40GB is a possibility, since Sony went from 4 to 8GB with the PS4 despite having to convert to a clamshell design. So changing the memory modules from 3GB to 4GB on the currently planned PS6 Home Console is probably quite easy in comparison.

Sony released a revised PS4 in mid 2015 which used 1 GB chips instead (and no clamshell). So they had a plan to deal with that. I am not sure that could happen this time.

Using 160-bit would mean that it would only have marginally higher memory bandwidth than the PS5 Pro.
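Putting rough numbers on "marginally higher" (the 32 Gbps GDDR7 speed is my assumption, not a confirmed spec; the PS5 Pro's 256-bit GDDR6 at 18 Gbps is public):

```python
# Peak bandwidth = bus width (bits) x per-pin rate (Gbps) / 8 bits per byte.
def bandwidth_gbs(bus_bits: int, gbps_per_pin: float) -> float:
    return bus_bits * gbps_per_pin / 8

print(bandwidth_gbs(256, 18))  # PS5 Pro: 576.0 GB/s
print(bandwidth_gbs(160, 32))  # rumored PS6: 640.0 GB/s, only ~11% more
```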
 

dangerman1337

Senior member
Sep 16, 2010
411
57
91
Sony released a revised PS4 in mid 2015 which used 1 GB chips instead (and no clamshell). So they had a plan to deal with that. I am not sure that could happen this time.

Using 160-bit would mean that it would only have marginally higher memory bandwidth than the PS5 Pro.
Yeah, true, though that would mean waiting until the early 2030s or so for 6GB modules to become available (4GB GDDR7 modules seem to be on track for 2027 products, going by that leaked RDNA 5 roadmap with the 128GB AT0 SKU). And will that even happen for GDDR7, or will it be a GDDR8-only thing?
 

ToTTenTranz

Senior member
Feb 4, 2021
797
1,292
136
AFAIK, according to MLID, the PS6 Home Console has a 160-bit bus, so it's either going to be 30GB clamshell (which Kepler L2 thinks) or 40GB clamshell. I mean, 40GB is a possibility, since Sony went from 4 to 8GB with the PS4 despite having to convert to a clamshell design. So changing the memory modules from 3GB to 4GB on the currently planned PS6 Home Console is probably quite easy in comparison.

I get that nvidia kind of conditioned everyone to the idea that zero (or even negative) RAM upgrades across generations are acceptable, but I honestly think 30GB is going to sound very short for a home console in 2027.
The novelty in the next generation of consoles is going to be to run AI models for everything. With MoE approaches they can speed up inference a lot, but they must still be able to fit all of them inside the RAM.
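Illustrating why MoE helps compute but not the RAM footprint (all numbers below are made up for the example, not from any real model):

```python
# MoE: only a few experts run per token, but every expert must stay in RAM.
total_params = 30e9    # hypothetical 30B-parameter MoE model
active_params = 3e9    # hypothetical ~10% of experts active per token
bytes_per_param = 0.5  # 4-bit quantized weights
print(f"RAM for weights: {total_params * bytes_per_param / 1e9:.1f} GB")  # 15.0
print(f"Compute per token scales with {active_params / total_params:.0%} of params")
```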
 

jpiniero

Lifer
Oct 1, 2010
16,985
7,386
136
The novelty in the next generation of consoles is going to be to run AI models for everything. With MoE approaches they can speed up inference a lot, but they must still be able to fit all of them inside the RAM.

That's a little, um... optimistic. Seems more like the next gen is going to be mainly... Ray Tracing and Upscaling and Frame Gen.
 

ToTTenTranz

Senior member
Feb 4, 2021
797
1,292
136
That's a little, um... optimistic. Seems more like the next gen is going to be mainly... Ray Tracing and Upscaling and Frame Gen.

Neural Networks are already used for those three. And then there's Neural Textures and Neural Texture Compression.

It's still just 2025. Imagine what it'll be like by 2030, for example. Sony and Microsoft are designing hardware that might be getting games up to 2035 at least, probably until 2040 for EA sports games.
 

dangerman1337

Senior member
Sep 16, 2010
411
57
91
I get that nvidia kind of conditioned everyone to the idea that zero (or even negative) RAM upgrades across generations are acceptable, but I honestly think 30GB is going to sound very short for a home console in 2027.
The novelty in the next generation of consoles is going to be to run AI models for everything. With MoE approaches they can speed up inference a lot, but they must still be able to fit all of them inside the RAM.
I think it'll be decent enough for that generation... but for cross-gen with the PS7* and other systems? Yeah, 30GB will be a problem. Honestly, I hope they go up to 40GB, and likewise with the handheld, because that'll be good enough for a long time.

*Probably a 2035 system; I suspect the PS6 will last a tad longer than the PS5 because it'll be a "baseline" like the PS4 was.
 

Cheesecake16

Member
Aug 5, 2020
34
111
106

MrMPFR

Member
Aug 9, 2025
139
278
96
And that won't lead to an entire new class of stutterfests?

All this neural texture compression crap is a very slippery slope. It's bad enough already with upscaling, which at least can be turned off or kept at the "quality" level on high-end PCs.


They'll just add 10x more cosmetics for micro-transactions and it will be back up to 200 GB in no time. Plus, games are pretty expensive these days, so paying $80 and downloading a whopping 200 GB makes it feel reassuringly expensive. If it's 50 GB, then what: small game = rip-off?
Unlikely, and I would prefer NTC compression over the crappy blocky artifacts BCn produces.

NTC is not DLSS; it's deterministic. Think of it as an ML-based data encoder/decoder.

In a world where Sony reduces the PS5 Slim to 825GB I'll take any GB saving I can get.

The Switch 2 doesn't even have enough tensor throughput for running DLSS in a bunch of games, let alone doing neural texture generation to save RAM.
Though it's not like the Switch 2 is missing RAM for its specs, to be honest.



It also doesn't look like NTC is coming to a commercial game in time to be useful for the current crop of >$200 8GB GPUs (and almost all laptop dGPUs).
On-feedback has no ML inference at sample time. All they have to do is use sampler feedback to selectively transcode only some of the NTC textures. The overhead vs on-load could be smaller due to less NTC -> BCn transcoding.
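Roughly what that loop could look like; a toy sketch with entirely made-up names (nothing here is from the NTC SDK):

```python
# Toy model of on-feedback: sampler feedback reports which tiles were
# actually sampled, and only those get transcoded from NTC to BCn.
TILE_BCN_MB = 0.25  # made-up size of one transcoded BCn tile

def frame(sampled_tiles: set, resident: set, vram_mb: float) -> float:
    for tile in sampled_tiles - resident:
        resident.add(tile)          # transcode NTC latents -> BCn, once
        vram_mb += TILE_BCN_MB
    for tile in resident - sampled_tiles:
        resident.discard(tile)      # feedback says stale: evict, reclaim VRAM
        vram_mb -= TILE_BCN_MB
    return vram_mb

resident = set()
vram = frame({1, 2, 3}, resident, 0.0)   # first view touches tiles 1-3
vram = frame({2, 3, 4}, resident, vram)  # next frame: tile 1 out, tile 4 in
print(sorted(resident), vram)            # [2, 3, 4] 0.75
```

A real streamer would keep an LRU window instead of evicting the moment a tile isn't sampled, but the point stands: only sampled tiles ever pay the transcode cost.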

Yeah this tech is prob post Rubin launch.


Edit: As previously mentioned, here are the answers from the NTC dev:

Q1: The NTC v.0.8.0 optimizations from two latent arrays to a single BGRA4 texture array are not leveraged in this demo sample, right?
A1: The video doesn't explicitly state which version was used, but I assume it was the latest - 0.8.0. In that case, yes, the BGRA4 latent textures are used here.
- NOTE: v.0.8.0 was confirmed by Compusemble

Q2: Do you have any rough numbers to how big the performance gains are for NTC on load and NTC on sample vs v.0.7.2?
A2: The perf gains are 10-20% for On Load, 20-50%+ for On Sample, depending on the GPU

Q3: I've seen people claim NTC on load has a higher VRAM footprint than BCn. I assume this isn't true but can you please confirm?
A3: NTC on load temporarily requires more memory: while the transcoding is done, we need to store the NTC data, the raw color data, and the BC versions of textures. After that, everything except BC can be freed or reused.

Q4: Do you use the latest version of RTXTF that includes CTF in the demo in the video? The technique is from the Collaborative Texture Filtering 2025 paper by Wronsky et al.
A4: No, CTF is not integrated here yet.

Will be interested in seeing what CTF can do about the ugly filtering, and in an actual NTC game implementation, but that's prob not happening anytime soon.
 