Discussion RDNA 5 / UDNA (CDNA Next) speculation


poke01

Diamond Member
Mar 8, 2022
3,781
5,136
106
They need to double it again at the per-CU level, so even more with more CUs. Nvidia got sloppy with Blackwell, but that is unlikely to continue. AMD must catch up properly next time, and ideally be better.
Nvidia always strikes back hard and unexpectedly
 

GodisanAtheist

Diamond Member
Nov 16, 2006
8,125
9,385
136
If they *always* strike back hard, and you expect it, how is it unexpected?
🤔
Really, what was unexpected is the nearly complete lack of perf-per-watt improvement in Blackwell.

-They always strike back but their method of striking back is unexpected, was my takeaway.

Compete on performance and price across the stack?

Maybe they'll swallow a lot of margin to compete on price.

Or maybe NV will compete by leveraging their market dominance and strong-arming AIBs into spinning off crappy b-tier product lines for you.

Or maybe they'll strong-arm devs into "hidden tessellation" or "extreme RT" or whatever software feature gives them an edge.

Or maybe it will be a marketing subterfuge campaign where suddenly a bunch of old accounts with 1 post each wake up and begin complaining about crappy drivers.

Or...
 

Makaveli

Diamond Member
Feb 8, 2002
4,966
1,561
136
In raster, still hopefully RDNA5 makes everything a moot point.
TPU has it 31% ahead which is why I added a bit of safety margin.
My percentage was also based on TPU data, but from a while back. Now that I double-check it, I see the 31% there, so the gain looks to have increased over the last year and a bit.
 

Jan Olšan

Senior member
Jan 12, 2017
542
1,077
136
-They always strike back but their method of striking back is unexpected, was my takeaway.

Compete on performance and price across the stack?

Maybe they'll swallow a lot of margin to compete on price.

Or maybe NV will compete by leveraging their market dominance and strong-arming AIBs into spinning off crappy b-tier product lines for you.

Or maybe they'll strong-arm devs into "hidden tessellation" or "extreme RT" or whatever software feature gives them an edge.

Or maybe it will be a marketing subterfuge campaign where suddenly a bunch of old accounts with 1 post each wake up and begin complaining about crappy drivers.

Or...
Or run guerrilla marketing that floods Reddit with the notion that an AMD card has to be unrealistically cheaper to even be considered, and because it isn't... recite a "the more you buy the more you save" prayer and now their more expensive card is actually cheaper or something.

Either Nvidia spends a lot of effort to spam that narrative, or they succeeded and the customers actually believe and parrot it for them, because I see this BS all the time.
 

soresu

Diamond Member
Dec 19, 2014
3,899
3,331
136
Gaming brought in less than Nvidia's R&D costs
Realistically though that doesn't really matter to them.

AI/ML is making them so much money at the moment that the gaming market could crash and they would probably be happier to just dedicate that wafer capacity to pro/dc instead.
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
31,705
31,602
146
Realistically though that doesn't really matter to them.

AI/ML is making them so much money at the moment that the gaming market could crash and they would probably be happier to just dedicate that wafer capacity to pro/dc instead.
LOL, yeah man, that was my point.
 

reaperrr3

Member
May 31, 2024
107
328
96
Let's see, N48 has roughly 25% more gaming perf/FLOP than N31.
RDNA5 can be expected to increase that more, let's say 35%.
I wouldn't be so optimistic about (raster) perf/FLOP, for 2 reasons:

1) RDNA4 was such a huge uplift and fixed so many weaknesses of RDNA3, that RDNA5 may still hit a lower real-world improvement per FLOP in raster, despite bigger changes on paper.

2) If some of the "IPC" improvement comes from considerable VOPD/dual-issue improvements, AMD will likely advertise dual-issue FLOPs again, and perf/FLOP will technically go down ;)
(perf/WGP @ same clock would still be up considerably, of course)

An N31 config with >3GHz clocks, GDDR7 and higher IPC than RDNA4 sounds like a beast that could nip at a GB202's heels, to be honest.
Yeah.

Assuming the 64CU N5x will be at least N48 x1.3 in raster, a 96CU N5x would need to reach at least N48 x1.7-x1.8 to make sense, which would put it virtually on par with the 5090 in most games.
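The scaling arithmetic behind that estimate can be sketched roughly as follows. This is only a back-of-the-envelope check, not a prediction: the function names are made up for illustration, and the "ideal" case assumes raster performance scales linearly with CU count, which real GPUs never quite manage.

```python
def ideal_large_perf(small_cus: int, large_cus: int, small_perf: float) -> float:
    """Performance of the larger die if it scaled perfectly linearly with CU count.

    small_perf is expressed relative to N48 (N48 = 1.0).
    """
    return small_perf * large_cus / small_cus


def scaling_efficiency(target_perf: float, ideal_perf: float) -> float:
    """Fraction of ideal linear scaling that a given target performance implies."""
    return target_perf / ideal_perf


# Figures from the post: 64 CU part at N48 x1.3, 96 CU part at N48 x1.7 to x1.8.
ideal = ideal_large_perf(small_cus=64, large_cus=96, small_perf=1.3)
low = scaling_efficiency(1.7, ideal)
high = scaling_efficiency(1.8, ideal)

print(f"ideal 96 CU result: N48 x{ideal:.2f}")
print(f"implied scaling efficiency: {low:.0%} to {high:.0%}")
```

Under those assumptions, perfect scaling would land at N48 x1.95, so x1.7 to x1.8 corresponds to roughly 87% to 92% of ideal scaling when going from 64 to 96 CUs.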
 

basix

Member
Oct 4, 2024
155
308
96
AMD is already advertising dual-issue FLOPS. They just do not do that at the stream processor count level.

N31 has 6144 dual-issue stream processors but 12'288 FP32 units. The peak FLOPS throughput is based on the FP32 unit count.
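To put numbers on that, here is a minimal sketch of the peak-FLOPS arithmetic. The 2.5 GHz clock is an assumption (roughly the advertised boost clock of the top N31 card), the function name is invented for the example, and one FMA is counted as two floating-point operations, which is the usual convention for peak figures.

```python
def peak_fp32_tflops(fp32_lanes: int, clock_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPS, counting one FMA as two operations."""
    return fp32_lanes * 2 * clock_ghz / 1000.0


N31_STREAM_PROCESSORS = 6144   # advertised "core count" (from the post)
N31_FP32_UNITS = 12288         # FP32 units when dual-issue is counted (from the post)
ASSUMED_CLOCK_GHZ = 2.5        # assumption: approximate boost clock, not from the post

by_sp_count = peak_fp32_tflops(N31_STREAM_PROCESSORS, ASSUMED_CLOCK_GHZ)
by_fp32_units = peak_fp32_tflops(N31_FP32_UNITS, ASSUMED_CLOCK_GHZ)

print(f"counted by stream processors: {by_sp_count:.2f} TFLOPS")
print(f"counted by FP32 units:        {by_fp32_units:.2f} TFLOPS")
```

With that assumed clock, counting by stream processors gives about 30.7 TFLOPS, while counting by FP32 units gives about 61.4 TFLOPS, which is the kind of figure that ends up on the spec sheet.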
 

Kepler_L2

Senior member
Sep 6, 2020
899
3,683
136
AMD is already advertising dual-issue FLOPS. They just do not do that at the stream processor count level.

N31 has 6144 dual-issue stream processors but 12'288 FP32 units. The peak FLOPS throughput is based on the FP32 unit count.
Yeah, I do expect them to change the advertised core count with RDNA5, since VOPD will go from best-case scenario to average-case-unless-something-weird-happened scenario.
 

basix

Member
Oct 4, 2024
155
308
96
It will be far from 2x. But if we see something similar to Turing to Ampere, I would be impressed. Basically 1.3x or so performance per FLOP with relatively little additional HW. The FP units are already there.

And who knows what dynamic / OoO execution and updated caching systems will bring to the table in addition to VOPD.
 

dangerman1337

Senior member
Sep 16, 2010
348
11
81
From the use of a 384-bit bus in Magnus and the 96 CU UDNA die, I suspect that AMD may be gutting Infinity Cache altogether from their cards and just relying on a lot of bandwidth.
 

Mopetar

Diamond Member
Jan 31, 2011
8,438
7,634
136
They need to double it again at the per-CU level, so even more with more CUs. Nvidia got sloppy with Blackwell, but that is unlikely to continue. AMD must catch up properly next time, and ideally be better.

Nvidia is too busy making truckloads of money and being the most valuable company of all time to care about consumer GPUs. The AI market doesn't seem to be dying down any so I don't see why Nvidia would devote time and effort to such a tiny part of their bottom line.

If the AI market dies they can change their tune then and everyone will gladly welcome them back. Until then they can just sell cutdown versions of massive dies made for different markets. Anyone talented enough to work on a killer gaming GPU can make an even more profitable GPU for other markets.
 

dangerman1337

Senior member
Sep 16, 2010
348
11
81
Nvidia is too busy making truckloads of money and being the most valuable company of all time to care about consumer GPUs. The AI market doesn't seem to be dying down any so I don't see why Nvidia would devote time and effort to such a tiny part of their bottom line.

If the AI market dies they can change their tune then and everyone will gladly welcome them back. Until then they can just sell cutdown versions of massive dies made for different markets. Anyone talented enough to work on a killer gaming GPU can make an even more profitable GPU for other markets.
Besides, Nvidia is aiming for robotics, with AI/ML being the stepping stone towards that. Jensen is playing the long game.
 

ToTTenTranz

Senior member
Feb 4, 2021
463
853
136
Too bad real-world scenarios likely won’t be able to net you 2x performance even with 2x effective compute.
No one suggested 2x performance, nor anything even remotely close to that, and putting those words into others' mouths doesn't seem very honest IMO.


Even if "real performance" per-CU and per-clock increases only 15% it's already a big difference if coming from a single additional feature that usually takes a rather small increase in transistor count (less than simply adding 15% more execution units and caches).
 

maddie

Diamond Member
Jul 18, 2010
5,150
5,529
136
From the use of a 384-bit bus in Magnus and the 96 CU UDNA die, I suspect that AMD may be gutting Infinity Cache altogether from their cards and just relying on a lot of bandwidth.
Nah. The bus/CU ratio is the same as RDNA4. They will probably need even better caching schemes to handle the expected improvement in instruction throughput.
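The ratio is quick to check. A rough sketch, using N48's known 256-bit bus and 64 CUs for the RDNA4 side (those two figures are not in the post above) and the rumored 384-bit / 96 CU configuration for the other; the helper name is just for illustration:

```python
def bus_bits_per_cu(bus_width_bits: int, compute_units: int) -> float:
    """Memory bus width available per compute unit, in bits."""
    return bus_width_bits / compute_units


# N48 (RDNA4): 256-bit bus, 64 CUs. Rumored next-gen die: 384-bit bus, 96 CUs.
n48_ratio = bus_bits_per_cu(256, 64)
rumored_ratio = bus_bits_per_cu(384, 96)

print(f"N48:     {n48_ratio:.1f} bits of bus per CU")
print(f"rumored: {rumored_ratio:.1f} bits of bus per CU")
```

Both come out to 4 bits of bus width per CU. This only compares bus width, though: actual bandwidth per CU also depends on memory speed, so a move to faster memory would raise bandwidth per CU even at the same ratio.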