Speculation: RDNA2 + CDNA Architectures thread

Page 196

uzzi38

Platinum Member
Oct 16, 2019
2,634
5,961
146
All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have backed up. Even still, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
DLSS is a tensor-infused hack and in no sense the way forward. Lowering image quality can never be the goal. But you were skimming the surface of something launching with DX 12.1: VRS, where portions of the screen get shaded far less. You can pinpoint places on screen that don't need per-pixel shading and apply (or not apply) the same shading to a block of four pixels instead of each one individually.
Upscaling is dead from the start though. I hope Nvidia quits this nonsense sooner rather than later.
You know that all rendering is "a hack". It's all built around shortcuts and approximations. The whole aim is to get the best-looking result for a given silicon/power/money budget. AI upscaling is the latest and greatest "hack" for doing that, and as we have seen in games like Control, Wolfenstein and Minecraft, it can work really well.
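To make the coarse-shading idea from the quote a bit more concrete, here is a toy sketch of 2x2 coarse shading in plain numpy. It is purely conceptual (no real graphics API involved), and the shade() function is just a made-up stand-in for an expensive per-pixel shader:

Code:
import numpy as np

def shade(u, v):
    # Made-up stand-in for an expensive per-pixel shading function.
    return 0.5 + 0.5 * np.sin(10 * u) * np.cos(10 * v)

h, w = 8, 8
yy, xx = np.mgrid[0:h, 0:w]
full = shade(xx / w, yy / h)              # full rate: shade every pixel

# 2x2 coarse rate: take the result from the top-left pixel of each 2x2 block
# and reuse it for all four pixels, i.e. roughly a quarter of the shading work.
coarse = np.repeat(np.repeat(full[::2, ::2], 2, axis=0), 2, axis=1)

print("max difference vs full-rate shading:", np.abs(full - coarse).max())

The win is that the expensive function runs once per block instead of once per pixel; the cost is whatever detail gets lost inside each block, which is why you'd only do it where per-pixel shading isn't needed.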
 

soresu

Platinum Member
Dec 19, 2014
2,662
1,862
136
I really hope so. Mindshare for CUDA did not start with some kiddo running a $4000 Quadro or Tesla; it started on some lowly Kepler or Fermi card.
Much further back than that.

CUDA began before even Fermi - I remember having support for it on my 9600 GT, the last nVidia card I ever bought in 2008.

Although it probably didn't make any significant headway until years later.
 
  • Like
Reactions: Tlh97 and Saylick

soresu

Platinum Member
Dec 19, 2014
2,662
1,862
136
So, quite different, and optimized for DXR 1.1, like many people thought.
Makes me think that DXR 1.1 was the RT target that MS and AMD built towards for XSX from the get go.

DXR 1.0, meanwhile, was probably designed to fully support nVidia's Turing RT at the time nVidia was pushing it, while also remaining compatible with RDNA2 when it landed later.

That way MS would know that any software made to push Turing for nVidia would probably run on XSX without massive rewrites of the RT render backends.

Speaking of that - when Epic gave the semi-deep dive on UE5's Lumen and Nanite, they said that Lumen currently lacked any fixed-function hardware acceleration for the tracing techniques it uses.

I wonder if they have made any headway with that, and what, if anything, it will do for resolutions when running UE5 and Lumen, considering they said the demo was only running at 1440p on the PS5 (still great, I might add).
 

Saylick

Diamond Member
Sep 10, 2012
3,162
6,388
136
Much further back than that.

CUDA began before even Fermi - I remember having support for it on my 9600 GT, the last nVidia card I ever bought in 2008.

Although it probably didn't make any significant headway until years later.
Yep, CUDA started with G80, or the OG Tesla card, when Nvidia made the transition to a unified shader model, and they realized, "Oh, we have a wide array of ALUs that can be repurposed to do general purpose compute tasks. What if we created a programming language to take advantage of that?"
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
I suppose it's necessary to distinguish between DLSS as a general idea and DLSS the Nvidia-specific implementation.

You probably can implement it without specialized hardware, just like you could implement ray tracing without any special hardware or even generate computer graphics without using an actual GPU. It does raise a question of how much performance suffers without hardware capable of efficiently performing some computations.

The crux of the matter is whether or not the hardware that is needed to make such a solution efficient enough to use can also be used for other purposes or whether it's so fixed function that it has no other applications. If it's the latter there's a further question as to whether it's worth the die space.

Well, unless GPUs get driven out of deep learning by truly specialised chips, there'll be massive motivation to keep accelerators for it in GPUs for that market. That's regardless of gaming, of course.

The question of whether it's worth the die space needs careful phrasing. Isn't the performance increase from DLSS, when you can use it, often really very large?
(Variable, but somewhere around 50%?!)

So it's actually a question of how universally the technique could be made to apply, and how well the image quality holds up across a range of different game types, engines, etc.
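As a rough sanity check on where figures like that come from, here is some back-of-the-envelope pixel-count arithmetic. The assumption (mine, not a measurement) is that frame cost scales roughly with the number of pixels shaded, which ignores fixed per-frame work and the cost of the upscaling pass itself:

Code:
native   = 3840 * 2160   # 4K output resolution
internal = 2560 * 1440   # a typical "quality"-style internal render resolution

print(f"pixels shaded: {internal / native:.0%} of native")   # ~44%
print(f"upper-bound speedup: {native / internal:.2f}x")      # ~2.25x
# Real gains land well below that ceiling because of fixed costs, which is
# roughly where the "around 50%" numbers come from.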
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
Who says that DLSS can't be implemented under DirectML without needing specialised hardware? Anyway, AI accelerators are something that should be used for more than DLSS in the future.



First, I don't see anything wrong with having specialised hardware for AI acceleration; going forward, AI tech will be used in a lot more programs.

Second, a DLSS alternative can be implemented under DirectML.
DLSS under DirectML is in principle DLSS 1.0. We are never going to see
Well, unless GPUs get driven out of deep learning by truly specialised chips, there'll be massive motivation to keep accelerators for it in GPUs for that market. That's regardless of gaming, of course.

The question of whether it's worth the die space needs careful phrasing. Isn't the performance increase from DLSS, when you can use it, often really very large?
(Variable, but somewhere around 50%?!)

So it's actually a question of how universally the technique could be made to apply, and how well the image quality holds up across a range of different game types, engines, etc.
The performance increase in FPS is high, but what is gained in performance gets lost in quality. It's a slippery slope, taking the DLSS route forward...
 

biostud

Lifer
Feb 27, 2003
18,251
4,764
136
The performance increase in FPS is high, but what is gained in performance gets lost in quality. It's a slippery slope, taking the DLSS route forward...

You are free to use it or not. You can run high res + RT, no DLSS and get low framerates, run high res + no RT and get high framerates, run low res + RT and get high framerates or run high res + RT + DLSS and get high frame rates. It is all up to you.
 
  • Like
Reactions: SMU_Pony and prtskg

Shivansps

Diamond Member
Sep 11, 2013
3,855
1,518
136
Just curious, since you are quite vocal about AMD putting too few CUs in its APUs: would you prefer AMD putting logic into its APUs that can accelerate something DLSS-like, instead of increasing the CU count?

That depends on quality vs performance vs adoption rate. A DLSS-like implementation can make 720p upscaled to 1080p look as good as or better than the same game rendered at 900p without upscaling, with higher fps, and it avoids having to use lower-than-native screen resolutions. That's great for an APU.

Bear with me on this: what is always important is to use the same Windows screen resolution as your game resolution, because that allows you to play borderless. If the game does not support a render resolution different from the output resolution, you are forced into fullscreen whenever the iGPU (or GPU) can't provide enough FPS.

So what happens right now with an APU? They run games at 720p, 900p and sometimes 1080p, but most games run at 900p or 720p, as 1080p has a large fps impact on APUs. Yet hardly anyone has 720p or 900p screens anymore (they are still around in my country, but you know what I mean). So with a 1080p monitor plus an APU you are forced into fullscreen in most games, because dropping the Windows resolution below 1080p is a no-go. AMD has a hardware upscaler that upscales 720p/900p fullscreen games to your Windows resolution; not everyone knows it's there and it is disabled by default, but it works well and solves most of the issues of running below native resolution on some monitors, including blurry images and weird noise.

Personally I hate using fullscreen, but that's just my opinion. So what could a DLSS-like solution do for an APU? Well, if Picasso had it, you could play games with a screen resolution of 1080p (or 1440p, since I like a higher resolution when I also work on the PC), in borderless mode, with a render resolution of 720p, with the visual quality of 900p (or higher) and higher fps than at 900p. Those are the pros.

The cons would be: adoption rate (if it is like DLSS, where maybe 10-15% of new games use it, it's worthless), how heavy it is on memory bandwidth versus just adding more raster performance, die size (how big would this block be?), and finally quality: it needs to produce an image that looks better than one rendered at a higher resolution, with less fps impact.

So I would need to know all that to be sure whether it's better to have that or just more raw power. At any rate, software upscalers are used in a lot of games these days and they help a lot on low-end hardware like APUs, or when you are trying to run a game at a higher resolution than you really should in order to keep it borderless. DLSS works just like that, only it produces far better results. So I really don't understand the hate just because Nvidia is pushing its use in RT games.
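For what it's worth, the plain (non-AI) upscaling step I keep describing is conceptually just a stretch from the render resolution to the desktop resolution. Here is a toy nearest-neighbour version in numpy, purely illustrative; the hardware scaler and a DLSS-like approach obviously do something far more sophisticated than this:

Code:
import numpy as np

src_h, src_w = 720, 1280     # internal render resolution
dst_h, dst_w = 1080, 1920    # monitor / Windows desktop resolution

frame = np.random.rand(src_h, src_w, 3)   # stand-in for a rendered 720p frame

# Map each output pixel back to its nearest source pixel and copy it.
ys = np.arange(dst_h) * src_h // dst_h
xs = np.arange(dst_w) * src_w // dst_w
upscaled = frame[ys][:, xs]

print(upscaled.shape)   # (1080, 1920, 3)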
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
You are free to use it or not. You can run high res + RT, no DLSS and get low framerates, run high res + no RT and get high framerates, run low res + RT and get high framerates or run high res + RT + DLSS and get high frame rates. It is all up to you.
If they had just scrapped the die space for tensors and doubled the number of RT cores, we wouldn't have this compromise in the first place. We would be in a situation with high fps AND high image quality. Then add tensor cores later, when it makes sense to put the AI features to some genuinely useful purpose.

There is an old saying that describes this very well: "going across the brook to fetch water" - i.e. taking a needlessly roundabout route to something you already have...
 

Mopetar

Diamond Member
Jan 31, 2011
7,837
5,992
136
Well, unless GPUs get driven out of deep learning by truly specialised chips, there'll be massive motivation to keep accelerators for it in GPUs for that market. That's regardless of gaming, of course.

To some degree they already have been. The ML work is being done by things like tensor cores that are specialized for that kind of work. You could do it with shaders, but they're just not built for it.

It's just a happy accident that the companies that make that kind of hardware were in the GPU business and realized that the general-purpose but massively parallel hardware they previously had for such tasks could easily be beaten by specialized hardware.

The ultimate dilemma is that the workloads are vastly different, and if you could buy a piece of silicon that's essentially all tensor cores you'd do that, because the shaders are just wasted space if all you want is a lot of INT8 performance.

As the market grows we'll eventually see the hardware lines move apart. It may not make sense to have a separate product line now, but eventually it will. Maybe GPUs will keep a small amount of that hardware around, but no more than is useful for the main purpose of graphics.
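For anyone wondering what that workload actually looks like, here is a toy numpy stand-in for the kind of low-precision matrix multiply (INT8 inputs, wider INT32 accumulation) that tensor-core-style units are built to churn through. Real hardware does this in fixed-function blocks; this is just the arithmetic:

Code:
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(-128, 127, size=(64, 64), dtype=np.int8)
b = rng.integers(-128, 127, size=(64, 64), dtype=np.int8)

# INT8 inputs, accumulated in INT32 so the dot products don't overflow.
c = a.astype(np.int32) @ b.astype(np.int32)
print(c.shape, c.dtype)   # (64, 64) int32

A general shader core can run this too, but a unit wired specifically for dense low-precision multiply-accumulate does it with far less silicon and power per operation, which is the whole argument for dedicating die area to it.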
 

soresu

Platinum Member
Dec 19, 2014
2,662
1,862
136
DLSS under DirectML is in principle DLSS 1.0. We are never going to see
You are confusing the software/NN itself with the hardware it is running on.

I imagine that nVidia have probably wired DirectML straight into their tensor cores in the driver, rather than having the DML software run on the general compute shaders in their uArch.

At least for Volta and newer, anyway - otherwise those tensor cores would just sit idle when DML is running, giving credence to the gamers who complain about them taking up space on the die.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
You are confusing the software/NN itself with the hardware it is running on.

I imagine that nVidia have probably wired DirectML straight into their tensor cores in the driver, rather than having the DML software run on the general compute shaders in their uArch.

At least for Volta and newer, anyway - otherwise those tensor cores would just sit idle when DML is running, giving credence to the gamers who complain about them taking up space on the die.
Oh, but they do. They just sit there and take up space for no use. Just throw them out already. :D
I'm not keen on paying $200 extra for a card just because it has tensor cores, trying to mitigate the low RT performance at the expense of worse IQ, which is a direct result of having the tensor cores there in the first place. Tensor cores and DLSS are just a bad circle jerk.
 

moinmoin

Diamond Member
Jun 1, 2017
4,952
7,663
136
Smart Access Memory isn't new or exclusive to Zen3+500 chipsets... And it's an open spec.

Source:
Did you guys follow the whole SAM saga since?

Nvidia picks up the tech:

Ryzen team to work to enable SAM for Nvidia GPUs, and RTG for Intel CPUs:
https://youtu.be/uaxnvRUeqkg?t=2085

400 series boards are getting SAM support as well; so far ASRock and MSI have released the respective BIOS updates:

That's a pretty speedy spread for a tech originally announced as exclusive to RDNA2/Zen 3/500 series boards less than a month ago.

Still limited to RDNA2 and Zen 3 so far, but apparently that's only down to validation?
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Did you guys follow the whole SAM saga since?

Nvidia picks up the tech:

Ryzen team to work to enable SAM for Nvidia GPUs, and RTG for Intel CPUs:

400 series boards are getting SAM support as well; so far ASRock and MSI have released the respective BIOS updates:

That's a pretty speedy spread for a tech originally announced as exclusive to RDNA2/Zen 3/500 series boards less than a month ago.

Still limited to RDNA2 and Zen 3 so far, but apparently that's only down to validation?

Yeah, it will be available on more platforms as software support becomes available for them. But it was a good talking point for AMD to announce it first on their latest and greatest.
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
At least on paper

Gosh, I just checked a local shop: no model has any availability date, and the few that have a price are priced like there's no tomorrow.

The RX 6800, with a currency-adjusted MSRP of $579, is listed around $800
The RX 6800 XT, with a currency-adjusted MSRP of $649, is listed around $980 (some models over $1,000)

I'm aware that with sales tax etc. prices are a bit higher than MSRP, but for the Ryzen 3900X I got last year that premium was about $50, most of it due to lack of supply - not $250...

I.e. retailers are gouging as well.

EDIT: And with the RTX 3080/3070 the difference from MSRP is smaller, making the 3080 actually cheaper than a 6800 XT here...
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
To some degree they already have been. The ML work is being done by things like tensor cores that are specialized for that kind of work. You could do it with shaders, but they're just not built for it.

It's just a happy accident that the companies that make that kind of hardware were in the GPU business and realized that the general-purpose but massively parallel hardware they previously had for such tasks could easily be beaten by specialized hardware.

The ultimate dilemma is that the workloads are vastly different, and if you could buy a piece of silicon that's essentially all tensor cores you'd do that, because the shaders are just wasted space if all you want is a lot of INT8 performance.

I guess it must be a little more complex than this, otherwise the specialised companies would do just that and wipe NV's performance out :)

As the market grows we'll eventually see the hardware lines move apart. It may not make sense to have a separate product line now, but eventually it will. Maybe GPUs will keep a small amount of that hardware around, but no more than is useful for the main purpose of graphics.

Quite possible, of course. On a related, if formally rather off-topic, note: has anyone seen any neural-net inference benchmarks for the NPU in the M1 (or A14) vs various NV GPUs? You wouldn't use it for training, I guess.
 

Mopetar

Diamond Member
Jan 31, 2011
7,837
5,992
136
I guess it must be a little more complex than this, otherwise the specialised companies would do just that and wipe NV's performance out :)

Unless there's a strong financial incentive for them to invest in said market, they might have better things to do with their time. Notice that GPUs were essentially replaced in bitcoin mining. Eventually the market became large enough that someone could justify the cost of developing, producing, marketing, etc. specialized hardware to do a job that GPUs were tasked with doing.

It also isn't as though Nvidia would get removed from the market. They would be able to sell cards that perform those same specific functions without the extra GPU bits bolted on that aren't required. The whole point is that anyone who wants a lot of ML capability buys an ML card that's loaded with silicon dedicated to that task. There will eventually come a time when the market demand for this kind of hardware grows large enough to justify such cards.
 
Apr 30, 2020
68
170
76
Yep, CUDA started with G80, or the OG Tesla card, when Nvidia made the transition to a unified shader model, and they realized, "Oh, we have a wide array of ALUs that can be repurposed to do general purpose compute tasks. What if we created a programming language to take advantage of that?"
AMD actually had a "GPGPU" solution implemented on their R580 cards well before CUDA. There were people using Radeon X1900s to do Folding@Home, and for a while AMD was the "GPGPU" darling. But they somehow squandered that lead.
 
  • Like
Reactions: Saylick