Speculation: RDNA2 + CDNA Architectures thread

Page 196

uzzi38

Platinum Member
Oct 16, 2019
2,634
5,961
146
All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have backed up. Even still, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
DLSS is a tensor-infused hack and in no sense the way forward. Lowering image quality can never be the goal. But you were skimming the surface of something launching with DX 12.1: VRS, where portions of the screen get shaded far less. You can pinpoint places on screen that don't need per-pixel shading and apply (or not apply) the same shading to a block of four pixels instead of each one individually.
Upscaling is dead from the start though. I hope Nvidia quits this nonsense sooner rather than later.
You know that all rendering is "a hack". It's all built around shortcuts and approximations. The whole aim is to get the best-looking result for a given silicon/power/money budget. AI upscaling is the latest and greatest "hack" for doing that, and as we have seen in games like Control, Wolfenstein and Minecraft, it can work really well.
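To make the coarse-shading idea from the quote a bit more concrete, here is a toy sketch of 2x2 coarse shading in plain numpy. It is purely conceptual (no real graphics API involved), and the shade() function is just a made-up stand-in for an expensive per-pixel shader:

Code:
import numpy as np

def shade(u, v):
    # Made-up stand-in for an expensive per-pixel shading function.
    return 0.5 + 0.5 * np.sin(10 * u) * np.cos(10 * v)

h, w = 8, 8
yy, xx = np.mgrid[0:h, 0:w]
full = shade(xx / w, yy / h)              # full rate: shade every pixel

# 2x2 coarse rate: take the result from the top-left pixel of each 2x2 block
# and reuse it for all four pixels, i.e. roughly a quarter of the shading work.
coarse = np.repeat(np.repeat(full[::2, ::2], 2, axis=0), 2, axis=1)

print("max difference vs full-rate shading:", np.abs(full - coarse).max())

The win is that the expensive function runs once per block instead of once per pixel; the cost is whatever detail gets lost inside each block, which is why you'd only do it where per-pixel shading isn't needed.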
 

soresu

Platinum Member
Dec 19, 2014
2,662
1,862
136
I really hope so. Mindshare for CUDA did not start with some kiddo running a $4000 Quadro or Tesla; it started on some lowly Kepler or Fermi card.
Much further back than that.

CUDA began before even Fermi - I remember having support for it on my 9600 GT, the last nVidia card I ever bought in 2008.

Although it probably didn't make any significant headway until years later.
 
  • Like
Reactions: Tlh97 and Saylick

soresu

Platinum Member
Dec 19, 2014
2,662
1,862
136
So, quite different, and optimized for DXR 1.1, like many people thought.
Makes me think that DXR 1.1 was the RT target that MS and AMD built towards for XSX from the get go.

DXR 1.0, meanwhile, was probably designed to fully support nVidia's Turing RT at the time nVidia was pushing it, while also remaining compatible with RDNA2 when it landed later.

That way MS would know that any software made to push Turing for nVidia would probably run on XSX without massive rewrites of the RT render backends.

Speaking of that - when Epic gave the semi-deep dive on UE5's Lumen and Nanite, they said that Lumen currently lacked any fixed-function hardware acceleration for the tracing techniques it uses.

I wonder if they have made any headway with that, and what, if anything, it will do for resolutions when running UE5 and Lumen, considering they said the demo was only running at 1440p on the PS5 (still great, I might add).
 

Saylick

Diamond Member
Sep 10, 2012
3,162
6,388
136
Much further back than that.

CUDA began before even Fermi - I remember having support for it on my 9600 GT, the last nVidia card I ever bought in 2008.

Although it probably didn't make any significant headway until years later.
Yep, CUDA started with G80, or the OG Tesla card, when Nvidia made the transition to a unified shader model, and they realized, "Oh, we have a wide array of ALUs that can be repurposed to do general purpose compute tasks. What if we created a programming language to take advantage of that?"
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
I suppose it's necessary to distinguish between DLSS as a general idea and DLSS the Nvidia-specific implementation.

You probably can implement it without specialized hardware, just like you could implement ray tracing without any special hardware or even generate computer graphics without using an actual GPU. It does raise a question of how much performance suffers without hardware capable of efficiently performing some computations.

The crux of the matter is whether or not the hardware that is needed to make such a solution efficient enough to use can also be used for other purposes or whether it's so fixed function that it has no other applications. If it's the latter there's a further question as to whether it's worth the die space.

Well, unless GPUs get driven out of deep learning by truly specialised chips, there'll be massive motivation to keep accelerators for it in GPUs for that market. That's regardless of gaming, of course.

The question of whether it's worth the die space needs careful phrasing. Isn't the performance increase from DLSS, when you can use it, often really very large?
(Variable, but somewhere around 50%?!)

So it's actually a question of how universally the technique could be made to apply, and how well the image quality holds up across a range of different game types, engines, etc.
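As a rough sanity check on where figures like that come from, here is some back-of-the-envelope pixel-count arithmetic. The assumption (mine, not a measurement) is that frame cost scales roughly with the number of pixels shaded, which ignores fixed per-frame work and the cost of the upscaling pass itself:

Code:
native   = 3840 * 2160   # 4K output resolution
internal = 2560 * 1440   # a typical "quality"-style internal render resolution

print(f"pixels shaded: {internal / native:.0%} of native")   # ~44%
print(f"upper-bound speedup: {native / internal:.2f}x")      # ~2.25x
# Real gains land well below that ceiling because of fixed costs, which is
# roughly where the "around 50%" numbers come from.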
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
Who says that DLSS can't be implemented under DirectML without needing specialised hardware? Anyway, AI accelerators are something that should be used for more than DLSS in the future.



First, I don't see anything wrong with having specialised hardware for AI acceleration; going forward, AI tech will be used in a lot more programs.

Second, a DLSS alternative can be implemented under DirectML.
DLSS under DirectML is in principle DLSS 1.0. We are never going to see
Well, unless GPUs get driven out of deep learning by truly specialised chips, there'll be massive motivation to keep accelerators for it in GPUs for that market. That's regardless of gaming, of course.

The question of whether it's worth the die space needs careful phrasing. Isn't the performance increase from DLSS, when you can use it, often really very large?
(Variable, but somewhere around 50%?!)

So it's actually a question of how universally the technique could be made to apply, and how well the image quality holds up across a range of different game types, engines, etc.
The performance increase in FPS is high, but what is gained in performance gets lost in quality. It's a slippery slope, taking the DLSS route forward...
 

biostud

Lifer
Feb 27, 2003
18,251
4,764
136
The performance increase in FPS is high, but what is gained in performance gets lost in quality. It's a slippery slope, taking the DLSS route forward...

You are free to use it or not. You can run high res + RT, no DLSS and get low framerates, run high res + no RT and get high framerates, run low res + RT and get high framerates or run high res + RT + DLSS and get high frame rates. It is all up to you.
 
  • Like
Reactions: SMU_Pony and prtskg

Shivansps

Diamond Member
Sep 11, 2013
3,855
1,518
136
Just curious, since you are quite vocal about AMD putting too few CUs in its APUs: would you prefer AMD putting logic into its APUs that can accelerate something DLSS-like, instead of increasing the CU count?

That depends on quality vs performance vs adoption rate. A DLSS-like implementation can make 720p upscaled to 1080p look as good as or better than the same game rendered at 900p without upscaling, with higher fps, and it avoids having to use lower-than-native screen resolutions. That's great for an APU.

Bear with me on this: what is always important is to use the same Windows screen resolution as your game resolution, because that allows you to play borderless. If the game does not support a render resolution different from the output resolution, you are forced into fullscreen whenever the iGPU (or GPU) can't provide enough FPS.

So what happens right now with an APU? They run games at 720p, 900p and sometimes 1080p, but most games run at 900p or 720p, as 1080p has a large fps impact on APUs. Yet hardly anyone has 720p or 900p screens anymore (they are still around in my country, but you know what I mean). So with a 1080p monitor plus an APU you are forced into fullscreen in most games, because dropping the Windows resolution below 1080p is a no-go. AMD has a hardware upscaler that upscales 720p/900p fullscreen games to your Windows resolution; not everyone knows it's there and it is disabled by default, but it works well and solves most of the issues of running below native resolution on some monitors, including blurry images and weird noise.

Personally I hate using fullscreen, but that's just my opinion. So what could a DLSS-like solution do for an APU? Well, if Picasso had it, you could play games with a screen resolution of 1080p (or 1440p, since I like a higher resolution when I also work on the PC), in borderless mode, with a render resolution of 720p, with the visual quality of 900p (or higher) and higher fps than at 900p. Those are the pros.

The cons would be: adoption rate (if it is like DLSS, where maybe 10-15% of new games use it, it's worthless), how heavy it is on memory bandwidth versus just adding more raster performance, die size (how big would this block be?), and finally quality: it needs to produce an image that looks better than one rendered at a higher resolution, with less fps impact.

So I would need to know all that to be sure whether it's better to have that or just more raw power. At any rate, software upscalers are used in a lot of games these days and they help a lot on low-end hardware like APUs, or when you are trying to run a game at a higher resolution than you really should in order to keep it borderless. DLSS works just like that, only it produces far better results. So I really don't understand the hate just because Nvidia is pushing its use in RT games.
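For what it's worth, the plain (non-AI) upscaling step I keep describing is conceptually just a stretch from the render resolution to the desktop resolution. Here is a toy nearest-neighbour version in numpy, purely illustrative; the hardware scaler and a DLSS-like approach obviously do something far more sophisticated than this:

Code:
import numpy as np

src_h, src_w = 720, 1280     # internal render resolution
dst_h, dst_w = 1080, 1920    # monitor / Windows desktop resolution

frame = np.random.rand(src_h, src_w, 3)   # stand-in for a rendered 720p frame

# Map each output pixel back to its nearest source pixel and copy it.
ys = np.arange(dst_h) * src_h // dst_h
xs = np.arange(dst_w) * src_w // dst_w
upscaled = frame[ys][:, xs]

print(upscaled.shape)   # (1080, 1920, 3)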
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
You are free to use it or not. You can run high res + RT, no DLSS and get low framerates, run high res + no RT and get high framerates, run low res + RT and get high framerates or run high res + RT + DLSS and get high frame rates. It is all up to you.
If they had just scrapped the die space for tensors and doubled the number of RT cores, we wouldn't have this compromise in the first place. We would be in a situation with high fps AND high image quality. Then add tensor cores later, when it makes sense to put the AI features to some genuinely useful purpose.

There is an old saying that describes this very well: "going across the brook to fetch water" - i.e. taking a needlessly roundabout route to something you already have...
 

Mopetar

Diamond Member
Jan 31, 2011
7,837
5,992
136
Well, unless GPUs get driven out of deep learning by truly specialised chips, there'll be massive motivation to keep accelerators for it in GPUs for that market. That's regardless of gaming, of course.

To some degree they already have been. The ML work is being done by things like tensor cores that are specialized for that kind of work. You could do it with shaders, but they're just not built for it.

It's just a happy accident that the companies that make that kind of hardware were in the GPU business and realized that the general-purpose but massively parallel hardware they previously had for such tasks could easily be beaten by specialized hardware.

The ultimate dilemma is that the workloads are vastly different, and if you could buy a piece of silicon that's essentially all tensor cores you'd do that, because the shaders are just wasted space if all you want is a lot of INT8 performance.

As the market grows we'll eventually see the hardware lines move apart. It may not make sense to have a separate product line now, but eventually it will. Maybe GPUs will keep a small amount of that hardware around, but no more than is useful for the main purpose of graphics.
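For anyone wondering what that workload actually looks like, here is a toy numpy stand-in for the kind of low-precision matrix multiply (INT8 inputs, wider INT32 accumulation) that tensor-core-style units are built to churn through. Real hardware does this in fixed-function blocks; this is just the arithmetic:

Code:
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(-128, 127, size=(64, 64), dtype=np.int8)
b = rng.integers(-128, 127, size=(64, 64), dtype=np.int8)

# INT8 inputs, accumulated in INT32 so the dot products don't overflow.
c = a.astype(np.int32) @ b.astype(np.int32)
print(c.shape, c.dtype)   # (64, 64) int32

A general shader core can run this too, but a unit wired specifically for dense low-precision multiply-accumulate does it with far less silicon and power per operation, which is the whole argument for dedicating die area to it.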
 

soresu

Platinum Member
Dec 19, 2014
2,662
1,862
136
DLSS under DirectML is in principle DLSS 1.0. We are never going to see
You are confusing the software/NN itself with the hardware it is running on.

I imagine that nVidia have probably wired DirectML straight into their tensor cores in the driver, rather than having the DML software run on the general compute shaders in their uArch.

At least for Volta and newer, anyway - otherwise those tensor cores would just sit idle when DML is running, giving credence to the gamers who complain about them taking up space on the die.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86
You are confusing the software/NN itself with the hardware it is running on.

I imagine that nVidia have probably wired DirectML straight into their tensor cores in the driver, rather than having the DML software run on the general compute shaders in their uArch.

At least for Volta and newer, anyway - otherwise those tensor cores would just sit idle when DML is running, giving credence to the gamers who complain about them taking up space on the die.
Oh, but they do. They just sit there and take up space for no use. Just throw them out already. :D
I'm not keen on paying $200 extra for a card just because it has tensor cores, trying to mitigate the low RT performance at the expense of worse IQ, which is a direct result of having the tensor cores there in the first place. Tensor cores and DLSS are just a bad circle jerk.
 

moinmoin

Diamond Member
Jun 1, 2017
4,952
7,663
136
Smart Access Memory isn't new or exclusive to Zen3+500 chipsets... And it's an open spec.

Source:
Did you guys follow the whole SAM saga since?

Nvidia picks up the tech:

Ryzen team to work to enable SAM for Nvidia GPUs, and RTG for Intel CPUs:
https://youtu.be/uaxnvRUeqkg?t=2085

400 series boards are getting SAM support as well; so far ASRock and MSI have released the respective BIOS updates:

That's a pretty speedy spread for a tech originally announced as exclusive to RDNA2/Zen 3/500 series boards less than a month ago.

Still limited to RDNA2 and Zen 3 so far, but apparently that's only down to validation?
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Did you guys follow the whole SAM saga since?

Nvidia picks up the tech:

Ryzen team to work to enable SAM for Nvidia GPUs, and RTG for Intel CPUs:

400 series boards are getting SAM support as well; so far ASRock and MSI have released the respective BIOS updates:

That's a pretty speedy spread for a tech originally announced as exclusive to RDNA2/Zen 3/500 series boards less than a month ago.

Still limited to RDNA2 and Zen 3 so far, but apparently that's only down to validation?

Yeah, it will be available on more platforms as software support becomes available for them. But it was a good talking point for AMD to announce it first on their latest and greatest.
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
At least on paper

Gosh, I just checked a local shop: no model has any availability date, and the few that have a price are priced like there's no tomorrow.

The RX 6800, with a currency-adjusted MSRP of $579, is listed around $800
The RX 6800 XT, with a currency-adjusted MSRP of $649, is listed around $980 (some models over $1,000)

I'm aware that with sales tax etc. prices are a bit higher than MSRP, but for the Ryzen 3900X I got last year that premium was about $50, most of it due to lack of supply - not $250...

I.e. retailers are gouging as well.

EDIT: And with the RTX 3080/3070 the difference from MSRP is smaller, making the 3080 actually cheaper than a 6800 XT here...
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
To some degree they already have been. The ML work is being done by things like tensor cores that are specialized for that kind of work. You could do it with shaders, but they're just not built for it.

It's just a happy accident that the companies that make that kind of hardware were in the GPU business and realized that the general-purpose but massively parallel hardware they previously had for such tasks could easily be beaten by specialized hardware.

The ultimate dilemma is that the workloads are vastly different, and if you could buy a piece of silicon that's essentially all tensor cores you'd do that, because the shaders are just wasted space if all you want is a lot of INT8 performance.

I guess it must be a little more complex than this, otherwise the specialised companies would do just that and wipe NV's performance out :)

As the market grows we'll eventually see the hardware lines move apart. It may not make sense to have a separate product line now, but eventually it will. Maybe GPUs will keep a small amount of that hardware around, but no more than is useful for the main purpose of graphics.

Quite possible, of course. On a related, if formally rather off-topic, note: has anyone seen any neural-net inference benchmarks for the NPU in the M1 (or A14) vs various NV GPUs? You wouldn't use it for training, I guess.
 

Mopetar

Diamond Member
Jan 31, 2011
7,837
5,992
136
I guess it must be a little more complex than this, otherwise the specialised companies would do just that and wipe NV's performance out :)

Unless there's a strong financial incentive for them to invest in said market, they might have better things to do with their time. Notice that GPUs were essentially replaced in bitcoin mining. Eventually the market became large enough that someone could justify the cost of developing, producing, marketing, etc. specialized hardware to do a job that GPUs were tasked with doing.

It also isn't as though Nvidia would get removed from the market. They would be able to sell cards that perform those same specific functions without the extra GPU bits bolted on that aren't required. The whole point is that anyone who wants a lot of ML capability buys an ML card that's loaded with silicon dedicated to that task. There will eventually come a time when the market demand for this kind of hardware grows large enough to justify such cards.
 
Apr 30, 2020
68
170
76
Yep, CUDA started with G80, or the OG Tesla card, when Nvidia made the transition to a unified shader model, and they realized, "Oh, we have a wide array of ALUs that can be repurposed to do general purpose compute tasks. What if we created a programming language to take advantage of that?"
AMD actually had a "GPGPU" solution implemented on their R580 cards well before CUDA. There were people using Radeon X1900s to do Folding@Home, and for a while AMD was the "GPGPU" darling. But they somehow squandered that lead.
 
  • Like
Reactions: Saylick