
Speculation: RDNA2 + CDNA Architectures thread

There is no way CUs would scale linearly; doubling the CUs would require a major redesign of the core to optimize feeding all the CUs and shaders, which is a very intricate and tough thing to do. Considering the RTX 3080 is about 80% faster than the RX 5700 XT, it's pretty clear that AMD has achieved similar performance to Nvidia and about 80% scaling with double the CUs, which is quite a reasonable number.
The 3080 is literally 2x 5700XT performance.
 
There is no way CUs would scale linearly; doubling the CUs would require a major redesign of the core to optimize feeding all the CUs and shaders, which is a very intricate and tough thing to do. Considering the RTX 3080 is about 80% faster than the RX 5700 XT, it's pretty clear that AMD has achieved similar performance to Nvidia and about 80% scaling with double the CUs, which is quite a reasonable number.
It was already done with RDNA1...

RDNA2 was further redesigned to reduce the memory bandwidth dependency and achieve higher cache hit rates.

Both affect scaling and utilization at higher CU counts.
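The scaling claims traded back and forth above can be sanity-checked with quick arithmetic. This is an illustrative sketch only; the 80%/2x figures come straight from the posts, and the function name is my own:

```python
# Back-of-the-envelope CU scaling check, using the figures quoted in
# the thread (RTX 3080 vs RX 5700 XT, with roughly double the units).
def incremental_scaling(speedup, unit_ratio):
    """Fraction of the *added* units' ideal gain actually realized.

    speedup    -- measured performance ratio vs the baseline card
    unit_ratio -- ratio of compute units (e.g. 80 CU / 40 CU = 2.0)
    """
    return (speedup - 1) / (unit_ratio - 1)

# "RTX 3080 is about 80% faster" -> 1.8x speedup with 2x units
print(incremental_scaling(1.8, 2.0))  # -> 0.8, i.e. the post's "80% scaling"
# "The 3080 is literally 2x 5700XT performance" -> perfect scaling
print(incremental_scaling(2.0, 2.0))  # -> 1.0
```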
 
GPUs are already very scalable; each CU is essentially a CCX or a processor in itself, whatever you want to call it. The issue comes down to feeding all the CUs and keeping them optimally used.

There is no need for a chiplet design for GPUs, at least not in the desktop market and similar. Supercomputers and such are a different story though.
That’s exactly why you’re wrong. Hopper is already going to do this. There is no code in the world that will feed CUs at 100% efficiency at this scale. That’s why you start separating out workloads via multi-GPU. In four years' time you will have 20/40-CU chiplets that have IC as the glue. I’d think we will start seeing 4x 40-CU chiplets here soon.
 
GPUs are already very scalable; each CU is essentially a CCX or a processor in itself, whatever you want to call it. The issue comes down to feeding all the CUs and keeping them optimally used.

There is no need for a chiplet design for GPUs, at least not in the desktop market and similar. Supercomputers and such are a different story though.
Cost, plus all the other benefits that come with chiplets; unless you look forward to paying ever-increasing prices as node sizes decrease.
 
Would be cool having four separate 40-CU chiplets, churning out a frame each, in a cycle. If it works as intended, that would be like a mini-CrossFire type of setup.
 
I think you’re wrong on the chiplet design. Imagine


Because the IPC increase isn't there.
AMD claimed RDNA2 is +50% more perf/watt

[Image: AMD RDNA2 slide]


It's impossible to tweak N7 for 50% more performance at the same wattage, so how can they gain 50% more perf/watt? Also, look at the word "IPC"; this is official, from AMD's slides. AMD claims there is improved perf-per-clock (IPC) over RDNA1, unless they mean that using features like VSR/DirectML can improve performance.
 
AMD claimed RDNA2 is +50% more perf/watt

[Image: AMD RDNA2 slide]


It's impossible to tweak N7 for 50% more performance at the same wattage, so how can they gain 50% more perf/watt? Also, look at the word "IPC"; this is official, from AMD's slides. AMD claims there is improved perf-per-clock (IPC) over RDNA1, unless they mean that using features like VSR/DirectML can improve performance.
Yeah, I'm aware. I've seen that slide a hundred times. If the card demoed on stage is the best RDNA2 can do, there is absolutely no way it achieved a raw 50% improvement at the same speed.

If they are adding cache mechanisms, some sort of AI to do branch prediction, etc., I can understand.
 
They've been capable of doing CF/SLI on a board for quite some time.

The whole point would be to put all the chiplets on the job on EACH frame.

The real trick is making it seamless, so that developers don't need to change their code to take advantage of it, unlike what was often the case with CF/SLI, where some games had great support with near-100% scaling and others didn't support it at all.

If you can get that part down, it's probably the biggest hurdle, because most companies don't want to devote resources to getting something that such a small number of users run working perfectly.

If it were easy we'd have seen it done previously. I suppose companies always had some incentive not to care about it too much since it's a lot harder to sell a top-end card if you can get identical performance from two or more low end cards that are priced better.
 
AMD claimed RDNA2 is +50% more perf/watt

It's impossible to tweak N7 for 50% more performance at the same wattage, so how can they gain 50% more perf/watt? Also, look at the word "IPC"; this is official, from AMD's slides. AMD claims there is improved perf-per-clock (IPC) over RDNA1, unless they mean that using features like VSR/DirectML can improve performance.

'Impossible' is never a word you should use when describing technological improvements. It's very close-minded.

There are numerous ways to improve performance per watt. Tweaking the process, or moving to a more energy-efficient process, is one of them. Another way is to improve the performance of the IC itself. The third common way is to improve the efficiency of the IC.

Let's say the big cache we have been seeing is real, and they do not require a big 384-bit bus but can instead use a much smaller 256-bit bus. That saves quite a bit of energy, as the GPU is no longer having to go off-die to get commonly used data. It also means that they don't have to use energy-hungry memory (such as GDDR6X).
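The bus-width argument can be made concrete with a simple model in which requests served by an on-die cache never touch DRAM. All the numbers here (pin rates, the 50% hit rate) are hypothetical assumptions, not confirmed specs:

```python
# Rough model of how a large on-die cache can substitute for bus width.
# All figures are illustrative assumptions, not confirmed hardware specs.
def raw_bandwidth_gbs(bus_bits, gbps_per_pin):
    # Bus width in bits times per-pin data rate, converted to GB/s.
    return bus_bits * gbps_per_pin / 8

def effective_bandwidth_gbs(raw, hit_rate):
    # If a fraction `hit_rate` of requests is served on-die, DRAM only
    # sees (1 - hit_rate) of the traffic, amplifying effective bandwidth.
    return raw / (1 - hit_rate)

wide = raw_bandwidth_gbs(384, 19)    # 384-bit GDDR6X-class: 912 GB/s
narrow = raw_bandwidth_gbs(256, 16)  # 256-bit GDDR6: 512 GB/s
# A (hypothetical) 50% hit rate would let the narrow bus behave like
# roughly 1024 GB/s -- more than the wide bus, at lower power.
print(effective_bandwidth_gbs(narrow, 0.5))
```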
 
If you don't give people the option of multi-card SLI and instead only have chiplets, you can really ratchet up the profit margin on those high-yield chips. How amazing would it be if AMD announced they cracked it? It would take Nvidia years to be able to respond. I don't think it's likely, but it's nice to dream.
 
If you don't give people the option of multi-card SLI and instead only have chiplets, you can really ratchet up the profit margin on those high-yield chips. How amazing would it be if AMD announced they cracked it? It would take Nvidia years to be able to respond. I don't think it's likely, but it's nice to dream.
They need to crack RT and a DLSS competitor first...IMO
MCM approach for both companies with 5nm incoming
 
If you don't give people the option of multi-card SLI and instead only have chiplets, you can really ratchet up the profit margin on those high-yield chips. How amazing would it be if AMD announced they cracked it? It would take Nvidia years to be able to respond. I don't think it's likely, but it's nice to dream.
Yes, this is what I mean. With IC, you're sharing much of the data between the GPUs. It solves SFR's limitation, and is a no-brainer to implement.

Not my quote, but you get the idea:

"Overdrawing is the biggest problem here; all of the vertices for scene geometry have to be transformed by each GPU even if they are not within the GPU's assigned region, meaning geometry performance cannot scale like it does with AFR, and any polygon between multiple rendering regions has to be fully textured and shaded by each GPU whose region it occupies, which is wasteful. Of course, there are also complications that can rise from inaccurate workload allocations."

If you're able to share the data between GPUs, the redundancy isn't required.
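The overdraw problem in the quote can be illustrated with a toy split-frame setup: any triangle spanning multiple GPU regions must be processed by every GPU whose region it touches, so total work exceeds the triangle count. The function and numbers below are my own illustration, not any real renderer:

```python
# Toy illustration of SFR overdraw: the screen is split into vertical
# strips (one per GPU), and any triangle overlapping a strip must be
# processed by that strip's GPU. Triangles crossing a strip boundary
# are therefore processed more than once.
def sfr_workload(triangles, num_gpus, screen_width=1920):
    strip = screen_width / num_gpus
    work = [0] * num_gpus
    for x_min, x_max in triangles:  # each triangle as an x-extent
        first = int(x_min // strip)
        last = min(int(x_max // strip), num_gpus - 1)
        for gpu in range(first, last + 1):
            work[gpu] += 1
    return work

# Three triangles; the middle one straddles the seam between two GPUs,
# so it gets counted twice: 3 triangles become 4 units of work.
tris = [(0, 100), (900, 1000), (1800, 1900)]
print(sfr_workload(tris, 2))  # -> [2, 2]
```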
 
RTX 2080 Ti: 50% faster than the RX 5700 XT at 4K, mainly due to the VRAM buffer limit on the RX 5700 XT.

Navi 21 has 80 CUs, a 256-bit memory bus, and 2.2 GHz clock speeds, at the very least.

2.2 GHz is 16% above what the RX 5700 XT clocked. And RDNA2 GPUs are supposed to have higher IPC than RDNA1.

On CU count and VRAM buffer size alone, Navi 21 should achieve 100% performance above the RX 5700 XT.

And that is excluding IPC and clock speed differences.

And yet, AMD demoed a GPU that is 70% faster at 4K than the RX 5700 XT.

Something does not add up.
A little follow-up.

If RDNA2 achieves a 10% IPC increase, 80 RDNA2 CUs at 1.8 GHz will perform like 88 RDNA1 CUs clocked at 1.8 GHz.

2.2 GHz clock speeds are 16% above RX 5700 XT clock speeds (1887 MHz).

In essence, if AMD has found a way to make a 256-bit memory bus enough for 80 RDNA2 CUs, that GPU, in full config at 2.2 GHz, should be around 135% above the RX 5700 XT.

Around 10-15% above the RTX 3090. In 4K.

And this is only with a 10% IPC increase.

This is all theoretical calculation.

So we had better pray that the 256-bit bus is enough to feed those CUs, and that AMD found a way to make those CUs scale in performance similarly to the RDNA1 architecture.
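The arithmetic in that post can be laid out explicitly. The inputs (80 vs 40 CUs, 10% IPC, 2.2 GHz vs 1887 MHz) come from the post itself; the CU scaling efficiency is the unknown knob, and the function name is my own:

```python
# Back-of-the-envelope Navi 21 estimate from the thread's own inputs.
# Nothing here is a confirmed spec; `cu_scaling` is the big unknown.
def navi21_vs_5700xt(cu_ratio=2.0, ipc_gain=1.10,
                     clock=2200, base_clock=1887, cu_scaling=1.0):
    # Performance ratio vs RX 5700 XT: unit count x scaling x IPC x clock.
    return cu_ratio * cu_scaling * ipc_gain * (clock / base_clock)

# Perfectly linear CU scaling gives ~2.56x, i.e. ~156% above the 5700 XT.
print(navi21_vs_5700xt())
# Assuming ~90% scaling of the doubled CUs pulls the estimate down into
# the neighborhood of the ~135%-above figure used in the thread.
print(navi21_vs_5700xt(cu_scaling=0.9))
```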
 