Question Speculation: RDNA2 + CDNA Architectures thread

Page 18

Geranium

Member
Apr 22, 2020
43
37
51
I've now said this multiple times!

The different FP32/FP16 figures relative to FP64 come from new ML-focused hardware that performs better on ML tasks - I don't expect those high numbers to be reflected in general compute tasks that don't fit that use case.

The FP64 number fits comfortably for a 120 CU server/workstation part designed to be run as one of up to 8 in a rack, so lower clock speeds, as with EPYC, are to be expected.

Otherwise you are implying that a 120 CU part is running at what, 2.734 GHz?

You must realise how ridiculous that sounds on any variant of 7nm node.
Running 120/128 CUs @ >2 GHz would be a miracle; that is why I said something is off with those FP numbers.
 

beginner99

Diamond Member
Jun 2, 2009
4,539
885
126
- Aye... obligatory "CHOO CHOO, all aboard the AMD pre-release hype-train, heading to a derailment nightmare near you soon!"
I mean the 200% number over Navi 10 is pretty believable, and in fact what you would expect from an actual flagship in late 2020. Navi 10, at around 300mm2, was a midrange card at best. The real wake-up call or derailment will be performance per dollar. If they sell this card for $800, then no gain was made whatsoever in performance/$: double the price for double the performance. It would still be a win, as it would be much faster than a 2080 Ti and cheaper. $699 is the absolute minimum this card will launch at, which would give a very minor boost in performance/$, but nothing compared to the olden days.
 

CastleBravo

Junior Member
Dec 6, 2019
15
10
41
I mean the 200% number over Navi 10 is pretty believable, and in fact what you would expect from an actual flagship in late 2020. Navi 10, at around 300mm2, was a midrange card at best. The real wake-up call or derailment will be performance per dollar. If they sell this card for $800, then no gain was made whatsoever in performance/$: double the price for double the performance. It would still be a win, as it would be much faster than a 2080 Ti and cheaper. $699 is the absolute minimum this card will launch at, which would give a very minor boost in performance/$, but nothing compared to the olden days.

When was the last time a top-end card was released with more performance/$ than the last gen's midrange cards? I don't see them releasing a card with double the performance of the 5700 XT for only $800. If they can beat the 2080 Ti by ~40% for ~$1k, that will be a huge win.

Also, raw FPS/$ is a bad metric. When I am shopping for a GPU upgrade, I look at FPS gained over my current card per dollar, and the top end cards usually end up being the better value.
 

Paul98

Diamond Member
Jan 31, 2010
3,676
100
106
I mean the 200% number over Navi 10 is pretty believable, and in fact what you would expect from an actual flagship in late 2020. Navi 10, at around 300mm2, was a midrange card at best. The real wake-up call or derailment will be performance per dollar. If they sell this card for $800, then no gain was made whatsoever in performance/$: double the price for double the performance. It would still be a win, as it would be much faster than a 2080 Ti and cheaper. $699 is the absolute minimum this card will launch at, which would give a very minor boost in performance/$, but nothing compared to the olden days.
A 200% increase in performance isn't double, that's triple the performance. But it shouldn't be a hard mark to hit, as we would be going from midrange to high end, with major architecture changes and an updated process.
 

lobz

Golden Member
Feb 10, 2017
1,250
1,244
106
A 200% increase in performance isn't double, that's triple the performance. But it shouldn't be a hard mark to hit, as we would be going from midrange to high end, with major architecture changes and an updated process.
It's not 'over', but 'of'.
 

GodisanAtheist

Platinum Member
Nov 16, 2006
2,217
598
136
It's not 'over', but 'of'.
- The post I replied to specifically stated "faster than". 225% faster than a 5700 XT is not the same thing as 225% the performance of a 5700 XT.

The prior claim is patently outrageous, even if they meant the far more reasonable latter claim.
 

lobz

Golden Member
Feb 10, 2017
1,250
1,244
106
- The post I replied to specifically stated "faster than". 225% faster than a 5700 XT is not the same thing as 225% the performance of a 5700 XT.

The prior claim is patently outrageous, even if they meant the far more reasonable latter claim.
I'd assume we all knew what everyone was talking about. Feel free to nitpick, nevertheless.
 

beginner99

Diamond Member
Jun 2, 2009
4,539
885
126
It's 251 mm2.

Where did you get 300mm2 from?

Even Vega 20 doesn't fit at 336mm2.
I think you meant 351mm2? I didn't bother to look up the exact value; either way, 300 or 351mm2 is purely midrange.


I don't see them releasing a card with double the performance of the 5700 XT for only $800. If they can beat the 2080 ti by ~40% for ~$1k, that will be a huge win.
Well, NV is for sure preparing for a fight, as the 3080 this time around will use the GA102 die, not the smaller 104 die. Given the leaks, this 102 die is roughly 20-30% faster than a 2080 Ti. The 2080 launched at $699, so we can expect a similar price again. Yeah, according to this info the AMD chip would be faster, but $1000? No way. If the 3080 launches for $699, AMD can't charge more than $799 even if they are faster - assuming they want a meaningful number of sales.
If you frequent the forums you should know I don't really like NV, but at some point one must acknowledge their advantage on the software side. And I don't mean just drivers. DLSS might play a role for some, and for me, due to my work in data science, I would really see an advantage in being able to easily run all the deep learning stuff (e.g. CUDA), which on my old AMD card simply isn't possible.
 

soresu

Golden Member
Dec 19, 2014
1,324
520
136
I think you meant 351mm2? I didn't bother to look up the exact value; either way, 300 or 351mm2 is purely midrange.
Nope.

It's still 251 mm2.

TechPowerUp. Wikipedia.

If you frequent the forums you should know I don't really like NV, but at some point one must acknowledge their advantage on the software side. And I don't mean just drivers. DLSS might play a role for some, and for me, due to my work in data science, I would really see an advantage in being able to easily run all the deep learning stuff (e.g. CUDA), which on my old AMD card simply isn't possible.
I sympathise. No really.

I work with CG shading/lighting/lookdev, and there isn't a single commercial offline PT/RT renderer that doesn't use nVidia OptiX exclusively for its GPU backend - and I expect the as-yet-unreleased Pixar RenderMan XPU to be just as exclusive.

It's quite thoroughly depressing how eager they all seem to be to enable nVidia lock in and hence expensive market monopoly.

It's not like Blender hasn't proved it can be done without it, even though nVidia are clearly trying their best to make it otherwise with fresh RTX bloat in the latest versions.

Even GPU renderer pioneers Octane clearly only troll nVidia with threats of first OpenCL, then Vulkan and now Metal versions to get them to pony up the sponsorship money - a canny business trick, but very annoying.
 

DisEnchantment

Senior member
Mar 3, 2017
578
1,077
106
https://lists.freedesktop.org/archives/amd-gfx/attachments/20200727/dfe69333/attachment-0001.obj

Seems like Sienna (at least some variant of it) will have ECC and possibly HBM (2048 bit interface)
Seems like Navy Flounder does not have it.
GMC10 updated to handle this ECC/RAS update.
UMC 6.1 is for Arcturus.
UMC 8.7 is for Sienna.

UMC

C++:
+/* HBM  Memory Channel Width */
+#define UMC_V8_7_HBM_MEMORY_CHANNEL_WIDTH 128
+/* number of umc channel instance with memory map register access */
+#define UMC_V8_7_CHANNEL_INSTANCE_NUM 2
+/* number of umc instance with memory map register access */
+#define UMC_V8_7_UMC_INSTANCE_NUM 8
+/* total channel instances in one umc block */
+#define UMC_V8_7_TOTAL_CHANNEL_NUM (UMC_V8_7_CHANNEL_INSTANCE_NUM * UMC_V8_7_UMC_INSTANCE_NUM)
+/* UMC regiser per channel offset */
+#define UMC_V8_7_PER_CHANNEL_OFFSET_SIENNA    0x400
UMC_V8_7_TOTAL_CHANNEL_NUM (UMC_V8_7_CHANNEL_INSTANCE_NUM * UMC_V8_7_UMC_INSTANCE_NUM) = 16.
Bus width = Num channels * Channel Width = 16*128 = 2048

GMC

C++:
+static void gmc_v10_0_set_umc_funcs(struct amdgpu_device *adev)
+{
+ switch (adev->asic_type) {
+ case CHIP_SIENNA_CICHLID:
+ adev->umc.max_ras_err_cnt_per_query = UMC_V8_7_TOTAL_CHANNEL_NUM;
+ adev->umc.channel_inst_num = UMC_V8_7_CHANNEL_INSTANCE_NUM;
+ adev->umc.umc_inst_num = UMC_V8_7_UMC_INSTANCE_NUM;
+ adev->umc.channel_offs = UMC_V8_7_PER_CHANNEL_OFFSET_SIENNA;
+ adev->umc.channel_idx_tbl = &umc_v8_7_channel_idx_tbl[0][0];
+ adev->umc.funcs = &umc_v8_7_funcs;
+ break;
+ default:
+ break;
+ }
+}
There are some final bits missing to confirm this, but it seems like 2048 bit HBM will be present for Sienna (at least some variants)
Eventually the bus width is read from the atombios, but 2048 would be the max. (2048-bit Samsung Flashbolt would give 819 GB/s, or Hynix HBM2E an astounding 921 GB/s, but if anything these would probably be downclocked versions.)

In comparison, for Arcturus:
C++:
/* HBM  Memory Channel Width */
#define UMC_V6_1_HBM_MEMORY_CHANNEL_WIDTH    128
/* number of umc channel instance with memory map register access */
#define UMC_V6_1_CHANNEL_INSTANCE_NUM        4
/* number of umc instance with memory map register access */
#define UMC_V6_1_UMC_INSTANCE_NUM        8
/* total channel instances in one umc block */
#define UMC_V6_1_TOTAL_CHANNEL_NUM    (UMC_V6_1_CHANNEL_INSTANCE_NUM * UMC_V6_1_UMC_INSTANCE_NUM)
/* UMC regiser per channel offset */
#define UMC_V6_1_PER_CHANNEL_OFFSET_VG20    0x800
#define UMC_V6_1_PER_CHANNEL_OFFSET_ARCT    0x400
Bus width = 4096 (4*8*128).
 

Glo.

Diamond Member
Apr 25, 2015
3,919
1,824
136
I know how it will sound, but... how big a chance is there that N21, Sienna Cichlid, has both GDDR6 and HBM2 on package?

Because we have two rumors. N21 has 16 GB of VRAM.

And then: consumer cards do not use HBM2 anymore...
 

uzzi38

Senior member
Oct 16, 2019
894
1,104
96
I know how it will sound, but... how big a chance is there that N21, Sienna Cichlid, has both GDDR6 and HBM2 on package?

Because we have two rumors. N21 has 16 GB of VRAM.

And then: consumer cards do not use HBM2 anymore...
Yeah, it's happening. Personally I don't think those HBM2 versions will come to consumers though, but that's a guess.

I think the consumer market will be restricted to 384-bit GDDR6
 

Krteq

Senior member
May 22, 2015
830
356
136
I know how it will sound, but... how big a chance is there that N21, Sienna Cichlid, has both GDDR6 and HBM2 on package?

Because we have two rumors. N21 has 16 GB of VRAM.

And then: consumer cards do not use HBM2 anymore...
Well... those were just rumors.

These AMDGPU code commits are the first concrete info about the memory controller used in the Sienna Cichlid GPU.
 

Stuka87

Diamond Member
Dec 10, 2010
4,831
538
126
A memory controller that can support both types of memory would have to be huge. They might have two versions of what is basically the same chip but with different memory controllers. But I don't see them wasting die space for two memory controllers, plus the pathing for both. It would be a nightmare from a layout perspective.
 

Olikan

Golden Member
Sep 23, 2011
1,943
93
91
Mmm, an Apple exclusive?... Didn't we have a recent rumour that RDNA2 might be ~470mm2, while the original rumour was 505mm2?

505 vs ~470...Navi10 vs navi12... GDDR6 vs HBM
 

DisEnchantment

Senior member
Mar 3, 2017
578
1,077
106
Mmm, an Apple exclusive?... Didn't we have a recent rumour that RDNA2 might be ~470mm2, while the original rumour was 505mm2?

505 vs ~470...Navi10 vs navi12... GDDR6 vs HBM
A 384-bit GDDR6 controller+PHY would take around ~100mm2 of die space on N7. A 2048-bit HBM2 PHY would take around ~26mm2 (see link below).
So it still does not add up (for the 505 vs ~470). There are possibilities, but we don't know which exactly.

1595857789292.png
 

soresu

Golden Member
Dec 19, 2014
1,324
520
136
A 384-bit GDDR6 controller+PHY would take around ~100mm2 of die space on N7. A 2048-bit HBM2 PHY would take around ~26mm2 (see link below).
So it still does not add up (for the 505 vs ~470). There are possibilities, but we don't know which exactly.

View attachment 26852
Hmmm, here was me thinking that the HBM controllers were bigger.

Lesson learned.
 

Olikan

Golden Member
Sep 23, 2011
1,943
93
91
384bit GDDR6 controller+PHY would take around ~100mm2 of die space on N7. 2048bit HBM2 would take around ~26mm2 (see link below).
So it still does not add up (for the 505 vs ~470.) There are possibilities but we dont which exactly.

View attachment 26852
IT FITS!
The leak was from MLID: 427mm^2, 72 CU, 2.15 GHz boost.

I messed up.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,019
1,310
136
Does it?
Isn't the new Xbox SoC just 360mm^2? It has a 56 CU RDNA2 GPU, 8 cores, a 320-bit GDDR6 memory bus, southbridge and so on.
Two Renoir-sized CCXs are pretty small, and remember a regular GPU has an x16 PCIe interface; the consoles, if they have a southbridge, would likely only need 12 lanes and none of the other misc I/O.

edit: here is the PS4 Pro - not much "uncore" about it:
https://flic.kr/p/JkEiRK
 
