Question Speculation: RDNA2 + CDNA Architectures thread

Page 18

Geranium

Member
Apr 22, 2020
43
37
51
I've now said this multiple times!

The different FP32/FP16 figures relative to FP64 come from new ML-focused hardware that performs better on ML tasks - I don't expect those high numbers to be reflected in general compute tasks that don't fit that use case.

The FP64 number fits comfortably for a 120 CU server/workstation part designed to be run as one of up to 8 in a rack, so lower clock speeds, as with EPYC, are to be expected.

Otherwise you are implying that a 120 CU part is running at what, 2.734 GHz?

You must realise how ridiculous that sounds on any variant of 7nm node.
Running 120/128 CUs @ >2 GHz would be a miracle; that is why I said something is off with those FP numbers.
 

beginner99

Diamond Member
Jun 2, 2009
4,539
885
126
- Aye... obligatory "CHOO CHOO, all aboard the AMD pre-release hype-train, heading to a derailment nightmare near you soon!"
I mean the 200% number over Navi 10 is pretty believable, and in fact what you would expect from an actual flagship in late 2020. Navi 10, at around 300mm2, was a midrange card at best. The real wake-up call or derailment will be performance per dollar. If they sell this card for $800, then no gain was made whatsoever in performance/$: double the price for double the performance. It would still be a win, as it would be much faster than a 2080 Ti and cheaper. $699 is the absolute minimum this card will launch at, which would give a very minor boost in performance/$, but nothing compared to the olden days.
 

CastleBravo

Junior Member
Dec 6, 2019
15
10
41
I mean the 200% number over Navi 10 is pretty believable, and in fact what you would expect from an actual flagship in late 2020. Navi 10, at around 300mm2, was a midrange card at best. The real wake-up call or derailment will be performance per dollar. If they sell this card for $800, then no gain was made whatsoever in performance/$: double the price for double the performance. It would still be a win, as it would be much faster than a 2080 Ti and cheaper. $699 is the absolute minimum this card will launch at, which would give a very minor boost in performance/$, but nothing compared to the olden days.

When was the last time a top-end card was released with more performance/$ than the last gen's midrange cards? I don't see them releasing a card with double the performance of the 5700 XT for only $800. If they can beat the 2080 Ti by ~40% for ~$1k, that will be a huge win.

Also, raw FPS/$ is a bad metric. When I am shopping for a GPU upgrade, I look at FPS gained over my current card per dollar, and the top end cards usually end up being the better value.
 

Paul98

Diamond Member
Jan 31, 2010
3,676
100
106
I mean the 200% number over Navi 10 is pretty believable, and in fact what you would expect from an actual flagship in late 2020. Navi 10, at around 300mm2, was a midrange card at best. The real wake-up call or derailment will be performance per dollar. If they sell this card for $800, then no gain was made whatsoever in performance/$: double the price for double the performance. It would still be a win, as it would be much faster than a 2080 Ti and cheaper. $699 is the absolute minimum this card will launch at, which would give a very minor boost in performance/$, but nothing compared to the olden days.
A 200% increase in performance isn't double, that's triple the performance. But it shouldn't be a hard mark to hit, as we would be going from midrange to high end, with major architecture changes and an updated process.
 

lobz

Golden Member
Feb 10, 2017
1,250
1,244
106
A 200% increase in performance isn't double, that's triple the performance. But it shouldn't be a hard mark to hit, as we would be going from midrange to high end, with major architecture changes and an updated process.
It's not 'over', but 'of'.
 

GodisanAtheist

Platinum Member
Nov 16, 2006
2,217
598
136
It's not 'over', but 'of'.
- The post I replied to specifically stated "faster than". 225% faster than a 5700 XT is not the same thing as 225% the performance of a 5700 XT.

The prior claim is patently outrageous, even if they meant the far more reasonable latter claim.
 

lobz

Golden Member
Feb 10, 2017
1,250
1,244
106
- The post I replied to specifically stated "faster than". 225% faster than a 5700 XT is not the same thing as 225% the performance of a 5700 XT.

The prior claim is patently outrageous, even if they meant the far more reasonable latter claim.
I'd assume we all knew what everyone was talking about. Feel free to nitpick, nevertheless.
 

beginner99

Diamond Member
Jun 2, 2009
4,539
885
126
It's 251 mm2.

Where did you get 300mm2 from?

Even Vega 20 doesn't fit at 336mm2.
I think you meant 351mm2? I didn't bother to look up the exact value; either way, 300 or 351mm2 is purely midrange.


I don't see them releasing a card with double the performance of the 5700 XT for only $800. If they can beat the 2080 ti by ~40% for ~$1k, that will be a huge win.
Well, NV is for sure preparing for a fight, as the 3080 this time around will use the GA102 die, not the smaller 104 die. Given the leaks, this 102 die is roughly 20-30% faster than a 2080 Ti. The 2080 launched at $699, so we can expect a similar price again. Yeah, according to this info the AMD chip would be faster, but $1000? No way. If the 3080 launches for $699, AMD can't charge more than $799 even if they are faster - assuming they want a meaningful number of sales.
If you frequent the forums you should know I don't really like NV, but at some point one must acknowledge their advantage on the software side. And I don't mean just drivers. DLSS might play a role for some, and for me, due to my work in data science, I would really see an advantage in being able to easily run all the deep learning stuff (e.g. CUDA), which on my old AMD card simply isn't possible.
 

soresu

Golden Member
Dec 19, 2014
1,324
520
136
I think you meant 351mm2? I didn't bother to look up the exact value; either way, 300 or 351mm2 is purely midrange.
Nope.

It's still 251 mm2.

TechPowerUp. Wikipedia.

If you frequent the forums you should know I don't really like NV, but at some point one must acknowledge their advantage on the software side. And I don't mean just drivers. DLSS might play a role for some, and for me, due to my work in data science, I would really see an advantage in being able to easily run all the deep learning stuff (e.g. CUDA), which on my old AMD card simply isn't possible.
I sympathise. No really.

I work with CG shading/lighting/lookdev, and there isn't a single commercial offline PT/RT renderer that doesn't use nVidia OptiX exclusively for its GPU backend - and I expect the as-yet-unreleased Pixar RenderMan XPU to be just as exclusive.

It's quite thoroughly depressing how eager they all seem to be to enable nVidia lock in and hence expensive market monopoly.

It's not like Blender hasn't proved it can be done without it, even though nVidia are clearly trying their best to make it otherwise with fresh RTX bloat in the latest versions.

Even GPU renderer pioneers Octane clearly only troll nVidia with threats of first OpenCL, then Vulkan and now Metal versions to get them to pony up the sponsorship money - a canny business trick, but very annoying.
 

DisEnchantment

Senior member
Mar 3, 2017
578
1,077
106
https://lists.freedesktop.org/archives/amd-gfx/attachments/20200727/dfe69333/attachment-0001.obj

Seems like Sienna (at least some variant of it) will have ECC and possibly HBM (2048 bit interface)
Seems like Navy Flounder does not have it.
GMC10 updated to handle this ECC/RAS update.
UMC 6.1 is for Arcturus.
UMC 8.7 is for Sienna.

UMC

C++:
+/* HBM  Memory Channel Width */
+#define UMC_V8_7_HBM_MEMORY_CHANNEL_WIDTH 128
+/* number of umc channel instance with memory map register access */
+#define UMC_V8_7_CHANNEL_INSTANCE_NUM 2
+/* number of umc instance with memory map register access */
+#define UMC_V8_7_UMC_INSTANCE_NUM 8
+/* total channel instances in one umc block */
+#define UMC_V8_7_TOTAL_CHANNEL_NUM (UMC_V8_7_CHANNEL_INSTANCE_NUM * UMC_V8_7_UMC_INSTANCE_NUM)
+/* UMC regiser per channel offset */
+#define UMC_V8_7_PER_CHANNEL_OFFSET_SIENNA    0x400
UMC_V8_7_TOTAL_CHANNEL_NUM (UMC_V8_7_CHANNEL_INSTANCE_NUM * UMC_V8_7_UMC_INSTANCE_NUM) = 16.
Bus width = Num channels * Channel Width = 16*128 = 2048

GMC

C++:
+static void gmc_v10_0_set_umc_funcs(struct amdgpu_device *adev)
+{
+ switch (adev->asic_type) {
+ case CHIP_SIENNA_CICHLID:
+ adev->umc.max_ras_err_cnt_per_query = UMC_V8_7_TOTAL_CHANNEL_NUM;
+ adev->umc.channel_inst_num = UMC_V8_7_CHANNEL_INSTANCE_NUM;
+ adev->umc.umc_inst_num = UMC_V8_7_UMC_INSTANCE_NUM;
+ adev->umc.channel_offs = UMC_V8_7_PER_CHANNEL_OFFSET_SIENNA;
+ adev->umc.channel_idx_tbl = &umc_v8_7_channel_idx_tbl[0][0];
+ adev->umc.funcs = &umc_v8_7_funcs;
+ break;
+ default:
+ break;
+ }
+}
There are some final bits missing to confirm this, but it seems like 2048 bit HBM will be present for Sienna (at least some variants)
Eventually the bus width is read from the atombios, but 2048 would be the max. (2048-bit Samsung Flashbolt would give 819 GB/s, or Hynix HBM2E an astounding 921 GB/s, but if anything these would probably be downclocked versions.)

In comparison, for Arcturus:
C++:
/* HBM  Memory Channel Width */
#define UMC_V6_1_HBM_MEMORY_CHANNEL_WIDTH    128
/* number of umc channel instance with memory map register access */
#define UMC_V6_1_CHANNEL_INSTANCE_NUM        4
/* number of umc instance with memory map register access */
#define UMC_V6_1_UMC_INSTANCE_NUM        8
/* total channel instances in one umc block */
#define UMC_V6_1_TOTAL_CHANNEL_NUM    (UMC_V6_1_CHANNEL_INSTANCE_NUM * UMC_V6_1_UMC_INSTANCE_NUM)
/* UMC regiser per channel offset */
#define UMC_V6_1_PER_CHANNEL_OFFSET_VG20    0x800
#define UMC_V6_1_PER_CHANNEL_OFFSET_ARCT    0x400
Bus width = 4096 (4*8*128).
 

Glo.

Diamond Member
Apr 25, 2015
3,919
1,824
136
I know how it will sound, but... how big a chance is there that N21, Sienna Cichlid, has both GDDR6 and HBM2 on package?

Because we have two rumors. N21 has 16 GB of VRAM.

And then: consumer cards do not use HBM2 anymore...
 

uzzi38

Senior member
Oct 16, 2019
894
1,104
96
I know how it will sound, but... how big a chance is there that N21, Sienna Cichlid, has both GDDR6 and HBM2 on package?

Because we have two rumors. N21 has 16 GB of VRAM.

And then: consumer cards do not use HBM2 anymore...
Yeah, it's happening. Personally I don't think those HBM2 versions will come to consumers though, but that's a guess.

I think the consumer market will be restricted to 384-bit GDDR6
 

Krteq

Senior member
May 22, 2015
830
356
136
I know how it will sound, but... how big a chance is there that N21, Sienna Cichlid, has both GDDR6 and HBM2 on package?

Because we have two rumors. N21 has 16 GB of VRAM.

And then: consumer cards do not use HBM2 anymore...
Well... those were just rumors.

These AMDGPU code commits are the first concrete info about the memory controller used in the Sienna Cichlid GPU.
 

Stuka87

Diamond Member
Dec 10, 2010
4,831
538
126
A memory controller that can support both types of memory would have to be huge. They might have two versions of what is basically the same chip but with different memory controllers. But I don't see them wasting die space for two memory controllers, plus the pathing for both. It would be a nightmare from a layout perspective.
 

Olikan

Golden Member
Sep 23, 2011
1,943
93
91
Mmm, an Apple exclusive?... Didn't we have a recent rumour that RDNA2 might be ~470mm2, while the original rumour was 505mm2?

505 vs ~470...Navi10 vs navi12... GDDR6 vs HBM
 

DisEnchantment

Senior member
Mar 3, 2017
578
1,077
106
Mmm, an Apple exclusive?... Didn't we have a recent rumour that RDNA2 might be ~470mm2, while the original rumour was 505mm2?

505 vs ~470...Navi10 vs navi12... GDDR6 vs HBM
A 384-bit GDDR6 controller+PHY would take around ~100mm2 of die space on N7. A 2048-bit HBM2 PHY would take around ~26mm2 (see link below).
So it still does not add up (for the 505 vs ~470). There are possibilities, but we don't know which exactly.

1595857789292.png
 

soresu

Golden Member
Dec 19, 2014
1,324
520
136
A 384-bit GDDR6 controller+PHY would take around ~100mm2 of die space on N7. A 2048-bit HBM2 PHY would take around ~26mm2 (see link below).
So it still does not add up (for the 505 vs ~470). There are possibilities, but we don't know which exactly.

View attachment 26852
Hmmm, here was me thinking that the HBM controllers were bigger.

Lesson learned.
 

Olikan

Golden Member
Sep 23, 2011
1,943
93
91
384bit GDDR6 controller+PHY would take around ~100mm2 of die space on N7. 2048bit HBM2 would take around ~26mm2 (see link below).
So it still does not add up (for the 505 vs ~470.) There are possibilities but we dont which exactly.

View attachment 26852
IT FITS!
The leak was from MLID: 427mm^2, 72 CU, 2.15 GHz boost.

I messed up.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,019
1,310
136
Does it?
Isn't the new Xbox SoC just 360mm^2? It has a 56 CU RDNA2 GPU, 8 cores, a 320-bit GDDR6 memory bus, southbridge and so on.
Two Renoir-sized CCXs are pretty small, and remember a regular GPU has an x16 PCIe interface; the consoles, if they have a southbridge, would likely only need 12 lanes and none of the other misc I/O.

edit: here is the PS4 Pro - not much "uncore" about it:
https://flic.kr/p/JkEiRK
 
