
Question Speculation: RDNA2 + CDNA Architectures thread

edited this post, because it is not so meaningful in light of recent Mesa updates. :expressionless:

It seems Sienna and Navy are both GFX1030: Sienna = HBM and Navy = GDDR6. Both Sienna and Navy are Big Navi.
At least it seems the two are similar variants (like Polaris 12/11/Vega M were).


C:
    case CHIP_RENOIR:
        return "gfx909";
    case CHIP_ARCTURUS:
        return "gfx908";
    case CHIP_NAVI10:
        return "gfx1010";
    case CHIP_NAVI12:
        return "gfx1011";
    case CHIP_NAVI14:
        return "gfx1012";
    case CHIP_SIENNA_CICHLID:
    case CHIP_NAVY_FLOUNDER:
        return "gfx1030";

So it is untrue that Navy Flounder is Navi22/GFX1031, as reported here.
 
Two Renoir-sized CCXs are pretty small, and remember a regular GPU has an x16 PCIe interface; the consoles, if they have a south bridge, would likely only need 12 lanes and none of the other misc I/O.

edit: here is the PS4 Pro; there is not much "uncore" about it.
Two Renoir CCXs are ~40-50 mm², and a GDDR6 PHY plus memory controller is much bigger than an HBM2 PHY; just look at Navi10 vs Navi12. There is also a die shot of Renoir here.
So there is no reason why a 72-CU GPU should be much bigger than the Xbox SoC with its 56 CUs. Keep in mind that CUs are very small: about 2.1 mm² each for RDNA1.
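The area argument above can be sanity-checked with some quick arithmetic. All figures here are the thread's own rough numbers (CU area, CCX area, SoC size), not measured die sizes:

```python
# Back-of-the-envelope die-area sketch; inputs are the rough figures
# from the post, not official die measurements.
CU_AREA_RDNA1 = 2.1   # mm^2 per CU (RDNA1, per the post)
XBOX_SOC = 360.0      # mm^2, Xbox Series X SoC (approx.)
CCX_PAIR = 45.0       # mm^2, two Renoir-sized CCXs (~40-50 mm^2)

extra_cus = 72 - 56
extra_cu_area = extra_cus * CU_AREA_RDNA1  # area cost of 16 more CUs

# Dropping the CPU cores frees roughly the CCX area:
discrete_estimate = XBOX_SOC - CCX_PAIR + extra_cu_area
print(f"Extra CU area: {extra_cu_area:.1f} mm^2")          # 33.6
print(f"Rough 72-CU die estimate: {discrete_estimate:.1f} mm^2")  # 348.6
```

So by this crude math, trading two CCXs for 16 extra CUs roughly breaks even, which is the point being made.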
 
Hmmm... no wonder the code names, commit messages, and commit cherry-picking were so obscure and often misleading.

That's the goal - to get upstream support in place before launch without anyone knowing what happened.

On the plus side, ROCm will land for Navi1x/2x.
 
Meaning? Are only the Ti models considered high end?
You can call the top of the performance stack whatever you like.

The terrible Vega was more than 15% faster than the 980 Ti. Being 0 to 15% faster than the 2080 Ti would be another disaster unless it's a relatively small chip, 400 mm² tops. For the rumored 505 mm² chip it would be a joke.
 
You can call the top of the performance stack whatever you like.

The terrible Vega was more than 15% faster than the 980 Ti. Being 0 to 15% faster than the 2080 Ti would be another disaster unless it's a relatively small chip, 400 mm² tops. For the rumored 505 mm² chip it would be a joke.

If it's cheaper, why complain? If you think people are going to continue to pay high GPU prices in the midst of COVID-19 and other world factors, you have another thing coming. The consoles are finally pretty good in the graphics department.
 
If it's cheaper, why complain? If you think people are going to continue to pay high GPU prices in the midst of COVID-19 and other world factors, you have another thing coming. The consoles are finally pretty good in the graphics department.

I think it would be a problem for a few reasons:

- It's not a sustainable business model in this market to pump out inferior products and sell them at narrower margins than the competition. AMD being perpetually behind NV means they either exit the discrete GPU space or become irrelevant in due time.

- Lower margins mean less money to put into the intangibles, like the additional software features that increasingly separate the wheat from the chaff.

- Not having the halo product in this space means that your competition will effectively always be able to match your performance and will likely have more room to play with price, which comes back to the first point.

- Most distressing, IMO, is that it means AMD doesn't have the mental muscle or will to catch NV even if NV makes a misstep (if all the rumors surrounding the node drama are true).
 
Coreteks seems to think Big Navi isn't really that big. Supposedly only 15% faster than the 2080 Ti in AMD-optimized titles:
I'm a little skeptical... Only 15% faster, and that's with AMD-optimized titles?

According to ComputerBase, the 2080 Ti FE is on average 53% faster than the 5700XT at 4K across a wide variety of games. If we assume 72 CUs and similar clocks, Big Navi should be 80% faster than a 5700XT, which makes it ~18% faster than the 2080 Ti. If you add in IPC gains or the chance that there's actually 80 CUs, not 72, then it ought to be closer to 30% faster on average.

https://www.computerbase.de/thema/grafikkarte/rangliste/#diagramm-performancerating-3840-2160
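That estimate is just linear CU scaling; here is the arithmetic spelled out (assuming performance scales linearly with CU count at equal clocks, which is optimistic, and using the 1.53x ComputerBase figure quoted above):

```python
# Sketch of the CU-scaling estimate above. Assumes perf scales linearly
# with CU count at equal clocks; 1.53x is the ComputerBase 4K average.
cu_5700xt = 40
speedup_2080ti = 1.53  # 2080 Ti FE vs 5700 XT at 4K

for big_navi_cus in (72, 80):
    vs_5700xt = big_navi_cus / cu_5700xt      # 1.80x / 2.00x the 5700 XT
    vs_2080ti = vs_5700xt / speedup_2080ti
    print(f"{big_navi_cus} CUs: {vs_5700xt:.2f}x 5700 XT, "
          f"{(vs_2080ti - 1) * 100:+.0f}% vs 2080 Ti")
```

This reproduces the ~18% (72 CUs) and ~30% (80 CUs) figures, before any IPC gains.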
 
Does it?
Isn't the new Xbox SoC just 360 mm²? It has a 56-CU RDNA2 GPU, 8 CPU cores, a 320-bit GDDR6 memory bus, a south bridge and so on.
You can't measure a dedicated GPU's die size from a custom APU.
The Xbox APU doesn't have many units that a dedicated GPU will have: it has a less sophisticated display engine, no PCIe root complex, probably no FP64 units, a less complex encode/decode engine, a memory controller without ECC support, and probably less cache.
 
I'm a little skeptical... Only 15% faster, and that's with AMD optimized titles?
53% is a best-case scenario for the RTX 2080 Ti. On TechPowerUp's list the 2080 Ti is only 34% faster at 1080p, 42% at 1440p and 49% at 2160p. But at high resolutions the 2080 Ti has more shaders, ROPs and bandwidth than the RX 5700 XT.
 
People will get good prices on GPUs if people keep buying Nvidia.
I had a 5970 and a 7970. I only miss the 290X; that was the last good AMD high-end card, and that was freaking 2013.

Then a 980 Ti and a 1080 Ti. If AMD launches a third turd in a row (after Fiji and Vega), those who buy high end will continue to have only Nvidia as an option.
 
I'm a little skeptical... Only 15% faster, and that's with AMD optimized titles?

The 5700 XT is clocked way past its efficiency point. Scale it up size-wise and the TDP very quickly starts to limit your performance. You're much more likely to scale up at 5700-style clocks.

Even granting the marketing's, and so a priori dubious, 50% efficiency gain claim.

We'll see.
 
Then 980Ti and 1080Ti. If AMD lauch the 3rd turd in a row (Fiji, Vega) who buys high end will continue to only have Nvidia as option.
Are you saying the GTX 980 Ti was also a turd, since it was only 2-5% faster than the Fury X?
Only the GTX 1080 Ti was 30% faster than Vega 64, but it was also more expensive than Vega 64. And Vega 64 was such a turd that Apple used it in their expensive iMacs.
 
It seems Sienna and Navy are both GFX1030. Sienna = HBM and Navy = GDDR6. Both Sienna and Navy are Big Navi.
Not surprising. Apple will use those, so HBM is a must.

^ Two modules in a 2048-bit configuration can give 921.6 GB/s of bandwidth, which would be enough for "Big Navi".
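The 921.6 GB/s figure falls out of the bus width if you assume HBM2E running at 3.6 Gbps per pin (my assumption; the post doesn't name the pin speed):

```python
# HBM bandwidth arithmetic: bus width (bits) * per-pin data rate (Gbps) / 8.
# 3.6 Gbps per pin is an assumed HBM2E speed grade, not stated in the post.
bus_width_bits = 2048   # two HBM2E stacks, 1024 bits each
pin_speed_gbps = 3.6    # assumed per-pin rate

bandwidth_gbs = bus_width_bits * pin_speed_gbps / 8
print(f"{bandwidth_gbs:.1f} GB/s")   # 921.6
```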
 
The CDNA MI100 seems to be 46 TFLOPS for FP16.
46 vs 30 is about +50%.

That is faster than Nvidia's A100 for SGEMM, and with a smaller die.
But I'm not sure if this score for the A100 is correct.
According to the specs the A100 should be faster (it also has more transistors).

MI60 https://www.techpowerup.com/gpu-specs/radeon-instinct-mi60.c3233
MI100 https://www.techpowerup.com/gpu-specs/amd-mi100.g927
A100 https://www.techpowerup.com/gpu-specs/a100-pcie.c3623
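For sanity-checking the TFLOPS numbers on spec pages like those, peak throughput is just shaders × clock × 2 FLOPs per FMA × a precision rate multiplier. A small helper, checked against the MI60's published figures (4096 shaders, ~1.8 GHz boost, 2:1 FP16 rate):

```python
# Peak-throughput helper: shaders * clock (GHz) * 2 FLOP/FMA * rate,
# returned in TFLOPS. "rate" is the precision multiplier vs FP32
# (e.g. 2 for a 2:1 FP16 path, 0.5 for a 1:2 FP64 path).
def peak_tflops(shaders, clock_ghz, rate=1.0):
    return shaders * clock_ghz * 2 * rate / 1000.0

print(f"MI60 FP32: {peak_tflops(4096, 1.8):.1f} TFLOPS")          # ~14.7
print(f"MI60 FP16: {peak_tflops(4096, 1.8, rate=2):.1f} TFLOPS")  # ~29.5
```

Both match TechPowerUp's MI60 listing, which makes this a reasonable way to test whether the rumored MI100 numbers are internally consistent.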
 
I don't get this one.
If the MI100 is 2.4x faster in FP32 and 2x slower in FP16, where the A100's FP16 is 4x its FP32, it would mean the MI100's FP32 is faster than its FP16? OK, the slide says "up to", but still, it would mean FP32 = FP16 in TFLOPS?
I mean, the MI50/60 have an FP16:FP32 ratio of 2:1, so FP16 on the MI100 should be at least the same as on the A100. Similar goes for FP64.

I mean, this is CDNA; this chip should be optimized for compute. It makes no sense for it to have lower FP64 and FP16 performance than the MI60.

Edit:
So I just found this one from a few days ago.
Here they say it has 9.5 TFLOPS FP64 (same as the A100) and 150 TFLOPS FP16 (2x more than the A100).
So what am I missing?
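Plugging the claimed ratios into the A100's published baseline numbers (19.5 TFLOPS FP32, 312 TFLOPS tensor FP16) shows where the confusion sits:

```python
# Ratio check using A100's published peak figures.
a100_fp32 = 19.5          # TFLOPS, FP32 (non-tensor)
a100_fp16_tensor = 312.0  # TFLOPS, FP16 tensor ops

mi100_fp32 = 2.4 * a100_fp32   # "2.4x faster in FP32" claim -> 46.8
mi100_fp16_claim = 150.0       # FP16 figure from the article above

print(f"MI100 FP32 (2.4x A100): {mi100_fp32:.1f} TFLOPS")
print(f"MI100 FP16 claim vs A100 tensor FP16: "
      f"{mi100_fp16_claim / a100_fp16_tensor:.2f}x")   # ~0.48x
```

So the "2x slower in FP16" only holds if you compare against the A100's tensor-op peak; 150 TFLOPS is roughly half of 312, not double, which is exactly the objection raised in the next post.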
 
150 TFLOPS FP16 (2x more than A100)
Eh? That's not right; that number is more like half the A100's FP16 (312 TFLOPS). Link.

Though I'm not sure if the A100 number doesn't come from the sparse AI tensor ops peak perf:

[slide: NVIDIA GTC A100 peak-performance figures]


If so, and the 150 TFLOPS FP16 Arcturus figure is accurate, then CDNA1 and Ampere may well be close on certain counts.

With all these numbers floating around (hahaha, pun), we will probably go mad speculating about their meaning until the official AMD presentation and slides give us both accurate numbers and an explanation to fit them.

Given that they just reaffirmed that Zen 3, RDNA2 and CDNA1 are all coming out this year, it's likely a huge mega-announcement is coming that covers it all at once, with deep dives to follow.
 
Something to bear in mind: AMD made the very odd move of talking about CDNA2 before CDNA1 products had even been announced, and I don't think that was simply because of the future HPC/supercomputer contract(s) they had recently won with CDNA2.

It seems that, much as with RDNA1, we will get a not-so-dramatically-impressive first step into the new compute-accelerator (not GPU) paradigm with CDNA1 - but the xDNA2 generations will be the real intended demonstrators of AMD's new diverged accelerator uArch strategy.
 
Eh? That's not right, that number is half the A100 FP16 (312 TFLOPs) more like. Link.
But that is from the Tensor cores, and IIUC those can be used only for DL/NN applications, not for general computing. Or even if they could, does it require additional code changes, or does Nvidia's hardware/software handle it automatically?
 
But that is from Tensor cores, and IIUC, can be used only for DL/NN applications, not for general computing. Or even if it could, does it require additional code change or nVidia's hardware/software handles it automatically?

Tensor cores can only perform specific types of matrix math. They cannot be used for anything outside of that.
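For illustration, the one operation a tensor core does perform is a fused matrix multiply-accumulate, D = A × B + C, on small fixed-size tiles (16×16 FP16 tiles on Volta/Ampere). A plain-Python stand-in for that primitive, not real tensor-core code:

```python
# Illustrative matrix multiply-accumulate (MMA): D = A @ B + C on a
# square tile. Real tensor cores do this in hardware on fixed-size
# FP16 tiles; anything that can't be phrased as MMA can't use them.
def mma(a, b, c):
    n = len(a)
    return [[c[i][j] + sum(a[i][k] * b[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
print(mma(a, b, c))   # [[20, 23], [44, 51]]
```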
 
But that is from Tensor cores, and IIUC, can be used only for DL/NN applications, not for general computing.
Given that, if you go off the A100's FP64 figure of 9.7 TFLOPS ×4 (38.8 TFLOPS), then the two GPUs should be within about 800 GFLOPS of each other for FP16 general-compute use cases (Arcturus being 38 TFLOPS) - assuming those shader cores even do general FP16 compute on the A100, that is.
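A quick check of that 800 GFLOPS gap, deriving the A100's non-tensor FP16 rate as 4× its FP64 figure per the reasoning above:

```python
# Gap between A100's inferred non-tensor FP16 peak and the rumored
# Arcturus FP16 figure. The 4x multiplier is the assumption from the
# post, not an official spec.
a100_fp64 = 9.7
a100_fp16_shader = a100_fp64 * 4   # inferred non-tensor FP16: 38.8 TFLOPS
arcturus_fp16 = 38.0               # rumored Arcturus figure

gap_gflops = (a100_fp16_shader - arcturus_fp16) * 1000
print(f"Gap: {gap_gflops:.0f} GFLOPS")   # 800
```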
 