For the slide deck: https://videocardz.com/newz/amd-promises-rdna-2-navi-2x-late-2020-confirms-rdna-3-navi-3x
Thanks to the cryptomining boom, we now see both companies moving back to a pre-GPGPU concept.
I can imagine the first thing they will do is nerf Navi10's fp64 capabilities and lots of IF function blocks. Navi10 has a lot more fp64 throughput than Turing.
Not quite. RDNA still has general compute capabilities, it's just not focused on that, so there will likely never be a DP FP RDNA, and it will lack the tensor acceleration of CDNA too, so ML will not run nearly as well or efficiently on RDNA.
I guess for now CDNA is just a label put on GCN (Vega). The true departure will happen somewhere down the road, my guess is with CDNA3.
CDNA launch video, kinda came outta nowhere seems like. Haven't seen very many people report on it/mention it
Is there a write up of this anywhere?
I did think I'd seen a few people here say that the A100 could somehow dual-purpose its tensor cores to give it a bunch more effective FP performance?
Yes, and they are wrong. I kept asking them to prove it and got crickets.
The entire point of HIP is portability from CUDA.
Arcturus/MI100 high level die shot annotations
This thing must be *massive*
-----------------------------------
Compute Array comparison between Arcturus/CDNA/MI100 and Vega10/GCN5/MI25
Pure compute, no graphics capability
Have you tried using it? There are many corner cases where HIPify doesn't actually work and you'll have to dig through and fix it manually. And as you say, you have to restrict yourself to CUDA 8, which hasn't been a big deal but might be a step back for some people.
Not just to AMD hardware, but also back to CUDA from HIP if you wish, so you can keep a dual-hardware codebase if you don't mind it lagging the CUDA state of the art a bit.
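To make the HIPify back-and-forth a bit more concrete, here is a toy sketch of what name-based source translation looks like and why leftovers end up needing manual attention. It is not the real hipify tool: the mapping table is a tiny hand-picked subset of CUDA runtime names with their HIP counterparts, `toy_hipify` is a made-up helper, and `cudaSomeNewerCall` is a hypothetical name standing in for anything the table doesn't cover.

```python
import re

# Tiny hand-picked subset of CUDA runtime names and their HIP counterparts.
# A real translation table is far larger; this one is only for illustration.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Longest names first so cudaMemcpyHostToDevice is not cut short at cudaMemcpy.
_KNOWN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")\b"
)
# Anything still starting with "cuda" after translation was not in the table.
_LEFTOVER = re.compile(r"\bcuda\w*")


def toy_hipify(source):
    """Rename known CUDA identifiers to HIP and report the ones left behind."""
    translated = _KNOWN.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)
    leftovers = sorted(set(_LEFTOVER.findall(translated)))
    return translated, leftovers


cuda_src = """#include <cuda_runtime.h>
cudaMalloc(&d_buf, n);
cudaMemcpy(d_buf, h_buf, n, cudaMemcpyHostToDevice);
cudaSomeNewerCall(d_buf);  // hypothetical name, not in the table
cudaFree(d_buf);
"""

hip_src, needs_manual_fix = toy_hipify(cuda_src)
print(hip_src)
print("needs manual attention:", needs_manual_fix)
```

The known calls get renamed one-to-one, which is the easy part. Whatever falls outside the table is passed through untouched, and that is roughly the kind of corner case you end up fixing by hand.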
Why is VCN supposedly still included in these? Are they expected to be used for video decode/encode at all? Seems strange when all graphics capability has been stripped out.
For machine learning applications which need to decode video, per the overview:
"the AMD CDNA family retains dedicated logic for HEVC, H.264, and VP9 decoding that is sometimes used for compute workloads that operate on multimedia data, such as machine learning for object detection"
I think people were already estimating the die size to be in the low to mid 700 mm² range based on the size of the HBM PHYs. That seems kind of large, to be honest: it has 50% more CUs than Big Navi, which is estimated to be in the low 500 mm² range with a huge 128 MB LLC, yet it has all of the graphics pipeline stripped out (no TMUs, ROPs, geometry engines, etc.). Are the tensor cores and doubled register files really that space hungry? I wouldn't imagine so.
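As a rough sanity check on that comparison, here is a back-of-the-envelope sketch that scales the Big Navi estimate linearly with CU count. The point values are placeholders picked from the ranges quoted above (assumptions, not measurements), and linear scaling is a deliberately crude model that ignores the LLC, the graphics blocks and the HBM PHYs.

```python
def naive_scaled_area(base_area_mm2, cu_ratio):
    """Scale a die area linearly with CU count (a deliberately crude model)."""
    return base_area_mm2 * cu_ratio


# Placeholder point values taken from the quoted ranges, not measured figures.
big_navi_area_mm2 = 520.0  # "low 500 mm²" (assumed)
arcturus_area_mm2 = 730.0  # "low to mid 700 mm²" (assumed)
cu_ratio = 1.5             # "50% more CUs"

linear_guess = naive_scaled_area(big_navi_area_mm2, cu_ratio)
ratio = arcturus_area_mm2 / linear_guess

print(f"linear scaling guess: {linear_guess:.0f} mm²")
print(f"estimate / linear guess: {ratio:.2f}")
```

With these placeholders the estimate lands a little under pure linear scaling, so the open question is why it isn't much further under, given that the graphics pipeline and the big LLC are gone.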
On a similar node, it must be at least twice as big as Vega 20. Would this be the closest AMD has ever come to a reticle limit chip?
Seem to be winning some big deals, but I guess if you're building a very expensive supercomputer you also have the money to hand-tune the software to run well on the machine. Where this will suffer is selling to cloud vendors who rent them out, since their clients will still prefer CUDA to HIP.