Discussion AMD XDNA AIE and FPGA Speculation and Discussion

moinmoin

Diamond Member
Jun 1, 2017
3,600
5,071
136
Next to Zen, RDNA and CDNA, the Xilinx-contributed XDNA is AMD's next major core IP.


(Source: slide 3 Mark Papermaster presentation)

It will first arrive with the Rembrandt successor Phoenix Point.


(Source: slide 11 Saeid Moshkelani presentation)


AI Engine (AIE) and FPGA are both part of AMD XDNA.



AIE is optimized for typical neural network topologies.



AIE will come to Ryzen and Epyc and make them vastly more capable for AI applications.



AMD's and Xilinx's software stack will be unified to access AI capability across Zen, RDNA, CDNA and XDNA.





My hope is that the above unified stack will also make the code portable to non-AMD systems, increasing the likelihood of it finding its way into common code that then makes use of AIE when available.

(Source: slide 19, 20, 22, 23-25 Victor Peng presentation)

As @nicalandia pointed out, AI Engine is an IP Xilinx already presented earlier. It is believed AMD originally licensed it before the merger talks started.

The website actually has some more information on it not included in the PDF, like the actual output of each of the two types of AIE tile: https://www.xilinx.com/products/technology/ai-engine.html

 

nicalandia

Golden Member
Jan 10, 2019
1,647
1,952
106
I wonder if the AI acceleration module used on Ryzen 7000 is part of the Xilinx IP or if it's using AMD's own AI patent process.



Or is that due to AVX-512 and the RDNA GPU on the IO chiplet?
 

moinmoin

Diamond Member
Jun 1, 2017
3,600
5,071
136
I wonder if the AI acceleration module used on Ryzen 7000 is part of the Xilinx IP or if it's using AMD's own AI patent process.



Or is that due to AVX-512 and the RDNA GPU on the IO chiplet?
To me that sounds like the Xilinx IP indeed. The AI AVX-512 instructions are part of the core; I don't think splitting that off and putting it on a stacked accelerator is really feasible.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,241
3,734
136
The interesting thing to me about the AIE is that it is architected in a PIM-like layout, so it typically hits very close to peak throughput when running real workloads. Efficiency is higher too, since it avoids the constant data movement to and from memory that GPUs suffer from. However, it loses some of that efficiency once the network cannot fit entirely on the device.
In the best case you can stream the entire data set in, and out comes the final inference result, without any trip to main memory to store the results of intermediate layers or activation functions.
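The distinction above can be sketched in a few lines of plain Python. This is purely illustrative, assuming nothing about AMD's actual AIE programming model: the first function mimics GPU-style execution, materializing each layer's full activation buffer in (simulated) main memory, while the second streams each sample through all layers so intermediates never leave the pipeline. The layer definitions and trip counting are invented for the example.

```python
# Illustrative sketch only -- names and structure are hypothetical,
# not AMD's AIE API.

def relu(x):
    return x if x > 0 else 0.0

# Two tiny "layers": scale + ReLU, then bias + ReLU.
LAYERS = [lambda x: relu(2.0 * x), lambda x: relu(x - 1.0)]

def infer_with_roundtrips(samples):
    """GPU-style: run one layer at a time over the whole batch,
    writing the full intermediate activation buffer back each time."""
    buf = list(samples)           # stand-in for a main-memory buffer
    trips = 0
    for layer in LAYERS:
        buf = [layer(x) for x in buf]
        trips += 1                # one write-back per layer
    return buf, trips

def infer_streaming(samples):
    """Dataflow-style: stream each sample through all layers, so
    intermediates stay 'on chip' (here: a local variable)."""
    out = []
    for x in samples:
        for layer in LAYERS:
            x = layer(x)          # intermediate never hits the buffer
        out.append(x)
    return out, 1                 # only final results are written back

data = [0.5, -1.0, 2.0]
a, trips_a = infer_with_roundtrips(data)
b, trips_b = infer_streaming(data)
assert a == b                     # same numerics, different data movement
print(trips_a, trips_b)           # 2 1
```

The numerics are identical either way; what differs is how many times the (simulated) memory buffer is touched, which is the efficiency argument made above.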

Radically different compared to what I have seen in some ARM SoCs I have worked with, which seem to be evolutions of DSP SIMD VLIW blocks extended to handle low-precision arithmetic (e.g. mDSP, cDSP, SLPI, aDSP, if you have ever heard of these in the public domain).

In the slide you can also see they are used for signal processing. In Android you can load an algorithm onto such a block, and it can trigger a wakeup of the CPU when it detects a hotword from the mic or recognizes an image from the camera. That is very good for power efficiency: you can put the CPU to sleep and let the block work in the background.
Skype, MS Teams and Webex integrate echo and noise cancellation, so it would be interesting if AMD could get MS on board with this. The same goes for video conferencing, where you can blur or change the background.
Very useful in work-from-home scenarios or when working across different geographies.
It remains to be seen whether other software will integrate support for this AIE, but there's potential.

But I think AMD is lacking an AOP block.
 