32 GB for AT3 looks like overkill, and it would be really odd when AT2 will feature only 18/24 GByte of GDDR7 memory.
Maybe, but it would also allow AMD to ask a (relatively) high price for the AT3 top dog with LPDDR6, and to cut the number of PCIe lanes even on AT3, which they didn't do on N44.
It would also allow running local LLMs on it.
- 508 GByte/s should be enough for AT3. AT2 with 192-bit GDDR7 running at 32 Gbps will have 768 GB/s of bandwidth, so +50% CUs and +50% bandwidth would match perfectly.
- And then use 16 or 24 GByte for AT3
We don't know yet whether desktop AT2 will get more than the 64 active CUs the leaked MLID slide suggested; if it stays at 64, that would only be 33% more CUs.
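For clarity, here's the back-of-the-envelope math behind those percentages as a small Python sketch. All inputs are the rumoured numbers from this thread (508 GB/s for AT3, 32 Gbps GDDR7, 64 active CUs on AT2), not confirmed specs, and the 48-CU AT3 is just what the "+33%" figure implies:

```python
# Rough bandwidth/CU scaling check (all inputs are rumoured figures)

def bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    # Peak bandwidth in GB/s = bus width (bits) * per-pin data rate (Gbps) / 8
    return bus_width_bits * data_rate_gbps / 8

at2_bw = bandwidth_gbs(192, 32.0)   # 192-bit GDDR7 @ 32 Gbps
print(at2_bw)                       # 768.0 GB/s
print(at2_bw / 508)                 # ~1.51 -> roughly +50% over the assumed AT3 bandwidth

# CU scaling: 64 active CUs on AT2 (per the leaked slide) vs. the 48-CU AT3
# implied by the "+33%" above; 72 CUs would be needed for a clean +50%.
print(64 / 48)                      # ~1.33
print(72 / 48)                      # 1.5
```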
There are many 48 Gbit LPDDR5X-9600 x64 packages listed. It depends on whether Samsung can crank up the speed of such modules.
They're lower capacity because they're older generations on older, already fairly mature processes, so there's unlikely to be much of an improvement.
And the presented 12.7 GT/s module uses only 16 Gbit chips:
https://ieeexplore.ieee.org/document/10904794
- 8 * 16 Gbit = 16 GByte
16 GByte for both AT3 and AT4 would be sufficient for gaming cards.
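Just to spell out the capacity side, a trivial sketch; the 16 Gbit density is from the linked paper, and the 16-die case is purely hypothetical, only there to show what 32 GByte would take with such chips:

```python
# Module/card capacity = number of dies * die density (Gbit) / 8
def capacity_gbyte(num_dies: int, density_gbit: int) -> float:
    return num_dies * density_gbit / 8

print(capacity_gbyte(8, 16))    # 16.0 GByte - the presented 12.7 GT/s module
print(capacity_gbyte(16, 16))   # 32.0 GByte - hypothetical, would need twice the dies
```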
Reading the description of that 12.7 GT/s LP5X, and considering that only Samsung makes it, the question is what its yield, volume, required voltage and - most importantly - price per GB will look like in practice.
LP6-10667 offers higher bandwidth per channel (equivalent to a hypothetical 16 GT/s LP5X), probably needs less voltage, and will be produced by all 3 memory manufacturers, so it will probably be cheaper per GB.
So even cutting AT3's interface to 75% width and going with 24 GB of LP6-10667@288bit might still be a better overall solution than 16 GB of this "Ultra-Pro" (and probably also ultra-expensive) Samsung-only LP5X-12700.
LP6-10667@288bit would still give 480 GB/s, and 24 GB carries less risk of the PCIe interface ever becoming a bottleneck.
A full-config AT3 will likely perform around the 9070 and has only 8 PCIe lanes, so putting only 16 GB on it may actually be risky.
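To back up the per-channel remark above, a quick sketch assuming the usual 16-bit LP5X channels and 24-bit LP6 channels (the GPU configs themselves remain speculation):

```python
# Per-channel peak bandwidth in GB/s = channel width (bits) * data rate (Gbps) / 8
def channel_bw_gbs(channel_width_bits: int, data_rate_gbps: float) -> float:
    return channel_width_bits * data_rate_gbps / 8

print(channel_bw_gbs(16, 12.7))    # 25.4  GB/s - Samsung's 12.7 GT/s LP5X
print(channel_bw_gbs(16, 16.0))    # 32.0  GB/s - hypothetical 16 GT/s LP5X
print(channel_bw_gbs(24, 10.667))  # ~32.0 GB/s - LP6-10667, same per-channel figure
```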
Edit:
I am wondering what RDNA5's Universal Compression will bring us. If data can be compressed by, let's say, 1.3x on average, that would mean a net 1.3x more effective DRAM capacity.
Are we sure it works that way?
We've had DeltaColorCompression and other internal compression on GPUs, with ongoing improvements, for over a decade, but it never really reduced VRAM capacity requirements in any noticeable way; it only improved bandwidth efficiency.
The only way it could reduce capacity needs would be if the data were actually stored compressed in VRAM.
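A toy example of why that distinction matters; the 1.3x ratio is the hypothetical number from the post above, not anything confirmed for RDNA5:

```python
footprint_gb = 16.0   # uncompressed working set
ratio = 1.3           # assumed average compression ratio (hypothetical)

# If data is actually stored compressed in VRAM, the resident footprint shrinks:
print(footprint_gb / ratio)   # ~12.3 GB resident -> ~1.3x effective capacity

# If compression only applies to transfers (classic DCC-style), bandwidth
# improves but the full footprint still has to be resident:
print(footprint_gb)           # 16.0 GB resident -> no capacity saving
```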