Wouldn't 3xR9700 32GB be better?
I thought about it, but it didn't add up. I don't need this build for LLM inference or for quantizing very large multimodal/purpose-built models; I have the Tenstorrent box for that. Also, the R9700 was still an unknown quantity.
In my case, RDNA4 would only have made sense if I needed INT4 quantization, where it is vastly superior. But this build is specifically for rasterized compute on FP16 and BF16 workloads, and it predominantly needs high bandwidth and VRAM for pose estimation, 3D mocap, and short animated renders at high FPS. 864 GB/s with 48 GB of VRAM per card means larger batch sizes, speed, and accuracy without compromises or workarounds. It works great with ROCm and ProRender. It will thump the R9700 in this case.
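
For anyone who wants to sanity-check the batch-size point, here is a rough back-of-envelope sketch in Python. Only the 48 GB, 32 GB, and 864 GB/s figures come from this thread; the model footprint, per-sample memory, reserve, and working-set size are made-up placeholders, and the function names are just mine for illustration, so plug in your own numbers.

```python
def max_batch_size(vram_gb, model_gb, per_sample_gb, reserve_gb=2.0):
    """Largest batch that fits once weights and a fixed reserve are subtracted."""
    free_gb = vram_gb - model_gb - reserve_gb
    if free_gb <= 0:
        return 0
    return int(free_gb // per_sample_gb)


def min_stream_time_ms(working_set_gb, bandwidth_gb_s):
    """Lower bound on the time to read a working set once at peak bandwidth."""
    return working_set_gb / bandwidth_gb_s * 1000.0


# Hypothetical workload numbers, NOT measurements from this build.
MODEL_GB = 6.0        # assumed weights + framework overhead
PER_SAMPLE_GB = 0.5   # assumed activation memory per sample at FP16/BF16

for vram_gb in (48, 32):  # per-card VRAM figures mentioned in the thread
    batch = max_batch_size(vram_gb, MODEL_GB, PER_SAMPLE_GB)
    print(f"{vram_gb} GB card: batch size up to {batch}")

# Best-case time to stream an assumed 40 GB working set at 864 GB/s.
print(f"{min_stream_time_ms(40.0, 864.0):.1f} ms per full pass (best case)")
```

This ignores fragmentation, kernel efficiency, and anything compute-bound, so treat it as an upper bound on batch size and a lower bound on time, not a benchmark.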