General remarks
Although the only theoretical requirement for GPUDirect RDMA to work between a third-party device and an NVIDIA GPU is that they share the same root complex, bugs (mostly in chipsets) can cause it to perform badly, or not work at all, in certain setups.
We can distinguish between three situations, depending on what is on the path between the GPU and the third-party device:
PCIe switches only
single CPU/IOH
CPU/IOH <-> QPI/HT <-> CPU/IOH
The first situation, where there are only PCIe switches on the path, is optimal and yields the best performance. The second one, where a single CPU/IOH is involved, works but yields worse performance (peer-to-peer read bandwidth in particular has been shown to be severely limited on some processor architectures). Finally, in the third situation, where the path traverses a QPI/HT link, performance may be extremely limited, or the transfer may not work reliably at all.
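As a rough illustration of how one might tell the three situations apart on a Linux host, the sketch below classifies a GPU/device pair from their resolved sysfs paths and NUMA nodes. It is a heuristic under stated assumptions, not an NVIDIA tool: the helper names (pci_chain, classify_path, sysfs_info) are hypothetical, it assumes that a shared upstream bridge means the traffic stays within PCIe switches, and it assumes that differing NUMA nodes mean the path crosses a QPI/HT link.

```python
import os


def pci_chain(sysfs_path):
    """Split a resolved sysfs device path into (root complex, bridge/device chain)."""
    parts = [p for p in sysfs_path.split("/") if p]
    # The host bridge appears as a component such as "pci0000:00".
    idx = next(i for i, p in enumerate(parts) if p.startswith("pci"))
    return parts[idx], parts[idx + 1:]


def classify_path(gpu_path, dev_path, gpu_numa=-1, dev_numa=-1):
    """Heuristically name which of the three situations applies (assumption-based)."""
    gpu_root, gpu_chain = pci_chain(gpu_path)
    dev_root, dev_chain = pci_chain(dev_path)
    if gpu_root == dev_root:
        # Count upstream bridges shared by both devices; the last
        # component of each chain is the endpoint itself, so skip it.
        shared = 0
        for a, b in zip(gpu_chain[:-1], dev_chain[:-1]):
            if a != b:
                break
            shared += 1
        if shared > 0:
            return "PCIe switches only"
        return "single CPU/IOH"
    # Different host bridges: assume one socket if the NUMA nodes match
    # (including the case where both are reported as -1, i.e. unknown).
    if gpu_numa == dev_numa:
        return "single CPU/IOH"
    return "CPU/IOH <-> QPI/HT <-> CPU/IOH"


def sysfs_info(bdf):
    """Read the resolved path and NUMA node for a PCI address like '0000:03:00.0'."""
    link = os.path.join("/sys/bus/pci/devices", bdf)
    with open(os.path.join(link, "numa_node")) as f:
        numa = int(f.read().strip())
    return os.path.realpath(link), numa
```

On a real machine, the two arguments pairs would come from sysfs_info() called with the PCI addresses of the GPU and of the third-party device. The output of `nvidia-smi topo -m` gives a comparable view and is the better reference when it is available.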
Read more at: http://docs.nvidia.com/cuda/gpudirect-rdma/index.html