256-bit memory on 2 socket Sandy boards

alyarb

Platinum Member
Jan 25, 2009
2,425
0
76
The ASUS manual is particularly unclear on this point.

If I have two sockets and 4 DIMMs, the ASUS guide wants me to put two DIMMs in the "A and B" banks, and another 2 DIMMs in the "E and F" banks. This comes out looking like 128-bit per CPU, right?

How can each socket get 52 GB/sec of memory bandwidth on only 2 DIMMs per socket?

It would seem we need 8 DIMMs, and not 4, to get the 256-bit bus populated, right?

The only explanation I can think of is that something is being done over QPI to give CPU #2 direct access to the DIMMs hanging off CPU #1. There are certainly no traces running from socket 2 to socket 1's memory banks. In my mind, the memory bus width is fixed like any other piece of geometry; there is a fixed number of wires there...
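
For reference, the ~52 GB/sec figure only works out if all four channels per socket are populated. Quick back-of-the-envelope below, assuming DDR3-1600 (which is roughly where that number has to come from):

Code:
# Rough peak-bandwidth math, assuming DDR3-1600 (1600 MT/s per channel).
# Each channel is 64 bits (8 bytes) wide, so:
per_channel_gb_s = 1600e6 * 8 / 1e9   # 12.8 GB/s per channel

for channels in (2, 4):
    print(f"{channels} channels: {channels * per_channel_gb_s:.1f} GB/s per socket")

# 2 channels: 25.6 GB/s per socket
# 4 channels: 51.2 GB/s per socket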
 

sefsefsefsef

Senior member
Jun 21, 2007
218
1
71
Each DIMM has a 64-bit data bus. I would expect each socket to have a 256-bit memory interface (so 512-bit counting both sockets), but as it stands, your 4 DIMMs are only 256 bits "wide" in total.
 

alyarb

Platinum Member
Jan 25, 2009
2,425
0
76
I understand how DIMMs work physically. I'm asking for someone with relevant experience to confirm or deny whether you need 4 DIMMs per socket to get the 256-bit bus, or whether QPI is capable of unifying the remote memory buses into an effective 256-bit system.

"2x 256-bit" memory might be ok for a dual-GPU card, but last time I checked, 2S servers do not duplicate each CPUs memory. It is unified. So, what does that mean for performance? Am I getting 128-bit performance or 256-bit performance with only 2 DIMMs per socket?
 

gbeirn

Senior member
Sep 27, 2005
451
14
81
With Socket 2011 you need 4 DIMMs per CPU for maximum bandwidth (4 x 64-bit). You are correct: in your scenario you would need a minimum of 8 DIMMs for maximum bandwidth (2 x 4 x 64-bit). If you put 2 per CPU, you still end up with the same aggregate bandwidth as 4 DIMMs on a single CPU (2 x 2 x 64-bit).
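
If it helps, here is the same arithmetic spelled out for the three populations above. This is a sketch assuming DDR3-1600; real throughput lands below these theoretical peaks:

Code:
# Theoretical peaks for the configurations above, assuming DDR3-1600.
# One 64-bit channel at 1600 MT/s moves 12.8 GB/s.
PER_CHANNEL = 12.8  # GB/s

configs = {
    "1 CPU, 4 DIMMs (4 x 64-bit)":           1 * 4 * PER_CHANNEL,
    "2 CPUs, 2 DIMMs each (2 x 2 x 64-bit)": 2 * 2 * PER_CHANNEL,
    "2 CPUs, 4 DIMMs each (2 x 4 x 64-bit)": 2 * 4 * PER_CHANNEL,
}
for name, gb_s in configs.items():
    print(f"{name}: {gb_s:.1f} GB/s aggregate")

# The 2+2 split matches the single-socket peak in aggregate (51.2 GB/s),
# but each socket only sees 25.6 GB/s locally; 4+4 doubles it to 102.4 GB/s.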
 

gbeirn

Senior member
Sep 27, 2005
451
14
81
There is some overhead involved with NUMA access (i.e., CPU 0 accessing CPU 1's memory across the QPI link). I am going to assume, however, that both CPUs need some RAM attached to be detected properly, so if you only have 4 DIMMs, put two on each. It would still be effectively 256-bit, minus whatever overhead you pay crossing from one CPU to the other, but 8 DIMMs would be best.
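
If you do go 2+2 and want to sanity-check that each node actually came up with its own local memory, a quick Linux-side check might look like this (it just reads the standard sysfs NUMA node entries):

Code:
# Linux-only sanity check: list each NUMA node's CPUs and local memory via sysfs.
import glob, os

for node in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
    with open(os.path.join(node, "cpulist")) as f:
        cpus = f.read().strip()
    with open(os.path.join(node, "meminfo")) as f:
        mem_kb = next(line for line in f if "MemTotal" in line).split()[-2]
    print(f"{os.path.basename(node)}: CPUs {cpus}, local MemTotal {mem_kb} kB")

# With 2 DIMMs on each socket you should see two nodes, each with roughly half
# the installed RAM; a node showing 0 kB means its DIMMs were not detected.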
 

NTMBK

Lifer
Nov 14, 2011
10,480
5,897
136
Yup, you really want to have 8 DIMMs to get maximum bandwidth for that platform.