STREAM result for Nehalem-EX from SGI

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
This is a 128-socket, 1024-core Nehalem-EX Xeon 7560 system.

http://www.cs.virginia.edu/stream/stream_mail/2010/0006.html

Function    Rate (MB/s)     Avg time   Min time   Max time
Copy:       1928142.3878    0.1708     0.1699     0.1746
Scale:      1933831.5239    0.1704     0.1694     0.1732
Add:        2209714.7393    0.2235     0.2224     0.2270
Triad:      2212717.3205    0.2228     0.2221     0.2251

With Triad reaching 2212GB/s across 128 sockets, that works out to about 17GB/s per socket, so a 2-socket system should be able to get about 34GB/s in total, which is much better than the Dell system Anandtech reviewed. The real results would probably be higher at 2S once we account for scalability losses. That is over 40% more bandwidth than the Dell system, and close to the Westmere-EP result.
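For anyone who wants to check the arithmetic, here is a quick sketch of the scaling estimate. It assumes bandwidth divides evenly across the 128 sockets (a simplification, since a large NUMA system loses some efficiency), and the function name is just illustrative.

```python
# Triad rate from the STREAM table above; STREAM reports MB/s with 1 MB = 10^6 bytes.
TRIAD_MB_PER_S = 2212717.3205
SOCKETS = 128


def per_socket_gb_per_s(total_mb_per_s, sockets):
    """Evenly divide the aggregate rate across sockets and convert MB/s to GB/s."""
    return total_mb_per_s / sockets / 1000.0


per_socket = per_socket_gb_per_s(TRIAD_MB_PER_S, SOCKETS)
two_socket = 2 * per_socket  # naive estimate for a 2S box, ignoring scaling losses

print(f"per socket: {per_socket:.1f} GB/s")
print(f"2 sockets:  {two_socket:.1f} GB/s")
```

Since the even split ignores scaling losses on the big machine, the 2S figure is probably a floor rather than a ceiling.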

It's probably very pricey, but the memory performance is definitely not lacking in the way the initial results suggested. The Dell system was somehow not using the platform to its full potential.
 

JFAMD

Senior member
May 16, 2009
565
0
0
2-socket AMD systems are getting 50-55GB/s in STREAM at a significantly lower price. I have not seen a 2P result on Beckton that gets anywhere close to that. The real challenge is that Beckton only supports 1066 memory, whereas MC is capable of running 1333 memory.

Westmere is 3 channels of 1333 and Beckton is 4 channels of 1066. Realistically, those two should perform in the same general range. What you gain from the extra channel you lose to the slower memory and to the memory buffers, which add latency to the whole thing.
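A rough sketch of the theoretical peak per socket for the two configurations above. It takes the channel counts and speeds from this post at face value and assumes a standard 64-bit (8 bytes per transfer) DDR3 channel; it ignores the memory buffers and any latency effects, so real STREAM numbers will come in lower.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit wide DDR3 channel


def peak_gb_per_s(channels, mega_transfers_per_s):
    """Theoretical peak bandwidth per socket in GB/s (1 GB = 10^9 bytes)."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000.0


westmere = peak_gb_per_s(3, 1333)  # 3 channels of 1333
beckton = peak_gb_per_s(4, 1066)   # 4 channels of 1066

print(f"Westmere: {westmere:.1f} GB/s per socket")
print(f"Beckton:  {beckton:.1f} GB/s per socket")
```

On paper the two land within a few GB/s of each other, which is the "same general range" point.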
 

VirtualLarry

No Lifer
Aug 25, 2001
56,347
10,048
126
How do they get 1024 cores working together? I thought that APIC maxed out at 255 cores?
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
How do they get 1024 cores working together? I thought that APIC maxed out at 255 cores?
Skimming through the SW dev. manual from Intel (10.4.6 and figure 10.6), there seems to be an x2APIC mode that uses 32 bits for the ID instead of 8.

10.12 in the manual describes it briefly.
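To put numbers on the ID width, here is a small sketch. It assumes the all-ones ID is reserved for broadcast in both modes, which is where the 255 figure comes from; the helper name is made up for illustration.

```python
def addressable_ids(id_bits):
    """Number of distinct IDs, assuming the all-ones value is reserved for broadcast."""
    return 2 ** id_bits - 1


xapic = addressable_ids(8)    # legacy 8-bit APIC ID
x2apic = addressable_ids(32)  # 32-bit x2APIC ID
cores = 1024                  # core count of the SGI system in this thread

print(f"8-bit IDs:  {xapic}")
print(f"32-bit IDs: {x2apic}")
print(f"1024 cores fit in 8-bit IDs:  {cores <= xapic}")
print(f"1024 cores fit in 32-bit IDs: {cores <= x2apic}")
```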