
STREAM result for Nehalem-EX from SGI

IntelUser2000

Elite Member
This is a 128-socket, 1024-core Nehalem-EX Xeon 7560 system.

http://www.cs.virginia.edu/stream/stream_mail/2010/0006.html

Function   Rate (MB/s)    Avg time   Min time   Max time
Copy:      1928142.3878   0.1708     0.1699     0.1746
Scale:     1933831.5239   0.1704     0.1694     0.1732
Add:       2209714.7393   0.2235     0.2224     0.2270
Triad:     2212717.3205   0.2228     0.2221     0.2251

With Triad results reaching about 2212 GB/s across 128 sockets, that works out to roughly 17 GB/s per socket, so a 2-socket system should be able to get about 34 GB/s, which is much better than the Dell system Anandtech reviewed. The real results would probably be higher at 2S, if we account for scalability losses. That is over 40% more bandwidth than the Dell system, and close to the Westmere-EP result.
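As a rough check of the scaling arithmetic, here is a small sketch that divides the reported Triad rate evenly across sockets. It assumes perfectly linear scaling (which the post itself notes is pessimistic for a 2S box) and that STREAM's MB means 10^6 bytes; the function name is only illustrative.

```python
# Triad rate from the linked STREAM submission, in MB/s (10^6 bytes/s).
TRIAD_MB_S = 2212717.3205
SOCKETS = 128


def scaled_bandwidth_gb_s(total_mb_s, total_sockets, target_sockets):
    """Linearly scale an aggregate STREAM rate down to a smaller socket count.

    Assumption: bandwidth divides evenly per socket, ignoring scaling losses.
    """
    per_socket_mb_s = total_mb_s / total_sockets
    return per_socket_mb_s * target_sockets / 1000.0


per_socket = scaled_bandwidth_gb_s(TRIAD_MB_S, SOCKETS, 1)
two_socket = scaled_bandwidth_gb_s(TRIAD_MB_S, SOCKETS, 2)

print(f"per socket: {per_socket:.1f} GB/s")   # about 17.3 GB/s
print(f"2 sockets:  {two_socket:.1f} GB/s")   # about 34.6 GB/s
```

So the roughly 34 GB/s figure is the estimate for a whole 2-socket system, not for a single socket.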

It's probably very pricey, but the memory performance is definitely not lacking, contrary to what the initial results suggested. The Dell system was somehow not using the platform to its full potential.
 
2-socket AMD systems are getting 50-55 GB/s in STREAM at a significantly lower price. I have not seen a 2P result on Beckton that gets anywhere close to that. The real challenge is that they only support 1066 memory, whereas MC is capable of running 1333 memory.

Westmere is 3 channels of 1333 and Beckton is 4 channels of 1066. In reality, those two should perform in the same general range. What you gain from the extra channel you lose to slower memory and to the memory buffers, which add latency to the whole thing.
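To put numbers on that comparison, here is a back-of-the-envelope sketch of theoretical peak bandwidth per socket. It assumes 64-bit (8-byte) wide channels and the nominal 1333 and 1066 MT/s rates, and it ignores buffer latency and any other real-world losses, so it is an upper bound rather than a STREAM prediction.

```python
BYTES_PER_TRANSFER = 8  # assumption: each DDR3 channel is 64 bits wide


def peak_gb_s(channels, mega_transfers_per_s):
    """Theoretical peak bandwidth in GB/s (10^9 bytes/s) for one socket."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000.0


westmere = peak_gb_s(channels=3, mega_transfers_per_s=1333)
beckton = peak_gb_s(channels=4, mega_transfers_per_s=1066)

print(f"Westmere, 3 x 1333: {westmere:.1f} GB/s")  # about 32.0 GB/s
print(f"Beckton,  4 x 1066: {beckton:.1f} GB/s")   # about 34.1 GB/s
print(f"Beckton / Westmere: {beckton / westmere:.2f}")
```

On paper the two configurations land within about 7% of each other, which fits the point that they should end up in the same general range.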
 
How do they get 1024 cores working together? I thought that APIC maxed out at 255 cores?
Skimming through the software developer's manual from Intel (section 10.4.6 and figure 10.6), there seems to be an x2APIC mode that uses 32 bits for the ID instead of 8.

Section 10.12 in the manual describes it briefly.
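For a sense of scale, the sketch below compares the ID space of an 8-bit and a 32-bit APIC ID, and checks whether a Linux machine reports the `x2apic` CPU flag. It assumes the all-ones ID is reserved for broadcast in both modes, and that `/proc/cpuinfo` is available; the helper name `has_x2apic` is made up for this example.

```python
def usable_apic_ids(id_bits):
    """Number of distinct IDs, assuming the all-ones value is broadcast-only."""
    return 2 ** id_bits - 1


def has_x2apic(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists x2apic."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, flags = line.partition(":")
            if "x2apic" in flags.split():
                return True
    return False


print("xAPIC  (8-bit ID): ", usable_apic_ids(8))    # 255
print("x2APIC (32-bit ID):", usable_apic_ids(32))   # 4294967295

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("x2apic flag present:", has_x2apic(f.read()))
    except OSError:
        print("no /proc/cpuinfo here (not Linux?)")
```

The 8-bit limit matches the 255 figure in the question, and a 1024-core machine clearly needs more IDs than that.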
 