Intel Nehalem HPC performance

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
http://www.worldscinet.com/ppl/

(It's free to view)

3 systems are compared in 7 major HPC applications.

2P Opteron 8350 Barcelona (2GHz) system, 2-channel DDR2-667, 16GB total
2P Xeon X7350 Tigerton (2.93GHz) system, 2-channel FB-DDR2-667, 16GB total
1P Nehalem (2.8GHz) system, 3-channel DDR3-1333, 24GB total

WITHOUT Simultaneous Multi-Threading (SMT/HT) enabled, the 8-core, 16-thread Nehalem beats the 16-core Barcelona in 6 of the 7 applications (by 10 to 80%, for an average of 45%) and the 16-core Tigerton in 5 of the 7 applications (by 20 to 190%, for an average of 70%).

With SMT, 3 of the applications achieve significant improvements (10%/22%/52%), 2 achieve minor improvements (0-5%), and 2 suffer small decreases in performance (less than 5% with 2P). Turbo Mode was not enabled or tested.

Remember the SAP-SD benchmarks?? Well...

Watch out, Nehalem is coming :).
 

Cogman

Lifer
Sep 19, 2000
10,286
147
106
Ummm, watch out, Nehalem is here. I could be wrong, but Bloomfield is the exact same architecture as Nehalem, just with fewer cores, so this review is somewhat pointless.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Originally posted by: Cogman
Ummm, watch out, Nehalem is here. I could be wrong, but Bloomfield is the exact same architecture as Nehalem, just with fewer cores, so this review is somewhat pointless.

Not really. The only benchmarks out there are PC-based benchmarks. We have yet to see server Nehalems. The server world will show what Nehalem can REALLY do. There are people who doubt Nehalem's capability in servers (the SAP-SD and SPEC CPU results are one example).

In servers, many of the new features that are mostly irrelevant in PCs will be very relevant (TLB changes, 2nd-level Branch Target Buffer, QPI).

This research shows that such big performance increases are going to be seen in lots of applications.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Comparing a 2.8GHz Nehalem system or a 2.93GHz Tigerton system to a 2.0GHz Barcelona system is about as pointless an exercise as it gets if the objective is to make a comparison between AMD and Intel systems that people will be buying in 2009 when Nehalem Xeon is for sale...

Why not 2.7GHz Shanghai at least? Then you'd generate some data that would be relevant to decision makers. At least make the clockspeed delta a little more real-world relevant and use something clocked a little higher than 2.0GHz for the K10 system.

The data are better than no data, but the comparisons and conclusions regarding what beats what X% of the time in these applications is basically value-less until the dataset is rounded out with some results for 2.7GHz Shanghai.
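
To put a rough number on the clock-speed objection, here is a minimal sketch, assuming (optimistically for AMD) that performance scales linearly with core clock and ignoring any other Shanghai changes. The function name is made up for illustration; the 45% average lead and the 2.0GHz/2.7GHz clocks are the figures quoted in this thread.

```python
def clock_scaled_lead(lead: float, old_ghz: float, new_ghz: float) -> float:
    """Lead of system A over system B after B's clock rises from old_ghz to new_ghz.

    `lead` is A's performance divided by B's (1.45 means A is 45% faster).
    Assumes B's performance scales linearly with clock, which is an upper
    bound: memory-bound HPC codes usually scale worse than that.
    """
    return lead / (new_ghz / old_ghz)


# Average Nehalem lead over the 2.0GHz Barcelona, as quoted in the first post.
nehalem_vs_barcelona = 1.45

# Hypothetical 2.7GHz part on the same architecture (assumption, not measured).
remaining = clock_scaled_lead(nehalem_vs_barcelona, 2.0, 2.7)
print(f"Remaining average lead: {(remaining - 1) * 100:.0f}%")
```

Under that assumption the average lead shrinks from 45% to roughly 7%, which is why the choice of a 2.0GHz part matters so much for the headline numbers. It is only a back-of-the-envelope estimate, not a substitute for measured 2.7GHz results.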
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
2P Opteron 8350 Barcelona <-- how is that a 16-core system? Dual quad-cores? That's 8 cores, not 16!

And how is a 1P Nehalem 8 core? Nehalem are 4 core, 8 thread. Not 8 core, 16 thread.
 

jones377

Senior member
May 2, 2004
464
65
91
Originally posted by: VirtualLarry
2P Opteron 8350 Barcelona <-- how is that a 16-core system? Dual quad-cores? That's 8 cores, not 16!

And how is a 1P Nehalem 8 core? Nehalem are 4 core, 8 thread. Not 8 core, 16 thread.

They are comparing dual socket Nehalem (2S/8C/16T) and quad socket Barcelona/Tigerton (4S/16C/16T). Just like in spec_fp_rate and SAP-SD tested elsewhere, the dual socket Nehalem beats the quad socket competition. Quite a feat!
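
The socket/core/thread counts above can be sanity-checked with a small sketch. The class and function names are made up for illustration; the per-socket core counts (4) and the SMT factor (2 threads per core on Nehalem, 1 on the others) follow from the 2S/8C/16T and 4S/16C/16T figures, and the 45% and 70% average leads are the ones quoted in the first post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    sockets: int
    cores_per_socket: int
    threads_per_core: int

    @property
    def cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core


def per_core_ratio(speedup: float, fast: System, slow: System) -> float:
    """Per-core throughput of `fast` relative to `slow`, given the
    whole-system speedup of `fast` over `slow`."""
    return speedup * slow.cores / fast.cores


nehalem = System("Nehalem", sockets=2, cores_per_socket=4, threads_per_core=2)
barcelona = System("Barcelona", sockets=4, cores_per_socket=4, threads_per_core=1)
tigerton = System("Tigerton", sockets=4, cores_per_socket=4, threads_per_core=1)

for s in (nehalem, barcelona, tigerton):
    print(f"{s.name}: {s.sockets}S/{s.cores}C/{s.threads}T")

# Average whole-system leads quoted in the first post (SMT off).
print(f"vs Barcelona, per core: {per_core_ratio(1.45, nehalem, barcelona):.2f}x")
print(f"vs Tigerton, per core: {per_core_ratio(1.70, nehalem, tigerton):.2f}x")
```

With half the cores and an average lead on top, the implied per-core advantage is close to 3x, though this treats the quoted averages as if they applied to every application, which they do not.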
 

JackyP

Member
Nov 2, 2008
66
0
0
Originally posted by: Idontcare
Why not 2.7GHz Shanghai at least? Then you'd generate some data that would be relevant to decision makers. At least make the clockspeed delta a little more real-world relevant and use something clocked a little higher than 2.0GHz for the K10 system.
I think it is obvious why they did not use Shanghai (which launched only very recently): generating all the data most certainly took quite some time. Still, why they only used a 2.0GHz K10 escapes my imagination.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: JackyP
Originally posted by: Idontcare
Why not 2.7GHz Shanghai at least? Then you'd generate some data that would be relevant to decision makers. At least make the clockspeed delta a little more real-world relevant and use something clocked a little higher than 2.0GHz for the K10 system.
I think it is obvious why they did not use Shanghai (which launched only very recently), generating all the data most certainly took quite some time. Still why they only used a 2.0ghz K10 escapes my imagination.

The fact that they generated data with a yet-to-be-released 2P Nehalem Xeon platform suggests they have the time/access/ability to do a proper comparison with an (albeit recently) released server platform based on Shanghai, and for some reason chose not to.
 

JackyP

Member
Nov 2, 2008
66
0
0
Oh snap. You are right. Maybe it is just the typically better Intel support for reviewers, they seem to be giving away ES rather easily, but if the testers work at LANL they shouldn't have had much of a problem with acquiring AMD chips too... Ok, I am puzzled.
 

SunnyD

Belgian Waffler
Jan 2, 2001
32,675
146
106
www.neftastic.com
Originally posted by: Idontcare
Comparing a 2.8GHz Nehalem system or a 2.93GHz Tigerton system to a 2.0GHz Barcelona system is about as pointless an exercise as it gets if the objective is to make a comparison between AMD and Intel systems that people will be buying in 2009 when Nehalem Xeon is for sale...

Why not 2.7GHz Shanghai at least? Then you'd generate some data that would be relevant to decision makers. At least make the clockspeed delta a little more real-world relevant and use something clocked a little higher than 2.0GHz for the K10 system.

The data are better than no data, but the comparisons and conclusions regarding what beats what X% of the time in these applications is basically value-less until the dataset is rounded out with some results for 2.7GHz Shanghai.

Shh... don't tell dmens... he'll say clock speed is irrelevant!
 

aigomorla

CPU, Cases&Cooling Mod PC Gaming Mod Elite Member
Super Moderator
Sep 28, 2005
21,117
3,638
126
im just gonna pass by this thread and say

MEEP...

http://www.8anet.com/merchant....765&lastcatid=5&step=4

+

http://i125.photobucket.com/al...igomorla/Capture-1.jpg

+

http://i125.photobucket.com/al...p73/aigomorla/Joke.jpg

Once my board arrives i intend on benching it...

as u can see i have the 3.07ghz one.. :p

yeah i can see IDC going OH NOESSSS another different MHZ...

Sorry i wanted to ruffle his clean hair.. LOL...


And no, a regular i7 CPU will NOT work in that board.. You need an i7 with 2 x QPI enabled.. which shows up as Gainestown in CPU-Z.. or Nehalem-EP by code name..

Originally posted by: Idontcare
Originally posted by: JackyP
Originally posted by: Idontcare
Why not 2.7GHz Shanghai at least? Then you'd generate some data that would be relevant to decision makers. At least make the clockspeed delta a little more real-world relevant and use something clocked a little higher than 2.0GHz for the K10 system.
I think it is obvious why they did not use Shanghai (which launched only very recently), generating all the data most certainly took quite some time. Still why they only used a 2.0ghz K10 escapes my imagination.

The fact they generated data with a yet-to-be released 2P Nehalem Xeon platform suggests they have the time/access/ability to do a proper comparison with a (albeit recently) released server platform based on Shanghai and for some reason chose not to.

Nah.... the launch on those chips i believe was today... or so i heard... list price is 1400 dollars per cpu. LOL

so thats 2800 dollars + that 500 dollar board + 6 sticks of DDR3 ram... you guys see now why i give enterprise people a lot of respect!

The ones i have are until march.. :X
 

dmens

Platinum Member
Mar 18, 2005
2,275
965
136
Originally posted by: SunnyD
Originally posted by: Idontcare
Comparing a 2.8GHz Nehalem system or a 2.93GHz Tigerton system to a 2.0GHz Barcelona system is about as pointless an exercise as it gets if the objective is to make a comparison between AMD and Intel systems that people will be buying in 2009 when Nehalem Xeon is for sale...

Why not 2.7GHz Shanghai at least? Then you'd generate some data that would be relevant to decision makers. At least make the clockspeed delta a little more real-world relevant and use something clocked a little higher than 2.0GHz for the K10 system.

The data are better than no data, but the comparisons and conclusions regarding what beats what X% of the time in these applications is basically value-less until the dataset is rounded out with some results for 2.7GHz Shanghai.

Shh... don't tell dmens... he'll say clock speed is irrelevant!

I can see that explaining basic performance concepts to some is a futile effort.