Barcelona Single Thread SPEC CPU2006 scores


Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Originally posted by: Viditor
Originally posted by: Phynaz
Originally posted by: Viditor

4. As for clockspeed, I would be very surprised if we didn't have a 2.8 GHz Phenom by the end of this year. While they've already demonstrated a 3 GHz in June, that was also before they improved the steppings...JMHO


Not according to This.

VR-Zone are the same ones that assured us that we would have 4 GHz Wolfdales and 3.73 GHz Yorkfields last quarter...VR-Zone roadmap...

I don't really see them as a credible source.

I'm betting we would have those chips if AMD hadn't flubbed Barcelona so very badly.

 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Finally, the difference in both the new rev and unbuffered memory will come into play. While I agree that each of these things is only a few percentage points, as you add them up it's a significant amount!

Your arguments are flawed:

-Unbuffered to Buffered memory will give advantage:

And Intel won't benefit by going from FB-DIMM to unbuffered DDR2/3 DIMMS?? Gimme a break. It's likely Intel will gain even more, as we've seen that Core 2 Duos are far more competitive to Athlon X2s than Woodcrest was to Opterons.

Woodcrest to Conroe: You go from Blackford, a chipset optimized for stability and sustained bandwidth (useful for servers/workstations), to a non-FB-DIMM chipset like P35 that is optimized for peak bandwidth (useful for desktops). That will gain far more than going from buffered to unbuffered memory will ever give you.

-Faster HT3 bus will give performance advantage in desktop:

Keep making me laugh, but such dreams never materialized and never will: http://www.anandtech.com/showdoc.aspx?i=2046&p=7

PCs don't care about hypertransport in terms of performance, multi-processor servers do.

-Revision will make a significant difference:

Simply stated, no. What makes you think minor bug fixes are gonna do as much as architectural changes?? Kentsfield B3 to G0: http://www.anandtech.com/cpuch...howdoc.aspx?i=3066&p=5

P35 C1 vs C2: http://www.anandtech.com/cpuch...ts/showdoc.aspx?i=2851

As you can see, steppings don't offer performance increases, architectural changes and clock speeds do.

-More bandwidth will offer performance advantages:

LOL. Seriously. Fanboys switch sides in an argument whenever it's convenient. I thought I heard that ever since AMD integrated their memory controller it decreased the importance of memory bandwidth. Look what AM2 gave in performance. Now they say the memory bandwidth increase in Barcelona will offer a substantial performance increase. You are just as gullible as those guys who thought AM2 would offer anything more than a few %.


(I don't mean that this SPEC benchmark is conclusive about how Barcelona will perform. However, preliminary benchmarks from reviewers like Anandtech and Tech Report (especially Tech Report) are making it clear. This SPEC benchmark just solidifies the belief that Phenom won't dethrone Intel for top performance.)
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: IntelUser2000
There was an 8% difference between HE and SE (standard edition) at the same clockspeed, and remember that Phenom will be the extreme edition (so another level up again).
As to HT3 on the desktop environment, I expect it to make a difference in I/O...especially as we move towards CTM based graphics. Things like AMD's upcoming "Triple-Play" should make extensive use of more bandwidth (though I agree that it's been of minimal effect in the past).

Finally, the difference in both the new rev and unbuffered memory will come into play. While I agree that each of these things is only a few percentage points, as you add them up it's a significant amount!
If you add 5% for the new stepping, 8% for the extreme core (clock for clock), and probably 6-9% for the memory and HT3 combined...that's a 19-22% improvement. I don't know about you, but that's fairly significant to me...
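A side note on the arithmetic in that estimate: gains that stack on top of each other multiply rather than add, so the combined figure comes out slightly above a straight sum. Here is a minimal Python sketch that takes the quoted percentages at face value. They are the poster's estimates, not measurements, and `combined_gain` is only an illustrative helper name.

```python
from math import prod


def combined_gain(gains):
    """Compound fractional gains (0.05 means +5%) multiplicatively."""
    return prod(1.0 + g for g in gains) - 1.0


# Estimates quoted in the post: new stepping, extreme core,
# and memory + HT3 combined (low end 6%, high end 9%).
low = [0.05, 0.08, 0.06]
high = [0.05, 0.08, 0.09]

summed_low, summed_high = sum(low), sum(high)
compound_low, compound_high = combined_gain(low), combined_gain(high)

print(f"summed:     {summed_low:.1%} to {summed_high:.1%}")
print(f"compounded: {compound_low:.1%} to {compound_high:.1%}")
```

Compounding changes the total by only a point or so here. Whether the individual estimates hold is the real question, and the arithmetic cannot settle that.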

Since you like to do EXTREME CHERRY-PICKING, let me do mine :).

http://www.anandtech.com/IT/showdoc.aspx?i=3039&p=11

That page shows that SE version is faster than HE. So??

Also:

http://www.anandtech.com/IT/showdoc.aspx?i=3039&p=9

So, FIXED:

If you add -5% for the new stepping, -8% for the extreme core (clock for clock), and probably -6-9% for the memory and HT3 combined...that's a -19-22% improvement. I don't know about you, but that's a fairly significant performance deficit to me...

VR-Zone are the same ones that assured us that we would have 4 GHz Wolfdales and 3.73 GHz Yorkfields last quarter...VR-Zone roadmap...

Either you are too gullible or do not realize the fact that Intel doesn't need 3.5-4.0GHz parts to beat AMD performance wise. It looks like they won't need it for the near future either(phenom launch date included). Schedules change.

I didn't cherry pick, I chose the only performance benchmark in the article. If you actually went back and read the test definitions, you'd see that the Scalability tests that you point out (while they do have very pretty graphs) don't really reflect chip performance and are a subset of the AS3AP benches I posted...
Scalability measures a combination of the CPU, network, and disk under varying loads...

By your last comment I take it that you are a firm believer that Intel could have released 4 GHz Wolfdales and 3.73 GHz Yorkfields last quarter?
Let's see...a 3 GHz Yorkfield has a TDP of 130w. Were you expecting a 155w Yorkfield available 4-6 months before Intel's published roadmap?

I'm certainly familiar with the concept of holding off on releasing higher clocked parts for the sake of marketing...both Intel (with C2D) and AMD (with X2) have done this many times in the past. But to categorically state that any company would be able to launch a processor with speeds so much higher within 3 quarters of tape-out is ludicrous!
This is especially true as the power levels on Penryn are so close to Conroe (5% differential?).
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: IntelUser2000
Finally, the difference in both the new rev and unbuffered memory will come into play. While I agree that each of these things is only a few percentage points, as you add them up it's a significant amount!

Your arguments are flawed:

-Unbuffered to Buffered memory will give advantage:

And Intel won't benefit by going from FB-DIMM to unbuffered DDR2/3 DIMMS?? Gimme a break. It's likely Intel will gain even more, as we've seen that Core 2 Duos are far more competitive to Athlon X2s than Woodcrest was to Opterons.

You've lost me...where did FBD's come into this???
I think you've lost the point entirely...let me refresh your memory.
The point was an attempt to extrapolate Phenom's performance from a review of a pre-release Barcelona or from a single-threaded Spec bench of a 1.9 GHz lower power Barcelona. What in the world does Intel have to do with that at all??




-Faster HT3 bus will give performance advantage in desktop:

Keep making me laugh, but such dreams never materialized and never will: http://www.anandtech.com/showdoc.aspx?i=2046&p=7

PCs don't care about hypertransport in terms of performance, multi-processor servers do.


Again, I don't see the relevance here...you might as well pull out some old Piii benches to show how C2D will perform. If you actually read what I wrote, I was talking about how it will affect I/O on the K10 with things like TriplePlay. Now if you have some benches for THAT, I'd dearly love to see them!


-Revision will make a significant difference:

Simply stated, no. What makes you think minor bug fixes are gonna do as much as architectural changes?? Kentsfield B3 to G0: http://www.anandtech.com/cpuch...howdoc.aspx?i=3066&p=5

P35 C1 vs C2: http://www.anandtech.com/cpuch...ts/showdoc.aspx?i=2851

As you can see, steppings don't offer performance increases, architectural changes and clock speeds do.

I see...so you feel that if changes in Intel's steppings on a chip didn't make a difference, then that MUST be true for Barcelona...no matter WHAT the errata fix was. I know this will come as a shock, but not all new steppings fix the same things...:)
Some revisions can actually fix something that very much improves performance.


-More bandwidth will offer performance advantages:

LOL. Seriously. Fanboys switch sides in an argument whenever it's convenient. I thought I heard that ever since AMD integrated their memory controller it decreased the importance of memory bandwidth. Look what AM2 gave in performance. Now they say the memory bandwidth increase in Barcelona will offer a substantial performance increase. You are just as gullible as those guys who thought AM2 would offer anything more than a few %.

:) I find it amusing to be called a "fanboy" by someone whose handle is "IntelUser".
But I digress...
1. That's not a quote from me...what I said was "Things like AMD's upcoming "Triple-Play" should make extensive use of more bandwidth (though I agree that it's been of minimal effect in the past)"
2. I was speaking of I/O bandwidth


(I don't mean that this SPEC benchmark is conclusive about how Barcelona will perform. However, preliminary benchmarks from reviewers like Anandtech and Tech Report (especially Tech Report) are making it clear. This SPEC benchmark just solidifies the belief that Phenom won't dethrone Intel for top performance.)

Again, this is the type of fuzzy logic that led early man to believe that the universe revolved around the Earth...there are just far too many differences to draw that conclusion!
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
I didn't cherry pick, I chose the only performance benchmark in the article. If you actually went back and read the test definitions, you'd see that the Scalability tests that you point out (while they do have very pretty graphs) don't really reflect chip performance and are a subset of the AS3AP benches I posted...
Scalability measures a combination of the CPU, network, and disk under varying loads...
(Since I started the thread for single thread performance, which reflects the majority of PC performance, and therefore Phenom, STAY ON TOPIC)

Man, anyone believing HE is faster than SE (and for those of you that need an explanation, I am talking about per clock) is an idiot. Especially since this thread was brought up for Phenom, since single thread benchmarks foretell what will happen for desktop. Especially when the numbers quoted are ridiculous like 8%.

By your last comment I take it that you are a firm believer that Intel could have released 4 GHz Wolfdales and 3.73 GHz Yorkfields last quarter?
Let's see...a 3 GHz Yorkfield has a TDP of 130w. Were you expecting a 155w Yorkfield available 4-6 months before Intel's published roadmap?

Sure, Core 2 Duo CPUs from 1.86GHz to 3.0GHz have the same TDP, so?? There was no need for Intel to make a faster Core 2, at least on the desktop, therefore it stayed at 3GHz.

I know this will come as a shock, but not all new steppings fix the same things...
Some revisions can actually fix something that very much improves performance.

LOL, you are hoping for a miracle that won't happen. We all know what happened to the much-delayed, hyped products that the company didn't want to announce until the last week before its official release.

Geforce 5800
Prescott
AM2
R600

AMD said an 800MHz Athlon 64 would be equal to a Pentium 4 at 1.6GHz. Not many people believed it. They also showed SPEC benchmarks of all kinds: specint, specint_rate, specfp, specfp_rate. AMD knew Athlon 64 would outperform P4 at the clock speed it would be introduced at.

Now there is a reason that Barcelona was only shown for specfp_rate benchmarks. Because they know that all other benchmarks would look bad. That is also the reason the performance of the desktop counterpart, Phenom, is still not shown, because it's not performing as well as the hype says.

Again, this is the type of fuzzy logic that led early man to believe that the universe revolved around the Earth...there are just far too many differences to draw that conclusion!

Fuzzy logic you say!!! Look here: http://www.anandtech.com/cpuch...howdoc.aspx?i=3092&p=5

then here:
http://www.anandtech.com/cpuch...owdoc.aspx?i=3038&p=15

Does that tell you anything?? PC app benchmarks for Barcelona are overinflated because Opteron is using dual socket. And FX-7x platform loses performance in games tested for Barcelona PC benchmark.

Using your logic it would have been assumed the reason stars in the sky looked small is because that was the actual size, not because they were far away :).



Basically, I am saying Barcelona-based cores will suck in single-thread. Which means it will suck in PC because the majority of apps don't gain from the bandwidth and interconnect advantages Barcelona (or AMD for that matter) has over Merom-based cores. Even on the supposedly multi-threaded games/apps, it seems Phenom will be under Core 2 in IPC.

Of course Barcelona will run circles around Harpertown, since that's the biggest advancement Barcelona had over K8.
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: IntelUser2000
I didn't cherry pick, I chose the only performance benchmark in the article. If you actually went back and read the test definitions, you'd see that the Scalability tests that you point out (while they do have very pretty graphs) don't really reflect chip performance and are a subset of the AS3AP benches I posted...
Scalability measures a combination of the CPU, network, and disk under varying loads...
(Since I started the thread for single thread performance, which reflects the majority of PC performance, and therefore Phenom, STAY ON TOPIC)

It's difficult when you quote things out of context...


Man, anyone believing HE is faster than SE (and for those of you that need an explanation, I am talking about per clock) is an idiot. Especially since this thread was brought up for Phenom, since single thread benchmarks foretell what will happen for desktop. Especially when the numbers quoted are ridiculous like 8%.


Then your issue is with Johan and AT, not with me...I was merely linking and quoting a relevant benchmark. You should let Johan know what you think though...
The reason the issue came up (which you should know from the posts you've been paraphrasing) was because of the misbegotten belief that it's possible to extrapolate Phenom's performance from that single thread bench on a 1.9 GHz, BA rev, server platform.
I think the biggest fallacy you posted there though was "single thread benchmarks foretell what will happen for desktop". While most desktop apps are single thread, the environment in which they perform is not...if it were, then multi-core would have absolutely no performance gain for the desktop.


By your last comment I take it that you are a firm believer that Intel could have released 4 GHz Wolfdales and 3.73 GHz Yorkfields last quarter?
Let's see...a 3 GHz Yorkfield has a TDP of 130w. Were you expecting a 155w Yorkfield available 4-6 months before Intel's published roadmap?

Sure, Core 2 Duo CPUs from 1.86GHz to 3.0GHz have the same TDP, so?? There was no need for Intel to make a faster Core 2, at least on the desktop, therefore it stayed at 3GHz.


Then I shall ridicule your religion no longer...if that's what you believe, then mazel tov.


I know this will come as a shock, but not all new steppings fix the same things...
Some revisions can actually fix something that very much improves performance.

LOL, you are hoping for a miracle that won't happen. We all know what happened to the much-delayed, hyped products that the company didn't want to announce until the last week before its official release.

I am not hoping for anything (well, not in my writing here anyway). I am stating categorically that prediction is impossible without all of the facts! Again, if this offends your belief system, I apologize...


Now there is a reason that Barcelona was only shown for specfp_rate benchmarks. Because they know that all other benchmarks would look bad. That is also the reason the performance of the desktop counterpart, Phenom, is still not shown, because it's not performing as well as the hype says.


Again, more supposition without fact (in fact, most of the facts say otherwise). If this makes you happy, good luck to you...


Again, this is the type of fuzzy logic that led early man to believe that the universe revolved around the Earth...there are just far too many differences to draw that conclusion!

Fuzzy logic you say!!! Look here: http://www.anandtech.com/cpuch...howdoc.aspx?i=3092&p=5

then here:
http://www.anandtech.com/cpuch...owdoc.aspx?i=3038&p=15

Does that tell you anything?? PC app benchmarks for Barcelona are overinflated because Opteron is using dual socket. And FX-7x platform loses performance in games tested for Barcelona PC benchmark.

Using your logic it would have been assumed the reason stars in the sky looked small is because that was the actual size, not because they were far away :).


Now this is even farther afield...why are you comparing the FX-7x benches? What does that have to do with Phenom (except the number of cores on the motherboard)?
I would assume that comparing Piii benches to A64 would give me equally meaningless results, even though all of the current and near-future offerings stem from those designs...
 

myocardia

Diamond Member
Jun 21, 2003
9,291
30
91
Originally posted by: Viditor
I am saying that SPECint and SPECfp are not real-world desktop benchmarks and don't reflect anything to do with the desktop world...what are YOU saying? :)
Trying to extrapolate Phenom's performance from a single-threaded Spec bench is an effort in futility...

Please point out the post in this thread where I mentioned Phenom. Why would I mention Phenom? Is it available for sale yet, anywhere on earth? Yeah, I didn't think so.
This thread is about Barcelona, and how poorly it's performing. Please don't try to change the subject, once your point has been proven wrong.

For one thing, the desktop environment is always running other threads in the background at the same time (which is the main reason for multi-core in the first place).

That's not really a point though...it's just an insult without anything to back it up.
You should at the very least point out where you think I was wrong...

Umm, Viditor, pointing out when someone is wrong is no insult. You should do some research. Really.

Wow...you mean all of those Penryn's are on sale now??? Where can I get one? :)
/sarcasm

Why do you keep bringing up processors that haven't even been released yet? This thread is about a processor that's readily available, and according to all of the benchmarks I've seen so far, barely outperforms a single-core Athlon 64, at least in single-threaded apps.

We are talking about chips that are due out soon...do you also think that Intel will be unable to fulfill their promised Penryn shipping, or is that feeling only reserved for AMD?
Don't you think that is quite fanboyistic (can that be a word?)? ;)

No, we aren't. You keep trying to bring up processors from the future, since your current object of affection seems to have fallen flat on its face, but this thread is about a processor that's available today.

Oh, and did you just call me a fanboy? That's hilarious, because I've never in my life been called an Intel fanboy. Since you haven't been able to find it all by yourself, here's a link to my systems: A64 4000 & Opteron 170. Would you like to take a guess as to the last Intel processor I owned, besides the quad-core that I just bought, and haven't even assembled yet? It was a dual Celeron 366 @ 550 MHz system, that I replaced with a 600 MHz Athlon. So, I guess I am a fanboy, of the AMD variety, that is.:D

Are you saying that AMD won't top 2 GHz? Till when (in your estimation)?

No, I'm saying that anyone claiming we should expect a 50% clockspeed increase, with exactly the same microarchitecture, is clearly more fanboy than I'll ever be, that's all.
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: myocardia
Originally posted by: Viditor
I am saying that SPECint and SPECfp are not real-world desktop benchmarks and don't reflect anything to do with the desktop world...what are YOU saying? :)
Trying to extrapolate Phenom's performance from a single-threaded Spec bench is an effort in futility...

Please point out the post in this thread where I mentioned Phenom. Why would I mention Phenom? Is it available for sale yet, anywhere on earth? Yeah, I didn't think so.

Sure thing...on the previous page of this thread I wrote:
"The problem I see is that trying to extrapolate Phenom performance from this is close to impossible"

Your reply was:
"How is that, did AMD completely redesign the architecture, just for Phenom? I heard they didn't, that it's gonna be just like the Opteron/Athlon 64, and Opteron/X2 performance comparisons"


This thread is about Barcelona, and how poorly it's performing. Please don't try to change the subject, once your point has been proven wrong.

LOL...:)
This thread is about a specific Barcelona performance on a single threaded app...I think that even those who disagree with me will tell you that this has very little to do with Barcelona's performance as a server chip (which is what it is). As IntelUser said:
"Of course Barcelona will run circles around Harpertown, since that's the biggest advancement Barcelona had over K8"

So, in point of fact it's an attempt to extrapolate Phenom's performance and not Barcelona's.

For one thing, the desktop environment is always running other threads in the background at the same time (which is the main reason for multi-core in the first place).

That's not really a point though...it's just an insult without anything to back it up.
You should at the very least point out where you think I was wrong...

Umm, Viditor, pointing out when someone is wrong is no insult. You should do some research. Really.

Sigh...Myo, I research 4-6 hours/day, and have done so for at least 45 years now. To suggest that I don't I find insulting.
More to the point, you didn't point out where I was wrong and instead just made a flip comment about my needing to do more research.
If you feel that multi-cores are useful in any way when only dealing with a single thread, then PLEASE tell us all how that is so!


Wow...you mean all of those Penryn's are on sale now??? Where can I get one? :)
/sarcasm

Why do you keep bringing up processors that haven't even been released yet? This thread is about a processor that's readily available, and according to all of the benchmarks I've seen so far, barely outperforms a single-core Athlon 64, at least in single-threaded apps.

Not in single threaded apps (this is what I was trying to tell you before)...in a single threaded synthetic benchmark. They are absolutely NOT the same thing...!
Even benchmarking a single-threaded app in Windows means that you have a great deal of other threads occurring while you bench.
Now if you're comparing this same benchmark against others in the same environment, fair enough...but servers never ever run in this environment in the real-world, and neither do desktops (though often controller chips do).
I agree with you that we can't know about chips not released yet, in fact that has been my whole point throughout this entire thread! But I'm not the one who brought it up...I believe that Harpoon was first.


 

jones377

Senior member
May 2, 2004
463
64
91
I know this will be impossible for some people to understand, but SPEC CPU2006 is not a synthetic benchmark. It uses slightly modified real applications where you first compile the C, C++ or Fortran source code into an executable suitable for the ISA/OS you are running in. If you believe that is a synthetic benchmark, then you must also believe that all apps, codecs etc for Linux, where you download a source code instead of an executable and compile it with GCC, are synthetic programs.

Synthetic benchmarks are those that simulate actual programs by executing the same type of instructions in similar ratios but do no real work. Some examples of real synthetic benchmarks are those ALU/FPU tests in Sandra and similar programs.

Unfortunately this term has been misused by people who want to trivialize a benchmark because they don't like or understand the results it gives out.
 

myocardia

Diamond Member
Jun 21, 2003
9,291
30
91
It won't do any good to explain to Viditor that of all the benchmarks in the world, SPEC CPU 2006 is more or less the antithesis of a synthetic benchmark. People who aren't happy with the results of any benchmark ever performed always act that way. Always. Thanks for clarifying, in such a readable way, though. Some of us actually appreciate it.
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: jones377
I know this will be impossible for some people to understand, but SPEC CPU2006 is not a synthetic benchmark. It uses slightly modified real applications where you first compile the C, C++ or Fortran source code into an executable suitable for the ISA/OS you are running in. If you believe that is a synthetic benchmark, then you must also believe that all apps, codecs etc for Linux, where you download a source code instead of an executable and compile it with GCC, are synthetic programs.

Synthetic benchmarks are those that simulate actual programs by executing the same type of instructions in similar ratios but do no real work. Some examples of real synthetic benchmarks are those ALU/FPU tests in Sandra and similar programs.

Unfortunately this term has been misused by people who want to trivialize a benchmark because they don't like or understand the results it gives out.

Very good info on the makeup of Spec there jones...thanks!
Also, thanks for the comment on the PGI compiler...I've been reading as much as I can on that lately, quite fascinating!
Of interest to others might be the fact that the faster of the Spec scores you listed (using the PGI compiler) was also on the fewer cores (2 cores while the slower ones were 4 cores).
Of course this is hardly surprising for a single threaded bench, but interesting nonetheless.
PGI recently put out a blurb on the new compilers...

"The Portland Group has announced that its C/C++ and Fortran PGI compilers, application debugging, and performance-profiling tools support code targeting Quad-Core AMD Opteron processors. All PGI compilers support generation of PGI Unified Binary executables, which lets developers leverage the processors from both AMD and Intel while treating x64 as a single platform, eliminating the need to target and optimize for two separate processor platforms"

As to the SPEC benchmark, let me be clear that my intention wasn't to denigrate the value of SPECint as a bench, but to point out that extrapolating Phenom desktop performance (even in single-threaded apps) from the benching of Barcy on SPECint was specious. In other words, it's more that the benchmark environment when generating SPEC is synthetic or non-real-world (was I clear with that?).

Edit: Myo, I really can speak for myself...you don't need to put words in my mouth (though I'm sure it felt good to rant for a bit there...).
 

jones377

Senior member
May 2, 2004
463
64
91
I wasn't really targeting you Viditor. I guess I have just seen the "SPEC is a synthetic benchmark" argument one too many times...

I think we can get some idea of the desktop performance of Phenom by comparing the singlethread SPECint performance of Barcelona and K8 if both are using the same compiler and memory configuration. Unfortunately there is only one submission for Barcelona so far using the PGI compiler while K8 is using mostly the PathScale compiler. That's why even I expressed some doubt about this Barcelona score in my post above. It's more of a glimpse than an indication, if anything.

I did a further breakdown of the SPECint score by looking at the benchmark components (SPECint is after all not one benchmark but twelve different benchmarks compiled into a single score using the geometric mean). The comparison is between K8/2GHz and K10/1.9GHz using the same DDR2 667 CL5 ECC/REG but different compilers. The last 2 columns show the K10's performance relative to the K8 in base and peak scores.

Benchmark        K8/base  K8/peak  K10/base  K10/peak  K10/K8 base  K10/K8 peak
400.perlbench       9.29    11.40      9.22     11.30         0.99         0.99
401.bzip2           8.10     8.15      7.67      7.91         0.95         0.97
403.gcc             8.02     9.02      7.25      8.54         0.90         0.95
429.mcf             9.30    11.60      9.80     13.60         1.05         1.17
445.gobmk          10.30    12.20      8.49     10.40         0.82         0.85
456.hmmer           9.93    13.50     12.50     14.80         1.26         1.10
458.sjeng           9.45    10.20      8.42      9.43         0.89         0.92
462.libquantum     12.70    12.70     22.70     24.70         1.79         1.94
464.h264ref        15.50    16.00     14.70     14.90         0.95         0.93
471.omnetpp         8.26     8.27      8.81      8.81         1.07         1.07
473.astar           7.51     7.51      7.77      7.77         1.03         1.03
483.xalancbmk      11.40    11.60      9.35     10.90         0.82         0.94

Even accounting for Barcelona running 5% slower in clockspeed, some benchmarks show Barcelona running slower per clock than K8. Others run faster, with 2 of them much faster. This is most likely due to the different compiler used, because I can't imagine K10 running anything slower per clock than K8. At worst it should be at about parity.
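The per-clock comparison described here can be sketched in a few lines of Python. The base scores are copied from the K8/K10 table above. The normalization assumes performance scales linearly with core clock, which is only an approximation because memory latency does not scale with it, and `per_clock_ratio` is just an illustrative helper name.

```python
# Base scores from the table above: (K8 @ 2.0 GHz, K10 @ 1.9 GHz).
BASE_SCORES = {
    "400.perlbench": (9.29, 9.22),
    "401.bzip2": (8.10, 7.67),
    "403.gcc": (8.02, 7.25),
    "429.mcf": (9.30, 9.80),
    "445.gobmk": (10.30, 8.49),
    "456.hmmer": (9.93, 12.50),
    "458.sjeng": (9.45, 8.42),
    "462.libquantum": (12.70, 22.70),
    "464.h264ref": (15.50, 14.70),
    "471.omnetpp": (8.26, 8.81),
    "473.astar": (7.51, 7.77),
    "483.xalancbmk": (11.40, 9.35),
}
K8_GHZ, K10_GHZ = 2.0, 1.9


def per_clock_ratio(k8_score, k10_score):
    """K10/K8 ratio after dividing each score by its clock speed.

    Assumes linear scaling with clock, which is a simplification.
    """
    return (k10_score / K10_GHZ) / (k8_score / K8_GHZ)


per_clock = {name: per_clock_ratio(k8, k10) for name, (k8, k10) in BASE_SCORES.items()}
slower = sorted(name for name, ratio in per_clock.items() if ratio < 1.0)

for name, ratio in per_clock.items():
    print(f"{name:16s} {ratio:.2f}")
print("slower per clock:", ", ".join(slower))
```

Anything that lands within a percent or two of 1.00 is inside the noise of this kind of normalization, so only the larger gaps are worth reading into.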

So I looked at the difference between the PGI and PathScale compiler, with both running on an Opteron 2222 3GHz.

Benchmark        PS/base  PS/peak  PGI/base  PGI/peak  PGI/PS base  PGI/PS peak
400.perlbench      13.50    16.30     12.90     16.10         0.96         0.99
401.bzip2          11.00    11.10     11.40     11.60         1.04         1.05
403.gcc            10.60    12.20     10.60     11.90         1.00         0.98
429.mcf            11.80    14.80     14.60     18.70         1.24         1.26
445.gobmk          15.30    18.00     14.90     16.10         0.97         0.89
456.hmmer          14.80    20.10     19.80     21.00         1.34         1.04
458.sjeng          13.80    15.00     14.10     14.60         1.02         0.97
462.libquantum     16.20    16.40     27.00     27.00         1.67         1.65
464.h264ref        23.00    23.20     21.60     21.60         0.94         0.93
471.omnetpp        10.30    10.40     10.10     10.10         0.98         0.97
473.astar          10.30    10.30     10.50     10.50         1.02         1.02
483.xalancbmk      16.20    16.30     14.80     14.80         0.91         0.91

The last 2 columns show the difference between the compilers in base and peak. PGI is faster overall but it's still slower in a few of the tests.

EDIT: the columns look like crap, I'll try to fix it.
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: jones377
I wasn't really targeting you Viditor. I guess I have just seen the "SPEC is a synthetic benchmark" argument one too many times...

No worries, mate...I kinda figured that by the way you wrote it (I too have those "pet peeves").

Many, many thanks for the effort! I tried to fix the columns too but to no avail, so I just threw it into Excel...
From what I see in your tables, it looks like where PS is faster, the difference is minimal...but where PGI is faster, it's often a fairly large difference (at least on the K8 comparison).

I agree with you that K10 being slower than K8 is just wrong (parity at worst). But I wonder if the compiler is the only possible answer for this?
For example with the 445.gobmk values, even taking into account the clockspeed difference, the K10/K8 base differential seems far greater than the one for K8 between PS/PGI...and the base on 458.sjeng seems well out of whack as well.
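One rough way to check that observation is to divide the K10/K8 ratio by the PGI/PathScale ratio measured on the Opteron 2222 and by the clock ratio, then see what is left over. The Python sketch below uses the base scores posted in the two tables and rests on several assumptions: that the 2 GHz K8 run used PathScale and the K10 run used PGI, that the compiler effect seen at 3 GHz carries over unchanged, and that scores scale linearly with clock. `residual_per_clock` is a hypothetical helper name.

```python
# Base scores from the two tables posted earlier in the thread.
K8_BASE = {"445.gobmk": 10.30, "458.sjeng": 9.45}    # K8 @ 2.0 GHz
K10_BASE = {"445.gobmk": 8.49, "458.sjeng": 8.42}    # K10 @ 1.9 GHz
PS_BASE = {"445.gobmk": 15.30, "458.sjeng": 13.80}   # Opteron 2222, PathScale
PGI_BASE = {"445.gobmk": 14.90, "458.sjeng": 14.10}  # Opteron 2222, PGI
CLOCK_RATIO = 1.9 / 2.0


def residual_per_clock(name):
    """K10/K8 ratio left after removing the assumed compiler and clock effects."""
    raw = K10_BASE[name] / K8_BASE[name]
    compiler = PGI_BASE[name] / PS_BASE[name]
    return raw / compiler / CLOCK_RATIO


residuals = {name: residual_per_clock(name) for name in K8_BASE}
for name, value in residuals.items():
    print(f"{name:12s} residual K10/K8 per clock: {value:.2f}")
```

Under those assumptions both tests still come out around 10% below parity, so the compiler difference measured on K8 does not account for the whole gap. Differences in compiler flags between the two submissions are one candidate the sketch cannot capture.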

Can you think of what else this might be?

Again, thanks for the help!
 

jones377

Senior member
May 2, 2004
463
64
91
I have no explanation for 458.sjeng. It's a chess AI program, which should benefit from the improved branch predictor in Barcelona. There are some differences in compiler flags overall between K8/PGI and K10 but at the very least K10 is compiled with the "-tp barcelona-64" flag. My guess is that they need more time to play around with the compiler flags to get better results for Barcelona in SPECint. 2% higher score with a 5% clock deficit simply isn't good enough and I do believe this will improve over time.
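The "2% higher score with a 5% clock deficit" figure can be reproduced from the component base scores posted earlier, since the SPECint composite is the geometric mean of the twelve components. This is a minimal Python sketch. The per-clock step assumes linear scaling with clock speed, which is a simplification, and `geometric_mean` is written out by hand for illustration.

```python
from math import prod

# Base scores in table order, 400.perlbench through 483.xalancbmk.
K8_BASE = [9.29, 8.10, 8.02, 9.30, 10.30, 9.93, 9.45, 12.70, 15.50, 8.26, 7.51, 11.40]
K10_BASE = [9.22, 7.67, 7.25, 9.80, 8.49, 12.50, 8.42, 22.70, 14.70, 8.81, 7.77, 9.35]


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return prod(values) ** (1.0 / len(values))


k8_composite = geometric_mean(K8_BASE)    # K8 @ 2.0 GHz
k10_composite = geometric_mean(K10_BASE)  # K10 @ 1.9 GHz

overall = k10_composite / k8_composite
per_clock = overall / (1.9 / 2.0)  # assumes linear scaling with clock

print(f"K10/K8 composite:  {overall:.3f}")
print(f"K10/K8 per clock:  {per_clock:.3f}")
```

The geometric mean also explains why the single 462.libquantum outlier does not dominate the composite the way it would in an arithmetic average of the scores.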

Here is a good example of a heroic use of compiler flags to get a better score, at least in peak.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Originally posted by: Viditor
Originally posted by: Phynaz
Originally posted by: Viditor

4. As for clockspeed, I would be very surprised if we didn't have a 2.8 GHz Phenom by the end of this year. While they've already demonstrated a 3 GHz in June, that was also before they improved the steppings...JMHO


Not according to This.

VR-Zone are the same ones that assured us that we would have 4 GHz Wolfdales and 3.73 GHz Yorkfields last quarter...VR-Zone roadmap...

I don't really see them as a credible source.


According to Dailytech, VR-Zone is right.
Not a 2.8ghz Phenom in sight.
 

SexyK

Golden Member
Jul 30, 2001
1,343
4
76
Originally posted by: Phynaz
Originally posted by: Viditor
Originally posted by: Phynaz
Originally posted by: Viditor

4. As for clockspeed, I would be very surprised if we didn't have a 2.8 GHz Phenom by the end of this year. While they've already demonstrated a 3 GHz in June, that was also before they improved the steppings...JMHO


Not according to This.

VR-Zone are the same ones that assured us that we would have 4 GHz Wolfdales and 3.73 GHz Yorkfields last quarter...VR-Zone roadmap...

I don't really see them as a credible source.


According to Dailytech, VR-Zone is right.
Not a 2.8ghz Phenom in sight.

Funny how quiet the thread got once some legit facts were introduced...

Anyway, the TDPs suggest it's going to be tough for AMD to reach 3.0GHz in the near future. I was hoping the 2.0-2.6 range would all come in below 100W TDP leaving some headroom for OCing and higher clocked parts, but the leap up from 89W to 125W from 2.4 to 2.6 is disappointing IMO. They will probably try to push out a 2.8GHz chip at some point in Q108 (FX-82?), but it may be a fire-breather. Otherwise it seems we're looking at Q2 or maybe even H2 '08 for 3.0GHz+.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Yeah, there are many people around that will quickly abandon a thread rather than admit they were wrong.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Let's just wait till Phenom is out. Then and only then will the thread hypers be put to bed. Until we start talking about Bulldozer.


Read at the Inquirer today. I won't link, but Nehalem is supposed to be a greater performance increase than C2D was over P4. Now that even I doubt. But if true, man, AMD is screwed.