First *real-world* benches of Barcelona?


Regs

Lifer
Aug 9, 2002
16,666
21
81
Thank you Gary for the added information. You guys demonstrate excellent business practices.
 

classy

Lifer
Oct 12, 1999
15,219
1
81
Originally posted by: keysplayr2003
Originally posted by: JEDIYoda
I see all the closet Amd supporters are out in force..lol

Yes. The CPU forum will be getting very, very ugly the next few weeks. I'll have my work cut out for me.

sigh
This is a mod call out ... there will be no more of this and no further warnings

CPU moderator apoppin
 

Keysplayr

Elite Member
Jan 16, 2003
21,219
56
91
Originally posted by: classy
Originally posted by: keysplayr2003
Originally posted by: JEDIYoda
I see all the closet Amd supporters are out in force..lol

Yes. The CPU forum will be getting very, very ugly the next few weeks. I'll have my work cut out for me.

You'll be the main culprit, you might have to ban yourself :wine:

Is that right? Well, we'll just have to see now won't we, Classy?
 

ltcommanderdata

Junior Member
Oct 28, 2005
4
0
61
Originally posted by: BitByBit
None of that makes any sense.

1. There is no 'pressure' within a processor.

2. AMD's three-issue/retire architecture is not a bottleneck - there are far more factors that determine performance than issue rate alone. Beyond three issue, there are diminishing returns due to instruction parallelism.

3. Core 2 can decode more instructions per clock, but it still only has three ALUs and three FPUs - the same as K10.

4. K10 has double the instruction fetch bandwidth over Core 2, meaning its decoders are more likely to be fully utilised when executing large instructions.

5. I fail to see how you can compare or attempt to draw any parallels between the fuel/air ratio of an engine and the issue rate of a processor. The mind boggles.
Well, Merom actually has 3 ALUs, 2 FPUs, and 3 SSE units, although I believe the FPUs and SSE units share some resources so are not completely exclusive. In terms of K10 fetching instructions in 32-byte blocks, Merom is actually a little better than K8's 16-byte fetch in that it also has a 64-byte buffer that stores previous instructions, from which it can fetch in 32-byte blocks. That is useful for repetitive instructions, although still not as good as K10's pure 32-byte fetch. Still, when things don't fit neatly in the buffer, I do agree that Merom will have difficulty fully utilizing its 4 decoders and reaching its potential of issuing up to 5 instructions (including macro-op fusion).
 

formulav8

Diamond Member
Sep 18, 2000
7,004
523
126
Originally posted by: classy
Originally posted by: keysplayr2003
Originally posted by: JEDIYoda
I see all the closet Amd supporters are out in force..lol

Yes. The CPU forum will be getting very, very ugly the next few weeks. I'll have my work cut out for me.

You'll be the main culprit, you might have to ban yourself :wine:


Yuck Yuck Yuck :D

No it isn't funny ... mod "call outs" are not tolerated.
CPU moderator apoppin
 

Ajay

Lifer
Jan 8, 2001
16,094
8,115
136
Originally posted by: Viditor
Isn't it in fact the B01 stepping? That will be released on the 10th of Sept. But you're right, it's just a 1 stepping.
No...they are releasing stepping B2 or B3 (I believe B3). B1 was from April...
This is the reason for the 6 month delay, they've created at least 2 more steppings since then.

B2 will be the stepping shipping, B3 is at least 2 months out, IIRC.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,115
136
Originally posted by: Nemesis 1
I agree with you 100% Gary. So would it be safe to assume that A0 steppings of Penryn, on a brand new process type and logic as well as software improvements in Penryn, stand to gain as well? Being that it's an A0 stepping, I would say large improvements. Or do you think Intel showed their whole hand in this poker game?

Gains from whatever the first released stepping of Penryn is will likely be less significant (% wise) because Intel isn't having any big problems with their 45nm process and the chipset infrastructure will be more mature and feature complete at release. Still, the clock advantage Intel will have (especially for overclockers) will likely dominate AMD's offerings for most applications :(

 

PlasmaBomb

Lifer
Nov 19, 2004
11,636
2
81
Originally posted by: BitByBit
Originally posted by: PlasmaBomb
Higher RPM doesn't lead to higher compression.

No, I just made it up.

"...This makes the maintenance of smoothly increasing RPM far harder with turbochargers than with belt-driven superchargers which apply boost in direct proportion to the engine RPM."
Source

Clearly.

Sorry, that reply was quite brief. Boost is different to compression in the engine, so we may be talking at cross purposes. If by compression in the first post you meant boost, you have my apologies.

However when it comes to compression -

The dynamic compression ratio (DCR) is always lower than the static compression ratio (SCR)
The DCR does not change at any time during the operation of the engine

EDIT: To the mods and thread readers, sorry I didn't realise the statement would drag on so much...
 

classy

Lifer
Oct 12, 1999
15,219
1
81
Originally posted by: classy
Originally posted by: keysplayr2003
Originally posted by: JEDIYoda
I see all the closet Amd supporters are out in force..lol

Yes. The CPU forum will be getting very, very ugly the next few weeks. I'll have my work cut out for me.

You'll be the main culprit, you might have to ban yourself :wine:

This is a mod call out ... there will be no more of this and no further warnings

CPU moderator apoppin


Instead I'll PM you

Thank-you for your explanation that it was 'in jest' ... please just no more giving the appearance of it. Yours was just 1 of 3 poster's comments that needed to be edited.
CPU Moderator apoppin
 

Scottie Wang

Junior Member
Sep 5, 2007
1
0
0
SUPER PI 26S for a 2G or 3G BARCELONA?
CONFUSED...

Originally posted by: Gary Key
I've never heard of any case whereby features become active at certain clock speeds. If this is indeed the case, then my apologies, but it doesn't make sense. Why would AMD deactivate features at lower clock speeds?

Throughout the entire prototype and pre-production (as stated in my last message) process, certain features on the CPU, in the BIOS, or on the chipsets have been turned off/on, latencies have changed, etc, etc. This is a normal part of the engineering process as the design is fleshed out and finalized. It does not represent final silicon capabilities and performance.


As I said earlier, I used a poor example as it was not meant to be taken literally spec for spec when comparing engines and CPUs. Regardless of the example, the point was that the platform performance improved significantly as the core speeds improved and this included performance per watt among other indicators. There is a myriad of reasons as to why this occurred but considering the early silicon, BIOS, and chipset designs, we could only speculate as to why and I tried to present a few reasons that we homed in on.

If you compare a B00 chip from May to a B02 today, there is a significant difference in performance in all areas (26 seconds in SuperPI 1m for one) and my comments represent observations of what has occurred over this time period. We have final silicon now and results will be posted in the near future. My observations today are different than they were two weeks ago and as the platform matures they will change again.

Once we see the HT 3.0 capable chipsets and Phenom cores mature then we will have an even better indication of the performance of this core design in the consumer market but for now the initial release is Barcelona in the enterprise market.

 

Hulk

Diamond Member
Oct 9, 1999
5,355
4,055
136
I for one am happy to see Barcelona doing well on this benchmark. Regardless of who ran the test or the benchmark itself. This at least shows Barcelona may yet be a competitor.

I have no dog in this fight except the new dog. I always want the new chip coming out to top the old one. I was happy when C2D raised the bar for X2 and I'll be happy if Barcelona does the same to C2D.

If IPC for Barcelona is better than C2D then that is great news.

I would think it's easier to scale frequency than to be stuck with an underperforming core from an IPC point of view.
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
Originally posted by: Scottie Wang
SUPER PI 26S for a 2G or 3G BARCELONA?
CONFUSED...

Originally posted by: Gary Key
I've never heard of any case whereby features become active at certain clock speeds. If this is indeed the case, then my apologies, but it doesn't make sense. Why would AMD deactivate features at lower clock speeds?

Throughout the entire prototype and pre-production (as stated in my last message) process, certain features on the CPU, in the BIOS, or on the chipsets have been turned off/on, latencies have changed, etc, etc. This is a normal part of the engineering process as the design is fleshed out and finalized. It does not represent final silicon capabilities and performance.


As I said earlier, I used a poor example as it was not meant to be taken literally spec for spec when comparing engines and CPUs. Regardless of the example, the point was that the platform performance improved significantly as the core speeds improved and this included performance per watt among other indicators. There is a myriad of reasons as to why this occurred but considering the early silicon, BIOS, and chipset designs, we could only speculate as to why and I tried to present a few reasons that we homed in on.

If you compare a B00 chip from May to a B02 today, there is a significant difference in performance in all areas (26 seconds in SuperPI 1m for one) and my comments represent observations of what has occurred over this time period. We have final silicon now and results will be posted in the near future. My observations today are different than they were two weeks ago and as the platform matures they will change again.

Once we see the HT 3.0 capable chipsets and Phenom cores mature then we will have an even better indication of the performance of this core design in the consumer market but for now the initial release is Barcelona in the enterprise market.
that is almost definitely 2g barcelona since gary only has a 2ghz barcelona cpu to play with. I get 14.283 on my cpu at 3.512 ghz. hmmmm, I wonder what I'll get at 2...

edit: uh, heh heh, it's kinda funny that my ram is NOT stable at 4-4-3-11 1000...go figure. anyway, I dropped the cpu down to 8x250, ran 2:3 memory at 4-4-3-11, and got 25.328 superpi. that puts the 26 seconds that gary recorded in perspective, though it's still a tremendous improvement from the high 30's that was rumored last week.
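The comparison in this post can be checked with a little arithmetic. The sketch below assumes SuperPI time is inversely proportional to core clock, which is only a rough approximation since the memory settings also changed between the two runs. The function name is made up for illustration; the timings and clocks are the ones quoted in the post.

```python
def scale_time_by_clock(time_s: float, clock_ghz: float, target_ghz: float) -> float:
    """Estimate run time at target_ghz, assuming time is inversely proportional to clock."""
    return time_s * clock_ghz / target_ghz


# Figures quoted in the post: 14.283 s at 3.512 GHz, 25.328 s measured at 8 x 250 MHz.
measured_fast_s = 14.283
fast_clock_ghz = 3.512
slow_clock_ghz = 8 * 250 / 1000  # multiplier x bus clock in MHz, converted to GHz
measured_slow_s = 25.328

# Naive prediction for the downclocked run (assumption: pure clock scaling).
predicted_slow_s = scale_time_by_clock(measured_fast_s, fast_clock_ghz, slow_clock_ghz)
gap_pct = (measured_slow_s - predicted_slow_s) / predicted_slow_s * 100

print(f"predicted at {slow_clock_ghz:.1f} GHz: {predicted_slow_s:.3f} s")
print(f"measured: {measured_slow_s:.3f} s ({gap_pct:+.1f}% vs. naive scaling)")
```

Under that assumption the downclocked run is predicted at roughly 25.08 s, within about 1% of the 25.328 s actually measured, so on this chip SuperPI 1M tracks core clock closely.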

 

Pederv

Golden Member
May 13, 2000
1,903
0
0
"If you look at performance instructions, Barcelona is about 30 percent faster than Clovertown. However, if you look at energy instructions, Clovertown is about 30 percent faster than Barcelona," Dell said.

EETimes

Wasn't that helpful.
 

HopJokey

Platinum Member
May 6, 2005
2,110
0
0
Originally posted by: Pederv
"If you look at performance instructions, Barcelona is about 30 percent faster than Clovertown. However, if you look at energy instructions, Clovertown is about 30 percent faster than Barcelona," Dell said.

EETimes

Wasn't that helpful.

WTH is a "performance instruction" and an "energy instruction"?
 

Pederv

Golden Member
May 13, 2000
1,903
0
0
Originally posted by: HopJokey
Originally posted by: Pederv
"If you look at performance instructions, Barcelona is about 30 percent faster than Clovertown. However, if you look at energy instructions, Clovertown is about 30 percent faster than Barcelona," Dell said.

EETimes

Wasn't that helpful.

WTH is a "performance instruction" and an "energy instruction"?

I'm thinkin' it breaks down to what we have known for a long time - AMD CPUs are better at some things, Intel CPUs are better at other things.
 

classy

Lifer
Oct 12, 1999
15,219
1
81
Originally posted by: HopJokey
Originally posted by: Pederv
"If you look at performance instructions, Barcelona is about 30 percent faster than Clovertown. However, if you look at energy instructions, Clovertown is about 30 percent faster than Barcelona," Dell said.

EETimes

Wasn't that helpful.

WTH is a "performance instruction" and an "energy instruction"?

LOL, I was thinking the same thing.
 

lopri

Elite Member
Jul 27, 2002
13,327
708
126
What I read is this?

"If you look at floating point instructions, Barcelona is about 30 percent faster than Clovertown. However, if you look at integer instructions, Clovertown is about 30 percent faster than Barcelona,"

Of course that 30% number is out of Mr. Dell's a**, but the relative expectation is in line with what's known - Barcelona will be very fast in FP calculations. And this enhancement, while understandable (AMD had always been behind Intel when it comes to FP), makes me scratch my head because it doesn't benefit servers as much (as Integers). Workstations, maybe.
 

Amaroque

Platinum Member
Jan 2, 2005
2,178
0
0
Originally posted by: Pederv
"If you look at performance instructions, Barcelona is about 30 percent faster than Clovertown. However, if you look at energy instructions, Clovertown is about 30 percent faster than Barcelona," Dell said.

EETimes

Wasn't that helpful.

My interpretation of the quote is this...

If you look at equal power draw on both systems, Clovertown is 30% faster at the same power usage. If you look at IPC Barcelona is 30% faster clock for clock.

But Clovertown will be clocked at least 30% higher, so it might be a moot point.
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
that would be an amazing pull-out-of-their-ass maneuver from amd to get a 30% clock for clock advantage. that would put a 2.5 ghz barcelona equivalent to a 3.25 ghz penryn. The energy efficiency of the 45 nm process also makes sense, allowing intel to clock higher at the same power draw. of course, if barcelona is 30% faster clock for clock and clovertown is 30% more energy efficient at the same clock speed, they should end up pretty close to equal in power draw at the same "equivalent speed".
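The arithmetic behind this post can be sketched as follows. Both functions are hypothetical helpers, and the power model is a deliberate simplification: it assumes power rises linearly with clock (ignoring voltage) and reads "30% more energy efficient" as drawing 1/1.3 of the power at the same clock. Neither assumption comes from measured data.

```python
def equivalent_clock_ghz(clock_ghz: float, per_clock_advantage: float) -> float:
    """Clock a rival needs to match a chip that does more work per clock."""
    return clock_ghz * (1.0 + per_clock_advantage)


def relative_power_at_equal_performance(per_clock_advantage: float,
                                        efficiency_advantage: float) -> float:
    """Power of the slower-per-clock chip relative to the faster one at matched speed.

    Assumptions (illustrative only): power scales linearly with clock, and the
    more efficient chip draws 1 / (1 + efficiency_advantage) of the power at
    the same clock.
    """
    clock_ratio = 1.0 + per_clock_advantage  # slower chip must clock this much higher
    power_at_same_clock = 1.0 / (1.0 + efficiency_advantage)
    return clock_ratio * power_at_same_clock


# Figures from the post: 2.5 GHz, and the 30% claims from the Dell quote.
print(f"{equivalent_clock_ghz(2.5, 0.30):.2f} GHz")
print(f"{relative_power_at_equal_performance(0.30, 0.30):.2f}x power")
```

With both advantages at 30%, the two effects cancel in this model, which is the "pretty close to equal" outcome the post describes.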
 

BitByBit

Senior member
Jan 2, 2005
474
2
81
I'd be very surprised if Clovertown was 30% faster per Watt, given K10's power saving features. If K10 is indeed faster per clock, I'd expect it to have a slight advantage in Performance/Watt. 30% per Watt is roughly the advantage Core holds over K8.
 

Gary Key

Senior member
Sep 23, 2005
866
0
0
Depending upon the application, Barcelona clock for clock is equal to Clovertown or just a tad faster in some areas and in others it is 20%+ (heavy emphasis on "plus" until Monday, ;) ) faster. I have to say that final silicon is looking really nice at this point, especially in memory sensitive applications where this processor shines, but core speeds need to come up in a hurry to compete with Tigerton in general server applications.

The memory bandwidth numbers we noticed are a little surprising considering the results in certain memory intensive applications. The latencies are much improved over the last core stepping we looked at as is pure throughput which makes you think twice about the relationship between bandwidth and performance in general from this CPU.

My personal opinion is that if AMD had hit their original launch targets and speeds, the server market would be incredibly competitive at this point with Tigerton being a much needed response to Barcelona. As it stands, AMD is going to have an uphill battle at this time. But this is all subjective thinking from a guy who tried to compare an engine to a CPU, but maybe the results on Monday will clarify some of the observations I was trying to convey without breaking the rules. :D



 

Stoneburner

Diamond Member
May 29, 2003
3,491
0
76
Originally posted by: Gary Key
Depending upon the application, Barcelona clock for clock is equal to Clovertown or just a tad faster in some areas and in others it is 20%+ (heavy emphasis on "plus" until Monday, ;) ) faster. I have to say that final silicon is looking really nice at this point, especially in memory sensitive applications where this processor shines, but core speeds need to come up in a hurry to compete with Tigerton in general server applications.

The memory bandwidth numbers we noticed are a little surprising considering the results in certain memory intensive applications. The latencies are much improved over the last core stepping we looked at as is pure throughput which makes you think twice about the relationship between bandwidth and performance in general from this CPU.

My personal opinion is that if AMD had hit their original launch targets and speeds, the server market would be incredibly competitive at this point with Tigerton being a much needed response to Barcelona. As it stands, AMD is going to have an uphill battle at this time. But this is all subjective thinking from a guy who tried to compare an engine to a CPU, but maybe the results on Monday will clarify some of the observations I was trying to convey without breaking the rules. :D

I wish there were more editors like you willing to throw morsels to us salivating masses. Any chance you could be a little (just a little) more specific on what areas barcelona and tigerton are roughly equal?

 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: Gary Key
Depending upon the application, Barcelona clock for clock is equal to Clovertown or just a tad faster in some areas and in others it is 20%+ (heavy emphasis on "plus" until Monday, ;) ) faster. I have to say that final silicon is looking really nice at this point, especially in memory sensitive applications where this processor shines, but core speeds need to come up in a hurry to compete with Tigerton in general server applications.

The memory bandwidth numbers we noticed are a little surprising considering the results in certain memory intensive applications. The latencies are much improved over the last core stepping we looked at as is pure throughput which makes you think twice about the relationship between bandwidth and performance in general from this CPU.

My personal opinion is that if AMD had hit their original launch targets and speeds, the server market would be incredibly competitive at this point with Tigerton being a much needed response to Barcelona. As it stands, AMD is going to have an uphill battle at this time. But this is all subjective thinking from a guy who tried to compare an engine to a CPU, but maybe the results on Monday will clarify some of the observations I was trying to convey without breaking the rules. :D

Thanks for that Gary...are you saying you are benching Tigerton as well?
Are you using it in a 4S configuration?