Dual Shootout at Tomshardware

DSTA · Mar 13, 2002

I have a Dual Prestonia system (@2G) on a SM mobo. [...]
It's one of my home systems, I don't use it for any business apps.

Holy cow. If you ever feel like sharing investment tips, please send me a PM.

Agree about the rest of your post. It's very much like Jets vs. Sharks only that there are no pretty girls around 🙂.

imgod2u · Mar 13, 2002

<< I know that. I was generalizing by adding the MHz of both CPUs, the way the real world would see it. I was only making a statement. I was implying my disappointment with the performance of the Xeon CPUs. Nothing more. I was expecting much more and instead got much less. I still can't see myself spending so much money on Intel. The bang for the buck is simply not there. >>

What do you mean by the "real world"? Any well informed person (computer-wise) knows that dual 1 GHz processor does not equal a 2 GHz. There is no such thing as generalizing with such things when it is blatantly false.

You're still under the impression that MHz should equal performance. Which it most certainly does not. A processor at 500 GHz does not necessarily mean it'll be faster than a 500 MHz processor if the 2 are completely different architectures. Comparing MHz between 2 different architectures is not only pointless, but saying something isn't good just because it has less IPC is just nonsense.

Nobody expected the Xeon/Northwood architecture to be a huge leap over the AthlonMP/Palomino, but the point is, it is faster at the same speed grade. And, may I add, as AMD and zealots have been preaching for so long, MHz means jack.

CraigRT · Mar 14, 2002

I'd have to sell my 2001 model car for a set of those ridiculously overpriced rip off Xeons.. and then sell my mountain bike for the mobo... I don't get what the advantage of these Intel CPUs is so much that they can justify such a high price... they are just NOT worth it ... the product may be decent, but when compared to an AthlonXP which proves to turn in similar scores, with a lot lower clock speed, and a lot lower price.. it makes Intel look bad.. but that's just me.

Viper GTS · Mar 14, 2002

<< I'd have to sell my 2001 model car for a set of those ridiculously overpriced rip off Xeons.. and then sell my mountain bike for the mobo... >>

Xeon 2.2's are $646 on PriceWatch. Do you drive a Daewoo?

<< I don't get what the advantage of these Intel CPUs is so much that they can justify such a high price... they are just NOT worth it ... the product may be decent, but when compared to an AthlonXP which proves to turn in similar scores, with a lot lower clock speed, and a lot lower price.. it makes Intel look bad.. but that's just me. >>

MHz doesn't mean anything anymore! The end result is speed; how the individual company gets there makes no difference. Fact of the matter is that the P4 Xeon @ 2.2 is faster than the AMD MP 2000+. Yes, it has a 533 MHz clock advantage. But who cares? Since day one we've known the P4 performed poorly on a clock for clock comparison. We've also known that the same characteristic that caused that would also enable it to soar to high clock speeds, thus overcoming its low clock for clock performance.

The price point is definitely in AMD's favor. I'm typing this on a dual XP 1800+ system. I intend to continue with the dual AMD platform. I can't afford (or justify) spending the money for the Xeon.

But for the people that can, the P4 is the current performance leader. There's no arguing that.

Viper GTS
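For anyone who wants to put a number on the clock-for-clock argument, here is a rough Python sketch. It assumes the crude model performance = clock x work-per-clock, which is a simplification and not something measured in the review. The 1667 MHz figure is just 2.2 GHz minus the 533 MHz gap mentioned above; the function name and the 10% lead are made up for illustration.

```python
# Crude model (an assumption, not a measured result):
#   performance = clock_mhz * work_per_clock

def per_clock_advantage(fast_clock_mhz, slow_clock_mhz, perf_ratio=1.0):
    """How much more work per clock the lower-clocked chip does.

    perf_ratio is (higher-clocked chip's performance) divided by
    (lower-clocked chip's performance); 1.0 means a tie on the benchmark.
    """
    return (fast_clock_mhz / slow_clock_mhz) / perf_ratio

XEON_MHZ = 2200
ATHLON_MP_MHZ = XEON_MHZ - 533  # the 533 MHz gap from the post gives 1667 MHz

# If the two chips tie, the Athlon MP does about 1.32x the work per clock.
tie = per_clock_advantage(XEON_MHZ, ATHLON_MP_MHZ)

# Hypothetical case: the Xeon finishes 10% ahead on some benchmark.
xeon_10pct_ahead = per_clock_advantage(XEON_MHZ, ATHLON_MP_MHZ, perf_ratio=1.10)

print(f"tie: {tie:.2f}x, Xeon 10% ahead: {xeon_10pct_ahead:.2f}x")
```

Either way the end result is what counts; the ratio only shows how the two designs split the work between clock speed and per-clock throughput.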

WilsonTung · Mar 14, 2002

<< The price point is definitely in AMD's favor. I'm typing this on a dual XP 1800+ system. I intend to continue with the dual AMD platform. I can't afford (or justify) spending the money for the Xeon >>

This is certainly true. 600+ bucks for a Xeon is out of reach for most people. 600 bucks is about 2x my yearly budget for computer equipment. But the product is targeted towards companies who have the money to blow on the hardware and optimized software.

Pariah · Mar 14, 2002

"But to summarise my previous rant: IMO the apps selected by THG do not represent what is used out there in the field."

I don't really understand why THG is testing Xeons in the first place, based on their target market being the corporate world, and THG's being the enthusiast, but that's a different issue.

zemus · Mar 14, 2002

If all apps were truly Athlon optimized, the Athlon would probably beat any of Intel's offerings in almost everything. Well, not the silly SiSoft Sandra memory bandwidth test (of course not).

If the current trend holds up, more App makers will start to pay attention to AMD optimizations, we are still a little away from that yet though.

My prediction is that high-end apps will quickly start optimizing for Hammer, and after that the optimizations will trickle down to the rest of us shortly after. App makers will find it very hard to ignore Hammer as Intel has no comparable competitor to it (officially anyway).

----

btw, on the topic of Hammer, I see a lot of people getting a little excited about it. I think as far as we go, it will probably not be all that impressive while running our 32-bit software. It will probably just be a gradual upgrade from the Athlon. I highly doubt it will be like the first Athlon launch for us. It does, however, have the potential to be very well accepted into workstations and by those who can get their code compiled for it. It will probably be the workstation chip of choice during the transition to 64-bit (which will probably be a 10-year process). At least until the time that Intel follows AMD with this secret Hammer clone we hear about.

--- Intel is not that stupid though even though the last few years seems to suggest so, They will have a response to all of this

----

Anyway, remaining on topic, AMD definitely does not get a fair representation in most benchmarks in use today in our hobby community.

If I were to be completely honest as an AMD fan, I would say that overall, both platforms, regardless of any biasing or this and that, are performing about equal. Basically, both companies are remaining pretty competitive with each other. The only thing that is sort of annoying is this perception by many (the general public, of course) that Intel still dominates AMD.

PH0ENIX · Mar 14, 2002

<<

<< Adding both cpu's which = total mhz which Xeon = over 1ghz advantage >>

formulav8 - You are still wrong. For an SMP system adding the MHz of the processors is simply invalid. A dual 600 MHz Katmai system for example won't perform as well as a 1200 MHz Tualatin in the vast majority of cases.

To say 600 MHz processor x2 = 1200 MHz is false. The CPUs are still running at 600 MHz.

And the MHz : MHz ratio of Xeon : MP is still the same even under your strange conclusion. >>

<< I know that. I was generalizing by adding the MHz of both CPUs, the way the real world would see it. I was only making a statement. I was implying my disappointment with the performance of the Xeon CPUs. Nothing more. I was expecting much more and instead got much less. I still can't see myself spending so much money on Intel. The bang for the buck is simply not there. >>

Well,
Care to explain how adding the Mhz of the 2 CPUs achieves anything at all? Besides a completely meaningless and altogether useless number of course...

A dual 600mhz system is STILL a 600mhz system.
Even if you believe the rough equation that the performance is equivalent to ~150% of a single CPU, it's still not a 900mhz system.

It's just a 600mhz system with a much better IPC rating, and of course true multi-threading.

Last time I checked, with Dual systems, the users of them in the 'Real World' tend to know a little bit about SMP.
Because of that, the 'Real World' wouldn't be adding the MHz up... or at least I'd hope not.

As far as bang for buck goes - we're talking a CPU intended for servers here.
I know that if I were a large corporation I'd be vying for performance and stability as a premium, not bang-per-buck.
Price/Performance ratio is an important thing, but it really doesn't come into it with this.

For the same reason a company will spend $2000 on a p2 450 cpu, to make one of their servers SMP, rather than just putting that inordinate cost towards another, more powerful server.

Unfortunately, general retail market rules don't really apply to this situation, methinks.

And you don't see yourself buying a Xeon system; well good. No doubt Intel's customer support line is busy enough.
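The "~150% of a single CPU" rule of thumb can be turned around to ask how much of the workload must actually run in parallel to get that result. The sketch below uses Amdahl's law for n processors; that choice of model, and the function names, are assumptions for illustration and not something the review measured.

```python
def smp_speedup(parallel_fraction, n_cpus):
    """Amdahl's law: speedup over one CPU when only part of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cpus)

def implied_parallel_fraction(observed_speedup, n_cpus):
    """Invert Amdahl's law: which parallel fraction explains a measured speedup?"""
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / n_cpus)

# The ~150% figure on a dual system implies about two thirds of the work scales.
p = implied_parallel_fraction(1.5, 2)

# A purely single-threaded program gains nothing from the second CPU,
# and only a perfectly parallel one would reach 2x.
single_threaded = smp_speedup(0.0, 2)
perfectly_parallel = smp_speedup(1.0, 2)

print(f"implied parallel fraction: {p:.3f}")
print(f"single-threaded: {single_threaded:.1f}x, ideal: {perfectly_parallel:.1f}x")
```

In every case each CPU is still running at its own clock; the speedup is a throughput figure, not a MHz figure.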

BFG10K · Mar 14, 2002

<< Adding both cpu's which = total mhz which Xeon = over 1ghz advantage. >>

Don't be so naive. Two 1 GHz processors do not make a 2 GHz processor, nor do two processors double the speed of one running at the same speed.

Priit · Mar 14, 2002

Last time, in THG's SMP review, they compiled the Linux kernel faster with a single processor than with dual. Now there's no Linux kernel compilation benchmark at all (but we all know who would win that anyway 🙂) and MP3 encoding is done with some exotic encoder. Synthetic crap like Sysmark and Sandra is still there; at least they don't try to run Quake on it any more (wonder when we will see a Solitaire benchmark on 4-way SMP machines... 😉 )

formulav8 · Mar 14, 2002

<< What do you mean by the "real world"? Any well informed person (computer-wise) knows that dual 1 GHz processor does not equal a 2 GHz. There is no such thing as generalizing with such things when it is blatantly false.

You're still under the impression that MHz should equal performance. Which it most certainly does not. A processor at 500 GHz does not necessarily mean it'll be faster than a 500 MHz processor if the 2 are completely different architectures. Comparing MHz between 2 different architectures is not only pointless, but saying something isn't good just because it has less IPC is just nonsense.

Nobody expected the Xeon/Northwood architecture to be a huge leap over the AthlonMP/Palomino, but the point is, it is faster at the same speed grade. And, may I add, as AMD and zealots have been preaching for so long, MHz means jack. >>

Tell that to Intel and the 99.99 percent of people in the world. I know about the MHz myth. I wasn't talking about me or you. Intel is telling everyone that MHz = performance, or they wouldn't market the CPU in that manner. They purposely designed the P4 for mhz and not power in the more traditional sense(I see it that way). They wanted to get the clockspeed up to market the cpu as being fast because it has a high mhz rating. To sell to the 99.99% of idiots in the world that know nothing about ipc and mhz.

formulav8 · Mar 14, 2002

<< Adding both cpu's which = total mhz which Xeon = over 1ghz advantage.

Don't be so naive. Two 1 GHz processors do not make a 2 GHz processor, nor do two processors double the speed of one running at the same speed. >>

How about reading my thread before you write yours? Thank you.

Sunner · Mar 14, 2002

I hafta agree that the benches seem a little biased towards Intel.

But anyways, Intel clearly leads in the performance department right now, while AMD rules the Price/performance department.

Nate420 · Mar 14, 2002

<< I don't see why there is any reason to bring up clock speed. Who cares what clock speed the chips run at? It's an apples to oranges comparison between a P4 and Athlon. You think anyone in the business world is basing purchases on IPC ratings? Why doesn't AMD release a 2.2GHz XP so these meaningless arguments can stop? Because they can't. And why can't they? Because it is a different architecture than the P4 that was not designed to scale to insane clock rates. It's really sad all the excuses people come up with. AMD had the performance lead throughout most of the Athlon's existence, but the pendulum has now swung to Intel's side. With Clawhammer it may swing back to AMD, who knows. Give credit where credit is due, Intel won this round. >>

^^I couldn't have said it better. That's real.^^

dullard · Mar 14, 2002

<< For the cost of a dual Xeon rig I could build 2 dual XP rigs. >>

Let's check the facts.

Assumptions:
1) Typical businesses buy pre-built machines.
2) Typical businesses with dual processor machines have a base level that is something like this:

Dual processor (Xeon or Athlon MP)
1GB RAM (RDRAM for Xeon, DDR for Athlon MP)
80 GB IDE drive
Matrox G450
Win XP Pro
52x CDROM
Keyboard
Mouse
Floppy
(Assume monitor and other necessary equipment is already available at the business).

Now to be fair, let's compare the prices at the same store. Not many places sell both dual AMD and dual Intel, but here is one that does.
Dual 2.2 GHz Xeon: $3316
Dual 2000+ MP: $2228 (Xeon costs 49% more)
Definitely can't get two Athlon systems for the price of one Xeon system.

What if you were on a budget and went for a slightly lower speed?
Dual 2.0 GHz Xeon: $2854
Dual 1900+ MP: $2118 (Xeon costs 35% more)
Still, I can't see you getting two for the price of one.

Let's see if you went all out and bought a monitor, fast HDs, and fast video card:
-Two 36 GB 15,000 rpm Cheetah X15
-Single Sony G521 21"
-Wildcat II 5110

Dual 2.2 GHz Xeon: $7185
Dual 2000+ MP: $6097 (Xeon costs 18% more)
Dual 2.0 GHz Xeon: $6723
Dual 1900+ MP: $5987 (Xeon costs 12% more)

Hmm, still can't get two for the price of one... What am I doing wrong? At the fully equipped level the Xeon costs 18% more yet the Athlon performs up to 30% slower (depending on which program your business uses). That means that in this case, the Xeon has the better price/performance ratio... Guess what? Businesses do this same math when deciding which to purchase: spending 18% more to get a 30% speed boost makes Xeons seem quite good.

Note: to get these prices I skimped on the case and power supply for the Athlons - if I hadn't, the prices would be even closer.
Note2: these don't include shipping or the price of the specialty software - if I included these, the price % difference would be much smaller still. For example, scientific computational fluid dynamics software costs an average of about $4000 per machine per year. For the first year of use the last computers listed will cost $10823 and $10087 respectively (a 7% difference).
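A quick way to sanity-check percentage comparisons like these is to be explicit about which price is the baseline. The Python sketch below uses the base-configuration prices quoted above; the function names are made up for illustration, and the 30% speed and 18% cost figures are taken at face value from the post, not independently measured.

```python
def premium_pct(higher_price, lower_price):
    """How much more the expensive system costs, relative to the cheap one."""
    return (higher_price - lower_price) / lower_price * 100.0

def discount_pct(higher_price, lower_price):
    """How much less the cheap system costs, relative to the expensive one."""
    return (higher_price - lower_price) / higher_price * 100.0

def relative_value(speed_ratio, cost_ratio):
    """Performance per dollar of system A relative to B (above 1.0 favors A)."""
    return speed_ratio / cost_ratio

# Base configurations: dual 2.2 GHz Xeon vs. dual 2000+ MP.
base_premium = premium_pct(3316, 2228)    # Xeon costs roughly 49% more
base_discount = discount_pct(3316, 2228)  # Athlon costs roughly 33% less

# Budget configurations: dual 2.0 GHz Xeon vs. dual 1900+ MP.
budget_premium = premium_pct(2854, 2118)  # roughly 35% more

# "Spend 18% more, get a 30% speed boost", taken at face value.
value = relative_value(speed_ratio=1.30, cost_ratio=1.18)

print(f"{base_premium:.1f}% more, {base_discount:.1f}% less")
print(f"{budget_premium:.1f}% more, value ratio {value:.2f}")
```

The same price gap reads as a bigger number when quoted against the cheaper system, so it matters which way round the percentage is stated.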

Sohcan · Mar 14, 2002

<< They purposely designed the P4 for mhz and not power in the more traditional sense(I see it that way). They wanted to get the clockspeed up to market the cpu as being fast because it has a high mhz rating. >>

Traditional sense? Well, you're wrong. The P4 team by no means developed the speedracer paradigm; it is far older than you think. The Alpha EV4 and EV5 were both speedracers; the 225MHz EV45 was only slightly faster in SPEC than the 120 MHz PA-7150, and the 600 MHz EV56 performed slightly ahead of the 240 MHz PA-8200. The bottom line is that the Alphas provided the best performance; no one complained about clock speed then, but then again they didn't have to deal with the enthusiast market. At 14 stages, the Power4 is an extremely deeply pipelined RISC microprocessor, a field in which most OOOE superscalar designs have 4 to 7 pipeline stages. The 1.3GHz Power4 is clocked 30% faster than the nearest RISC competitor, yet it currently is in the lead in both SPECint and SPECfp performance. It is likely to be trumped later this year by the slower clocked EV7 (SPECint and SPECfp) and McKinley (in SPECfp; its SPECint score will be close but might not surpass the Power4), at 1.2GHz and 1GHz respectively. Again, when the Power4 shrinks to .13u, it will likely be clocked significantly faster than the competitors, and may likely once again hold the top spot in SPEC performance.

The P4 design started in 1996, before the K6 was released and before Dirk Meyer moved to AMD. The architectural phase is the first part of any design, and lasts 6 to 9 months. Intel simply did not have the competition at the time to warrant its engineering team to be dictated by marketing (a silly suggestion by itself). Looking ahead to the 65nm process node, the team wanted to reduce interconnect propagation delay as much as possible, since wire delay, which has dominated MPU design since 1 micron, does not significantly reduce with process shrinks. Hence the:

- trace cache (eliminating long wires associated with highly parallel complex x86 decoders and large multiplexors)
- two-stage ALUs operating at twice the core frequency (which uses smaller carry-lookahead units, optimizes wire length in the bypass network, and uses less area than twice the number of "normal" ALUs)
- long pipeline with two stages dedicated to propagating electrical signals
- small L1 data cache optimized for set-associativity and low latency (about which the enthusiast crowd is severely misinformed...the P4's 4-way associative, 2-cycle load-use 8KB L1D cache achieves an average 3.9% miss-rate, compared to the Athlon's 2-way associative, 3-cycle load-use 64KB L1's 1.8%)
- large reservation stations and renaming register pools to hide the ever increasing main memory latency.

You could dispute the design decisions if you like, but do so in a manner other than moaning about the P4's 30% higher clock-rate.
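The L1 cache trade-off in the list above (lower latency, higher miss rate) can be made concrete with the textbook average-memory-access-time formula, AMAT = hit time + miss rate x miss penalty. The Python sketch below is only an illustration: it treats the quoted load-use latencies as hit times, assumes both chips see the same miss penalty in cycles (a hypothetical 20 cycles, not a figure from this thread), and ignores that a cycle is shorter on the higher-clocked chip.

```python
def amat_cycles(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average memory access time: hit time plus the expected miss cost."""
    return hit_cycles + miss_rate * miss_penalty_cycles

def break_even_penalty(hit_a, miss_a, hit_b, miss_b):
    """Miss penalty (in cycles) at which cache A and cache B have equal AMAT."""
    return (hit_b - hit_a) / (miss_a - miss_b)

# Figures quoted in the post above.
P4_HIT, P4_MISS = 2, 0.039          # 8KB, 4-way, 2-cycle load-use
ATHLON_HIT, ATHLON_MISS = 3, 0.018  # 64KB, 2-way, 3-cycle load-use

# Hypothetical miss penalty, purely for illustration.
ASSUMED_PENALTY = 20

p4_amat = amat_cycles(P4_HIT, P4_MISS, ASSUMED_PENALTY)
athlon_amat = amat_cycles(ATHLON_HIT, ATHLON_MISS, ASSUMED_PENALTY)

# Below this penalty the smaller, faster cache wins on average (in cycles).
crossover = break_even_penalty(P4_HIT, P4_MISS, ATHLON_HIT, ATHLON_MISS)

print(f"P4: {p4_amat:.2f} cycles, Athlon: {athlon_amat:.2f} cycles")
print(f"break-even miss penalty: {crossover:.1f} cycles")
```

Under these assumptions the small cache comes out ahead whenever a miss costs less than roughly 48 cycles, which is why a higher miss rate alone does not settle the argument.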

<< To sell to the 99.99% of idiots in the world that know nothing about ipc and mhz. >>

Misinformation about IPC is just as prevalent as misinformation about clock-rate. I see the enthusiast crowd around here throw around the term "IPC" as if it's a cut-and-dried metric, without understanding what it is beyond its definition:
- the factors that influence it (fetch rate, issue rate, retire rate, superscalar width, OOOE window, in-order vs. out-of-order retirement, reservation station size, renaming pool, number of architectural registers, pipeline length, branch misprediction penalty, instruction latency characteristics, cache hierarchy (number of levels, size, set-associativity, latency, block size, bandwidth, multiport fetch rate, replacement algorithms, exclusive vs. inclusive vs. both), branch history table size, main memory bandwidth and latency, TLB size and hit-rate, paging & segmentation schemes, 2-operand vs. 3-operand format instructions, variable-length vs. fixed-length instructions, the kitchen sink....)
- its effect on clockspeed
- software's impact on it

Alex · Mar 14, 2002

<<

<< Is this a typo? >>

Yep.

<< The Xeon 2.2 GHz pretty much whacked the Athlon MP 2000+. >>

I wouldn't say whacked. The 2.2 GHz shows the same slight lead over the 2000+ that is seen with single processors in all other single benchmark tests. Who cares about a few %... >>

also, keep in mind that xeon is 2.2ghz and athlon is 1.67ghz.... with a +0.5ghz clock difference and such a small performance difference imagine what will happen when the athlonmp reaches 2.2ghz... it will beat the xeon like it stole something!

fkloster · Mar 14, 2002

<< ... imagine what will happen when the athlonmp reaches 2.2ghz... it will beat the xeon like it stole something! >>

lol

what a foolish statement

A 2002 Dodge Viper GTS would surely 'trounce' a Ford Model-T in the quarter mile as well (same foolish logic), you moron, but how many vipers were around when the first Model-T rolled off the assembly line?

Do you have a clue where Xeon projects will be when AMD squeezes out a 2.2 ghz MP part?

ST4RCUTTER · Mar 14, 2002

Some things to consider...

-The comparison is between processors on two different nodes (.13um vs. .18um). It shouldn't really come as a shock that the Xeons edge out the competition, especially given the similar results of single CPU based systems. AMD should begin shipping their faster Thoroughbreds and MP's very shortly (this month), so the competition in this market segment will only increase.

-Price comparisons using single workstations are only part of the picture. A company buying 50 such workstations could see a significant savings using AMD based systems over current Intel solutions. That said, when Intel begins to increase the L2 cache to 3MB and above, AMD won't have a comparably equipped chip until Sledgehammer.

Another good post Sohcan. 😀

formulav8 · Mar 14, 2002

<<

<< They purposely designed the P4 for mhz and not power in the more traditional sense(I see it that way). They wanted to get the clockspeed up to market the cpu as being fast because it has a high mhz rating. >>

Traditional sense? Well, you're wrong. The P4 team by no means developed the speedracer paradigm, it is far older than you think. The Alpha EV4 and EV5 were both speedracers; the 225MHz EV45 was only slightly faster in SPEC than the 120 MHz PA-7150, and the 600 MHz EV56 performed slightly ahead of the 240 MHz PA-8200. The bottom line is that the Alphas provided the best performance; no one complained about clock speed then, but then again they didn't have to deal with the enthusiast market. At 14-stages, the Power4 is an extremely deeply pipelined RISC microprocessor, a field in which most OOOE superscalar designs have 4 to 7 pipeline stages. The 1.3GHz Power4 is clocked 30% faster than the nearest RISC competitor, yet it currently is in the lead in both SPECint and SPECfp performance. It is likely to be trumped later this year by the slower clocked EV7 and McKinley (in SPECfp), at 1.2GHz and 1GHz respectively. Again, when the Power4 shrinks to .13u, it will likely be clocked significantly faster than the competitors, and may likely once again hold the top spot in SPEC performance.

The P4 design started in 1996, before the K6 was released and before Dirk Meyer moved to AMD. The architectural phase is the first part of any design, and lasts 6 to 9 months. Intel simply did not have the competition at the time to warrant its engineering team to be dictated by marketing (a silly suggestion by itself). Looking ahead to the 65nm process node, the team wanted to reduce interconnect propagation delay as much as possible, since wire delay, which has dominated MPU design since 1 micron, does not significantly reduce with process shrinks. Hence the:

- trace cache (eliminating long wires associated with highly parallel complex x86 decoders and large multiplexors)
- two-stage ALUs operating at twice the core frequency (which uses smaller carry-lookahead units, optimizes wire length in the bypass network, and uses less area than twice the number of "normal" ALUs)
- long pipeline with two stages dedicated to propagating electrical signals
- small L1 data cache optimized for set-associativity and low latency (about which the enthusiast crowd is severely misinformed...the P4's 4-way associative, 2-cycle load-use 8KB L1D cache achieves an average 3.9% miss-rate, compared to the Athlon's 2-way associative, 3-cycle load-use 64KB L1's 1.8%)
- large reservation stations and renaming register pools to hide the ever increasing main memory latency.

You could dispute the design decisions if you like, but do so in a manner other than moaning about the P4's 30% higher clock-rate.

<< To sell to the 99.99% of idiots in the world that know nothing about ipc and mhz. >>

Misinformation about IPC is just as prevalent as misinformation about clock-rate. I see the enthusiast crowd around here throw around the term "IPC" as if it's a cut-and-dried metric, without understanding what it is beyond its definition:
- the factors that influence it (fetch rate, issue rate, retire rate, superscalar width, OOOE window, in-order vs. out-of-order retirement, reservation station size, renaming pool, number of architectural registers, pipeline length, branch misprediction penalty, instruction latency characteristics, cache hierarchy (number of levels, size, set-associativity, latency, block size, bandwidth, multiport fetch rate, replacement algorithms, exclusive vs. inclusive vs. both), branch history table size, main memory bandwidth and latency, TLB size and hit-rate, paging & segmentation schemes, 2-operand vs. 3-operand format instructions, variable-length vs. fixed-length instructions, the kitchen sink....)
- its effect on clockspeed
- software's impact on it >>

You can talk about certain aspects of cpu design all you want. It means nothing in the real world. If it did the P4 would be a big powerhouse but it is not. Just like in synthetic benches. The P4 has all of that memory bandwidth, but where is it all going to? Real world tests prove this. Design and real world are completely different. The only reason the P4 does well in most of the benches it looks good in is SSE2 optimizations. I have no need to talk about this anymore as we are looking at it from different sides. I do see where you are coming from, but I don't agree.

Jason

fkloster · Mar 14, 2002

<< -The comparison is between processors on two different nodes (.13um vs. .18um). >>

A Pentium 4 2.0ghz socket 423 (.18u) part 'would be' identical in performance to a Pentium 4 2.0ghz socket 478 (.13u) part if the cache were the same size (they are not: 256 KB vs. 512 KB)

ST4RCUTTER · Mar 14, 2002

<< A Pentium 4 2.0ghz socket 423 (.18u) part 'would be' identical in performance to a Pentium 4 2.0ghz socket 478 (.13u) part if the cache were the same size (they are not: 256 KB vs. 512 KB) >>

I was alluding to the increase in processor speed that will accompany the change in process...

Sohcan · Mar 14, 2002

<< You can talk about certain aspects of cpu design all you want. It means nothing in the real world >>

CPU design means nothing in the real world? What are you talking about?

Are you saying that interconnect delay is not an issue? I suppose that all the EEs have been wrong for the past decade? These are real issues that any MPU design has to worry about. I did design and test the complete logic and RTL schematic of a 16-bit MIPS RISC microprocessor (~20,000 logic gates) featuring a 5-stage pipeline and full data bypass, DRAM controller, and 4KB direct-mapped cache for my comp arch class, I do believe that this gives me some insight into microprocessor design.

This is exactly the kind of enthusiast mentality around here of which I'm really getting sick: that there's one and only one way to do anything in microprocessor design. If that is true, I guess I'm wasting my time by going to graduate school to study computer architecture. I never said that the P4 design is the only way to approach OOOE superscalar architecture; I'm talking about design decisions, not design absolutes.

<< If it did the P4 would be a big powerhouse but it is not >>

You're paying too much attention to marketing. The microprocessor designers know how their design will perform; it's the marketing morons who take something like the P4's two-stage ALUs at 2X the global clock rate and imply that it provides 2X the performance. The MPU designers know that the two "double-pumped" ALUs serve the purpose of providing the same amount of throughput as four normal ALUs, with less area and heat output, as well as the ability to issue two sequential instructions with read-after-write data dependencies at the same time.

<< Just like in synthetic benches. The P4 has all of that memory bandwidth, but where is it all going to? Real world tests prove this. >>

Yet another false mentality that the enthusiast crowd expresses often, that performance is directly related to memory bandwidth. Amdahl's law states that speedup = 1/(1-f+f/s), where s = optimization speedup and f = fraction affected by optimization. Anybody educated in CS, CE, or EE understands this; it's not our fault if marketing departments and the enthusiast crowd don't.
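To see what the formula above implies for memory bandwidth, here is a minimal Python sketch. The formula is the one quoted in the post; the 20% bandwidth-bound fraction and the function name are hypothetical, chosen only to illustrate the shape of the result.

```python
def overall_speedup(f, s):
    """Amdahl's law as quoted above: speedup = 1 / (1 - f + f / s).

    f: fraction of run time affected by the optimization (0.0 to 1.0)
    s: speedup of that fraction (must be positive)
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - f) + f / s)

# Hypothetical workload: 20% of run time is limited by memory bandwidth.
BANDWIDTH_BOUND_FRACTION = 0.20

doubled = overall_speedup(BANDWIDTH_BOUND_FRACTION, 2.0)     # 2x the bandwidth
unlimited = overall_speedup(BANDWIDTH_BOUND_FRACTION, 1e12)  # effectively infinite

print(f"2x bandwidth: {doubled:.3f}x overall")
print(f"unlimited bandwidth: {unlimited:.3f}x overall")
```

With these made-up numbers, doubling bandwidth buys about 11% overall, and even unlimited bandwidth cannot exceed 1/(1-f) = 1.25x, which is the point being made about bandwidth figures in synthetic benchmarks.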

<< The only reason the P4 does well in most of the benches it looks good in is SSE2 optimizations >>

Intel says that auto-parallelizing compilers produce between 5% and 10% speedup from SSE2 in most applications.

<< I do see where you are coming from, but I don't agree. >>

What exactly don't you agree with? It's kind of difficult to carry a discussion if you're going to be this vague.

454Casull · Mar 14, 2002

fkloster, you idiot, the Model T had 40hp. Add a 100-shot of nitrous, a turbo, and a couple of Type-xR stickers and the Viper would be getting _raped_.

😉

zemus · Mar 14, 2002

wow, fkloster is still around, remember me calling you on the phone from canada buddy !
