P4 vs. P3 vs. Mustang vs. T-Bird

sciencewhiz

Diamond Member
Jun 30, 2000
5,885
8
81
Let us assume the hypothetical situation where you were able to get each of these chips (P4, P3, Mustang, and T-Bird) at the same clock-speed and with the same memory subsystem. In other words, the only things to take into account are the chip design itself and the cache architecture.

Which of these chips would perform the best? The way that I see it, there are 3 possibly correct answers and 1 definitely wrong answer.

Please respond with which chip will be faster (and in which tasks) and what part of the microarchitecture causes this. This is not flame bait, I am genuinely interested in your responses.
 

Rigoletto

Banned
Aug 6, 2000
1,207
0
0
?! Strange question. The CPU with the fewest pipeline stages has the best chance because each stage is doing more work whilst fitting into the clock cycle. No doubt the Athlon with the 10 stage pipeline will be doing better than the P4 with its 20 stage pipeline when we shift the goalposts inappropriately like this. Forget about the P3, though that should be slightly better than the P4 clock for clock.
There are other cutenesses of the P4 but I don't know that they count in practice. I think someone should read up at Ace's Hardware and look at the P4 benchmarks before answering this question!
 

beat mania

Platinum Member
Jan 23, 2000
2,451
0
76
Depends on the memory subsystem you're referring to, since P4 was designed for the bandwidth available from Rambus, so if you don't provide that it'd be at a disadvantage.
Me, I'd just get the P4 and Mustang anyway to complete my x86 CPU collection. (Damn, I'm missing a P3 Xeon.)
 

sciencewhiz

Diamond Member
Jun 30, 2000
5,885
8
81
In this hypothetical situation there is more than enough memory bandwidth for each processor.

Here is one way to approach the problem: The p4 is going to kill all the other processors because of SSE2.

Another way to look at it is that the shortest pipeline would win (like Rigoletto said). But then you can't neglect the improved branch prediction that the p4 and mustang have.

Like I said earlier, I have already thought this out but am curious what other people have to say. I have read many articles at ACE's, BTW.
 

pen^2

Banned
Apr 1, 2000
2,845
0
0
it occurred to me that the non-computer-savvy general public has started to catch on to a crucial fact: you don't really need anything faster than a 1ghz p3. is intel gonna stick to the 'makes your internet faster' ads? i wonder what use anything faster than current generation stuff would have in fields other than 3d gaming.

edit: a bit OT heh, maybe i should have created a new thread about it :)
 

PCResources

Banned
Oct 4, 2000
2,499
0
0


<< Here is one way to approach the problem: The p4 is going to kill all the other processors because of SSE2. >>



Nope, nope and nope....

SSE2 will not kill anything. Just like SSE1 didn't help much.

There are some advantages to the P4 design, but I wouldn't call SSE2 one of them.

Patrick Palm

Am speaking for PC Resources
 

OneEng

Senior member
Oct 25, 1999
585
0
0
First off, you must consider the system as a whole, not just the processor.

If the processor were the only factor in benchmarks, the TBird should have handily beaten the PIII .... it of course did not. TBird took some benchmarks while PIII took others. The reason was the chipset and memory. RAMBUS gave the PIII a much better memory subsystem than TBird with PC100. Once the KT133 was released, the picture changed somewhat, but the Intel i815 still dominated the VIA chipsets in memory performance.

PIII vs. TBird (Q1 2001)
Paired with DDR memory, TBird will destroy a PIII with DDR, since TBird will have twice the FSB bandwidth to handle the memory with.

P4 vs. Mustang (Q1 2001)
Since the only memory platform P4 will allow will be RAMBUS, Mustang will have a major advantage. At the elevated frequencies of 1.5Ghz and higher, L1 and L2 cache misses will become a much larger piece of the performance pie. Mustang will have an enhanced cache scheme to help it out here, along with double the amount of L2 cache of the P4. Even if even odds are given to both processors' cache searches, Mustang will have better performance since it has 2 times the probability of finding information in cache. To further degrade P4 performance, a cache miss is compounded by the very long latency of the RAMBUS memory, while Mustang's DDR memory should lower the latency compared to TBird's PC133 subsystem.
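
One rough way to put numbers on the cache argument above is the standard average memory access time formula: hit time plus miss rate times miss penalty. The Python sketch below only illustrates that formula. Every hit time, miss rate and latency in it is a made-up placeholder, not a measurement of P4, Mustang, RAMBUS or DDR, and in practice doubling a cache does not simply double the hit rate.

```python
def miss_penalty_cycles(latency_ns, clock_ghz):
    """Convert a memory latency in nanoseconds into CPU cycles at a given clock."""
    return latency_ns * clock_ghz  # 1 GHz means 1 cycle per nanosecond


def average_access_cycles(hit_cycles, miss_rate, penalty_cycles):
    """Average memory access time: hit time plus miss rate times miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_cycles + miss_rate * penalty_cycles


if __name__ == "__main__":
    clock_ghz = 1.5  # same clock for both hypothetical chips
    # Hypothetical chip A: smaller cache (more misses), higher-latency memory.
    chip_a = average_access_cycles(10, 0.05, miss_penalty_cycles(100, clock_ghz))
    # Hypothetical chip B: larger cache (fewer misses), lower-latency memory.
    chip_b = average_access_cycles(10, 0.03, miss_penalty_cycles(80, clock_ghz))
    print(f"chip A: {chip_a:.1f} cycles per access")
    print(f"chip B: {chip_b:.1f} cycles per access")
```

Note that the same latency in nanoseconds costs more cycles as the clock goes up, which is why misses matter more at 1.5Ghz than at 1Ghz.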

With P4's "hyperpipelined" architecture (20 stages) and the RAMBUS memory subsystem, a branch misprediction is a catastrophe for performance. Not only does the processor waste the 20 cycles it took to determine a problem existed, it has to flush and refill. This flush and refill will vastly increase the cost of the aforementioned cache misses. All the way around this spells low instructions per clock (IPC).
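
The misprediction argument above can be sketched with a simple cycles-per-instruction model: each mispredicted branch adds a fixed flush penalty on top of a base CPI. This is a minimal Python sketch under loud assumptions. The base CPI, branch fraction and misprediction rates are invented for illustration, and treating the penalty as equal to the pipeline depth (10 or 20 cycles) is a simplification, not a figure for any real chip.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Base CPI plus the average stall cycles per instruction from mispredicted branches."""
    for name, value in (("branch_fraction", branch_fraction),
                        ("mispredict_rate", mispredict_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


def ipc_from_cpi(cpi):
    """Instructions per clock is the reciprocal of cycles per instruction."""
    return 1.0 / cpi


if __name__ == "__main__":
    # All inputs are hypothetical: 20% branches, base CPI of 0.5.
    short_pipe = effective_cpi(0.5, 0.20, 0.10, 10)        # 10-cycle flush
    long_pipe = effective_cpi(0.5, 0.20, 0.10, 20)         # 20-cycle flush, same predictor
    long_pipe_better_bp = effective_cpi(0.5, 0.20, 0.05, 20)  # 20-cycle flush, better predictor
    for label, cpi in (("short pipeline", short_pipe),
                       ("long pipeline", long_pipe),
                       ("long pipeline, better predictor", long_pipe_better_bp)):
        print(f"{label}: CPI {cpi:.2f}, IPC {ipc_from_cpi(cpi):.2f}")
```

In this toy model, halving the misprediction rate cancels out a doubling of the flush penalty, which is why the quality of the branch predictor matters as much as the pipeline depth.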

The Mustang and its variants are going to rule the roost up to H2 of next year. P4 will NEED to be at 2Ghz in order to perform on par with a 1.5Ghz Mustang.

All of this will be a moot point for this year. AMD will be shipping Mustang for Christmas in volume at speeds of 1.3Ghz and perhaps even more. Intel will be shipping samples of P4 to hardware review sites. It would be an interesting sell at Best Buy where you paid more for a Dell 1Ghz than a 1.3Ghz Compaq. Q4 is going to be disastrous for Intel.

All is not gloom and doom for Intel. The P4 should be able to clock quite well. They have everyone convinced that they will ramp to 2Ghz by H2. Regardless of the IPC the P4 achieves, clock speed still sells.

Case in point .... have you ever purchased anything for $10.01? No? More likely you paid $9.99 because the numbers look so much better.

P4 vs. Mustang isn't going to be about IPC (Instructions Per Clock), it is going to be about CLOCK PERIOD.
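
The trade-off between IPC and clock can be written as throughput = IPC x clock. The Python sketch below works backwards from the estimate earlier in this post that a P4 would need 2Ghz to match a 1.5Ghz Mustang. The IPC values are hypothetical and chosen only so the arithmetic reproduces that ratio; they are not measured figures for either chip.

```python
def instructions_per_second(ipc, clock_ghz):
    """Throughput as instructions per clock times clocks per second."""
    return ipc * clock_ghz * 1e9


def break_even_clock_ghz(own_ipc, rival_ipc, rival_clock_ghz):
    """Clock needed to match a rival's throughput given both chips' IPC."""
    if own_ipc <= 0:
        raise ValueError("own_ipc must be positive")
    return rival_ipc * rival_clock_ghz / own_ipc


if __name__ == "__main__":
    # Hypothetical IPC values: the lower-IPC chip does 75% of the work per clock.
    needed = break_even_clock_ghz(own_ipc=0.75, rival_ipc=1.0, rival_clock_ghz=1.5)
    print(f"clock needed to break even: {needed:.2f} GHz")
```

Read the other way, a claim that one chip needs 2Ghz to equal another at 1.5Ghz amounts to a claim that it achieves about three quarters of the other chip's IPC.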

The better question would be which processor will clock higher!

Mustang will smash P4 on a clock per clock basis.
 

beat mania

Platinum Member
Jan 23, 2000
2,451
0
76
And THAT is why Intel isn't competing on clock by clock with AMD.

BTW, I listened to an Intel Marketing P4 presentation ... I've never heard so much cr@p in my life ... and yes, it looks like Intel is sticking to its "makes the internet faster" ad campaign. And Intel has grouped games with Internet.
 

Noriaki

Lifer
Jun 3, 2000
13,640
1
71
What I'm hoping for is that for the Mustang they take the business end of the Athlon and pair it up with an excellent Branch Predictor (the K6's BP rocked). If they clean up the BP on the Mustang I'd say clock / clock it would be the fastest. I imagine clock for clock the Mustang will be slightly better than the TBird/CuMine, and most definitely better than the P4 clock/clock, but Intel is relying on the P4 having very high clock speeds so I don't know if that's a 100% fair comparison.

But ignoring fairness and reality for a moment :) In your theoretical situation I'll say Mustang.
 

Soccerman

Elite Member
Oct 9, 1999
6,378
0
0
Noriaki, that's exactly what I think AMD will do.

they launched the K7 with an inferior branch predictor (compared to the K6-2's), probably just so that they could upgrade it later on (a la the next line of cores, mustang, palomino etc).
 

beat mania

Platinum Member
Jan 23, 2000
2,451
0
76
ugh ... now everyone is talking about branch prediction ... and then soon people will start talking about history tables and all that.

I think Intel should revive 486 using .13um or .11 or .06 or whatever just for the hell of it...486 10 Ghz DX400 =D then we'd see who's fast.
 

Mixxen

Golden Member
Mar 10, 2000
1,154
0
0
I think AMD released an inferior BP so that they could get the Athlons out the door to compete with the PII/IIIs ASAP. Now pair the K6's superior BP and the Athlon's power and you've got the Mustang kicking butt all over the place.

The P4 will definitely lose clock vs. clock even if paired up against the P3 or TBird, because the pipeline is very deep, and any misprediction will be very expensive.
 

Sephiroth_IX

Diamond Member
Oct 22, 1999
5,933
0
0
The Mustang would be the fastest of the four due to improved branch prediction. The P3 Coppermine would be second best, due to the 256-bit L2 cache bus (also shared by the Mustang, I am assuming) in comparison to the T-Bird's 64-bit. As clock speeds increase, the performance crown goes to the P3 (vs Tbird) due to this simple fact.

The P4 has far too long of a pipeline to be able to outrun any of the processors on a clock for clock basis, without **incredible** SSE2 optimization. (Intel benchmarks, anyone?)

 

Mixxen

Golden Member
Mar 10, 2000
1,154
0
0
The Intel486 was not superscalar. So the 486 could only execute at most...one instruction per clock. Thus there would be no way in hell the 486 could compete clock vs clock.
 

sciencewhiz

Diamond Member
Jun 30, 2000
5,885
8
81
I read somewhere that when running most modern applications, about 0.7 instructions are executed per clock. If this is the case, a 1 ghz 486 would be very appealing. It would have to have integrated cache, though.
 

ragiepew

Golden Member
Oct 9, 1999
1,899
0
0
sciencewhiz... not true... the p3 averages slightly above 2 instructions per clock while the athlon goes in at about ~2.6 (i could be a bit off so please correct me on the exact numbers).

My vote goes for the Mustang clock for clock... but the clockspeed crown will be in intel's hands until clawhammer...
 

OneEng

Senior member
Oct 25, 1999
585
0
0
ragiepew,

<< but the clockspeed crown will be in intel's hands until clawhammer.. >>

Don't be so sure.

Mustang on the .18um copper process should ramp to 1.5Ghz with little or no problem. Since P4 will not be available this year in any significant volume (Intel has alluded to this as "limited quantities"), Q1 next year would be the earliest that it will have any impact on the market. I expect Mustang to be released at 1.3Ghz and ramp up to around 1.6Ghz before the die shrink.

AMD's existing equipment can produce .15um with only minor changes. With this die shrink I expect AMD to easily ramp the Mustang and its variants to 2Ghz+. Intel has been talking about ramping P4 to 2Ghz by H2 next year. This keeps the 2 companies about even.

AMD has a proven copper process. I doubt Intel can reach 2Ghz on aluminum interconnects. There is this little issue of electromigration that no one is talking about. I strongly suspect that a chip produced on aluminum interconnect technology running near 2Ghz would fail due to contamination of the silicon region near the interconnects.

Check out the section in this article on electromigration for a more complete description of the issue: here
 

VladTrishkin

Senior member
Sep 11, 2000
421
0
0
Slow down people..


Rigoletto:


<< The CPU with the fewest pipeline stages has the best chance because each stage is doing more work whilst fitting into the clock cycle. >>



-I know what you are trying to tell us, but this is not 100% accurate. A longer pipeline has some advantages as well.



<< No doubt the Athlon with the 10 stage pipeline will be doing better than the P4 with its 20 stage pipeline when we shift the goalposts inappropriately like this. Forget about the P3, though that should be slightly better than the P4 clock for clock.
>>



-both P3 and Athlon/Tbird have a 10 stage pipeline.



<< Here is one way to approach the problem: The p4 is going to kill all the other processors because of SSE2.
>>




PCResources:


<< SSE2 will not kill anything. Just like SSE1 didn't help much.

There are some advantages to the P4 design, but I wouldn't call SSE2 one of them.
>>



-Advanced SIMD extensions can improve overall CPU performance by up to 75% or more. I would call that a MAJOR part in this performance spectrum. SSE2 will feature 128 (or more) new SIMD optimizations.


 

VladTrishkin

Senior member
Sep 11, 2000
421
0
0
PII = 12 stage pipeline
PIII (Katmai) = 12 stage pipeline
PIII (Coppermine) = 10 stage pipeline
P4 Willamette = 20 stage pipeline

AMD K6 = 6 stage pipeline
Athlon/Tbird = 10 stage pipeline
Mustang = 12 stage (most likely) pipeline


 

Rigoletto

Banned
Aug 6, 2000
1,207
0
0
I still think the P4 risks falling flat on its face for the first six months because it will carry such a price premium and people will say: "Do we need the extra 20% speed we might get out of this AT MOST, and all the teething problems too?"
 

sciencewhiz

Diamond Member
Jun 30, 2000
5,885
8
81
The p4 IS going to fall flat on its face (if you only take into account savvy computer users).

Unfortunately, savvy computer users make up less than 1% of the market. Why do you think AMD still has such a small market share?

Edit: I am still looking for the article I read about the average IPC.
 

Rigoletto

Banned
Aug 6, 2000
1,207
0
0
I wrote in another thread about salesmen scaring and misinforming people into buying the P4. However I think that could be rather difficult in a showroom. The customer will ask:
"Well, if what you say is really true, then why are there these other computers that are so much cheaper that people are buying?"