why are the clock frequencies different?

The Pentium Guy

Diamond Member
Jan 15, 2005
4,327
1
0
You might want to use the Search button :p. You're a senior member and you don't know this? Intel and AMD chips have different IPCs (instructions per cycle).
Performance = Frequency x IPC

I believe AMD manages an IPC of about 9 and Intel about 6, which is why a 2 GHz AMD is comparable to a 3 GHz Intel (9/6 = 1.5, and 2 x 1.5 = 3).
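To put rough numbers on that formula, here's a quick Python sketch. (The 9 and 6 are the hypothetical peak figures quoted above, not sustained rates, so treat this as back-of-envelope only.)

```python
def relative_performance(freq_ghz, ipc):
    # crude model from the post: performance = frequency x IPC
    return freq_ghz * ipc

# hypothetical peak IPC figures from the post: 9 for AMD, 6 for Intel
amd = relative_performance(2.0, 9)
intel = relative_performance(3.0, 6)
print(amd, intel)  # both come out to 18.0 under this crude model
```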
 

theMan

Diamond Member
Mar 17, 2005
4,386
0
0
A 2.4 GHz A64 will beat a 3.8 GHz Pentium 4 in almost everything, especially gaming. That number is how many clock cycles the CPU runs per second, but not every clock is equal. The A64 does about 9 instructions per clock and the P4 only about 6, so the A64 gets more done each cycle and doesn't need to run as fast. It's like a water pipe: one pipe is 6 inches wide and one is 9. To get the same water output, the water in the 9-inch pipe doesn't need to move as fast.
 

Lonyo

Lifer
Aug 10, 2002
21,938
6
81
^^
One does more work per clock than the other, so it can perform equally well despite lower clock speeds.
But this approach makes it harder to increase clock frequency, so it's a trade-off.
 

theMan

Diamond Member
Mar 17, 2005
4,386
0
0
But even though it's harder to increase the clock, a 6 MHz overclock on an A64 equals a 9 MHz overclock on a P4. So for the same number of MHz gained, the A64 gains more performance.
 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
Another victim of the MHz myth. :)

The frequency isn't the one and only factor that determines performance; MANY things contribute to it. The Pentium Guy's explanation is good enough as a first approximation, but as I said, many things affect performance... if you really want to know, I suggest you head over to Arstechnica.com and check out their articles on CPUs. Read the ones about pipelining, super-threading, multi-threading, hyper-threading, the K8/Hammer core, and the Prescott core. The one about the Cell is also an interesting read. Browsing these forums for all the AMD vs. Intel threads is another way to find more detailed info.
 

JDCentral

Senior member
Jul 14, 2004
372
0
0
Also... IPC is an imperfect measure of performance, because the bottom line is that no CPU will ALWAYS run at its rated IPC.

Those figures might be maximums, or maybe 'average' IPC.

Since both chips are pipelined (and both pipelines are very deep), there is probably a significant amount of time spent 'backing out' instructions and so on. Also, the deeper the pipeline, the more time each INDIVIDUAL instruction takes to move through it. And a deep pipeline is harder to keep full, so you might not be getting that many instructions per cycle... but you can crank the MHz higher with a deeper pipeline. (Not sure if instructions move through the ENTIRE pipeline once per cycle, or just one stage per clock cycle - can somebody confirm which? I would think it would be one stage per cycle.)

Also... if the CPU finds an instruction that depends on another one, well, it has no option except to run them in order. And if it's a divide instruction (GOD NO!!!), it's going to take many times longer than a simple add.

hehe... CPU performance is never cut and dried, and depends heavily on what you're using the chips for. Yes, they're general purpose, but the bottom line is that Intel will be better at some things and AMD will be better at others. MHz is just how fast those synchronous circuit elements (registers n' flip-flops) update themselves.

EDIT: Also, floating-point instructions are entirely different - I guess the P4's FPU can only do about 1 instruction per cycle. 'Simple' integer arithmetic can be done at a much faster rate, while 'complex' integer ops take a little longer (but are still faster than floating point).
 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
One stage per cycle. The Arstechnica Pipelining articles illustrate exactly what goes on VERY well.
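To make the one-stage-per-cycle point concrete, here's a toy Python model (assuming a made-up 20-stage pipe with no stalls or flushes): each instruction pays the full pipeline latency, but once the pipe is full, one instruction retires every cycle.

```python
def cycles_to_finish(n_instructions, depth):
    # one stage per clock: the first instruction needs `depth` cycles to
    # reach the end of the pipe; after that, one instruction retires per
    # cycle (assumes no stalls or flushes)
    return depth + (n_instructions - 1)

print(cycles_to_finish(1, 20))    # 20: a lone instruction pays full latency
print(cycles_to_finish(100, 20))  # 119: ~1 per cycle once the pipe is full
```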
 

BitByBit

Senior member
Jan 2, 2005
474
2
81
Back in the days before pipelining, a clock cycle was the entire progression of an instruction through the four main stages:
Fetch -> Decode -> Execute -> Store.
There was only ever one instruction being worked on at any one time, meaning that in the case of very early x86 processors, only a quarter of the processor's resources were in use at any given moment.
Because the processor could only work on one instruction at a time, increasing the instruction latency through increasing the number of stages in the processor core would have had a direct impact on performance, since performance in non-pipelined processors was entirely determined by the time it took to execute each instruction.
The longer an instruction took to execute, the longer the delay in executing other instructions.
Pipelining completely changed all of this.
Now, instruction latency does not have a direct impact on performance, and it can be said that there is no direct relationship between IPC and pipeline depth! With pipelining, an instruction doesn't have to wait for the previous instruction to clear the pipeline; instructions enter the pipeline one after the other, which obviously allows far more efficient use of execution resources.
Previously, with non-pipelined processors, a single clock cycle was the progression of an instruction through all the processor's execution stages, so an instruction was completed every clock cycle.
With pipelined processors, an instruction is completed every clock pulse once the pipeline is full, which is why a clock pulse is now referred to as a clock cycle.
As I've stated above, one of the biggest misconceptions is that pipeline depth directly affects a processor's IPC - the number of instructions it can execute per clock.
It doesn't. Looked at from a theoretical perspective, increasing the number of pipeline stages has absolutely no effect on the maximum throughput a processor is capable of, thanks to pipelining.
Doubling the number of pipeline stages does not halve IPC, but it does have an impact on instruction latency - the time it takes to flush and refill the pipeline. If there are lots of pipeline refills (often as a result of a mispredicted branch), then performance is going to suffer. However, if improvements are made to the branch predictor, then the impact of this can be minimized.
If we take a look at Prescott vs. Northwood benchmarks, we see that the two are always within a few percent of each other, with Prescott occasionally ahead, despite its having a 55% deeper pipeline (and a 55% higher instruction latency at the same clock speed).
What should be mentioned here is that the whole point of implementing more pipeline stages is to allow each stage to complete more quickly. If we can halve the time an instruction takes per pipeline stage, we can double the clock speed, which doubles our maximum theoretical throughput while in this case achieving the same instruction latency(!).
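Here's that argument with made-up numbers in Python: splitting each stage in two halves the per-stage delay, which doubles the clock while the total time an instruction spends in the pipe stays the same.

```python
# baseline: 10 stages at 2 ns each (hypothetical numbers)
depth, stage_ns = 10, 2.0
clock_ghz = 1 / stage_ns           # 0.5 GHz
latency_ns = depth * stage_ns      # 20 ns for one instruction end-to-end

# split every stage in two: 20 stages at 1 ns each
depth2, stage_ns2 = 20, 1.0
clock_ghz2 = 1 / stage_ns2         # 1.0 GHz: the clock doubles
latency_ns2 = depth2 * stage_ns2   # still 20 ns: same instruction latency

print(clock_ghz, clock_ghz2, latency_ns, latency_ns2)
```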
What affects IPC directly is execution width - the number of instructions that can be executed in parallel.
If we double the number of execution units, we double the theoretical IPC maximum.
However, adding execution units is no simple matter. It requires additional logic to find instructions the execution core can run in parallel, since x86 code itself is seldom written in a way that makes life easy for multiple execution units.
This is especially true of integer code. Some instructions require the results of previous instructions and cannot be executed in parallel.
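A small sketch of why dependent instructions cap the benefit of a wider core (the numbers are invented): the best case is bounded both by raw issue width and by the longest serial dependency chain.

```python
import math

def min_cycles(n_instructions, issue_width, longest_chain):
    # lower bound on cycles: limited by raw issue width AND by the longest
    # serial dependency chain (dependent instructions can't run in parallel)
    return max(math.ceil(n_instructions / issue_width), longest_chain)

print(min_cycles(12, 3, 2))   # 4: width-limited, extra units are used
print(min_cycles(12, 3, 10))  # 10: chain-limited, extra units sit idle
```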

Intel and AMD took two different approaches to processor design.
Intel took a serial approach, AMD a parallel approach. The serial method allows higher clock speeds, while the parallel method allows higher IPC.
Thus, the clock speeds of these two architectures cannot be compared directly.
 

govtcheez75

Platinum Member
Aug 13, 2002
2,932
0
76
how about the soon to be extinct PowerPC processors? ;)

....They are RISC chips. I forget what I read about their performance levels compared to Intel's MHz.
 

mindgam3

Member
May 30, 2005
166
0
0
Soon to be extinct?? I think not, sir. Don't forget about the huge deals IBM has with Microsoft and Nintendo - they're supplying the next-gen consoles with CPUs, which means millions of chips!
 

JDCentral

Senior member
Jul 14, 2004
372
0
0
WHAT?!?!

PowerPC is nowhere near extinct!
If I recall correctly, something like 4 out of the top 10 supercomputers in the world run PowerPC chips.

Just because Apple is running from PPC doesn't mean the rest of the world is.
The G4 is an excellent chip to use if you're packing thousands of them into the room...

EDIT: RISC and CISC are basically the same thing nowadays. If it's labeled as RISC, it was DESIGNED as RISC but has a lot of CISC properties... and vice versa. Unless you've got some embedded-system type of thing going.

EDIT2: Does a deeper pipeline indirectly create lower IPC (averaged over time)? If you mispredict a branch (approx 50% of the time?), don't you need to back out more instructions with a deeper pipeline? Which means that you essentially spend 20 cycles backing out and refilling the pipe (assuming a 20-stage pipeline)? Also... if an instruction MUST go through (and complete) before any others, doesn't that also affect the IPC, since you're getting something like 1/20 IPC for that instruction (again, a 20-stage pipe)?
 

BitByBit

Senior member
Jan 2, 2005
474
2
81
Originally posted by: JDCentral
EDIT2: Does a deeper pipeline indirectly create lower IPC (averaged over time)? If you mispredict a branch (approx 50% of the time?), don't you need to back out more instructions with a deeper pipeline? Which means that you essentially spend 20 cycles backing out and refilling the pipe (assuming a 20-stage pipeline)?

A deeper pipeline does have a negative impact on IPC, especially if the branch predictor hasn't been improved accordingly. However, it does not cause the colossal drop in IPC that many claim, and it is a common misconception that the Athlon has a higher IPC than the P4 simply because it has a shorter pipeline.
If IPC and pipeline depth were directly related, then no improvement to the branch predictor or other logic could ever compensate, and Prescott would be a truly dismal performer.
On the subject of branch misprediction, it is true, as you say, that the deeper the pipeline, the more costly a flush and refill is. But I think the idea of branches being mispredicted 50% of the time is a little ill-conceived; if you read up on Prescott, you'll see that its branch predictor is > 99% accurate. This means that branch misprediction occurs < 1% of the time.
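The classic stall model makes the difference between those two misprediction rates easy to see. (Python sketch with assumed numbers: 20% of instructions are branches, a flush costs 20 cycles, and base CPI is 1.)

```python
def effective_ipc(base_cpi, branch_freq, mispredict_rate, flush_penalty):
    # classic stall model: each mispredicted branch adds a full
    # flush-and-refill penalty, in cycles, on top of the base CPI
    cpi = base_cpi + branch_freq * mispredict_rate * flush_penalty
    return 1 / cpi

# assumed: 20% branches, 20-cycle flush, base CPI of 1
print(round(effective_ipc(1.0, 0.2, 0.01, 20), 2))  # 0.96 at 99% accuracy
print(round(effective_ipc(1.0, 0.2, 0.50, 20), 2))  # 0.33 if half mispredict
```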

Also... if an instruction MUST go through (and complete) before any others, doesn't that also affect the IPC, since you're getting something like 1/20 IPC for that instruction (again, a 20-stage pipe)?

No, since the number of clock cycles spent executing instructions is unchanged, and a stall results in the same number of wasted clock cycles.
The only real impact comes in the form of flushing and refilling the pipeline.


 

JDCentral

Senior member
Jul 14, 2004
372
0
0
Thanks for the clarification... the simple 'MIPS' architecture stuff taught in intro computer architecture classes is quite different from what's in the x86 world ;-).

I didn't think it had a MASSIVE impact on IPC/performance - but it kind of seemed like it should have.
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
Originally posted by: The Pentium Guy
You might want to use the Search button :p. You're a senior member and you don't know this? Intel and AMD chips have different IPCs (instructions per cycle).
Performance = Frequency x IPC

I believe AMD manages an IPC of about 9 and Intel about 6, which is why a 2 GHz AMD is comparable to a 3 GHz Intel (9/6 = 1.5, and 2 x 1.5 = 3).

There's more to performance than IPC, though. The G5 has a higher IPC than the Athlon yet performs worse, and the P3 and Pentium M have lower IPCs yet can perform better per MHz.

It really comes down more to the efficiency of the system as a whole: how well the CPU is doing its job, and how fast communication with memory is.
 

JDCentral

Senior member
Jul 14, 2004
372
0
0
Originally posted by: Fox5
... how fast communication with memory is.

Memory is something like 10-100 times slower than the CPU... so when a CPU has to wait for memory, it's basically 'forever' from the processor's point of view.

However, modern caching systems only miss about 1% of the time, so it's not that big a deal.
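Though even that 1% matters more than it sounds: a quick average-memory-access-time sketch in Python, with assumed latencies (1-cycle hit, 100-cycle trip to main memory), shows the rare misses dominating.

```python
def amat(hit_cycles, miss_rate, miss_penalty_cycles):
    # average memory access time: hits are cheap, the rare misses are not
    return hit_cycles + miss_rate * miss_penalty_cycles

# assumed: 1-cycle cache hit, 1% miss rate, 100-cycle miss penalty
print(amat(1, 0.01, 100))  # 2.0: that 1% of misses doubles the average
```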
 
Jun 11, 2005
70
0
0
Not exactly sure of the term, but it has something to do with the pipeline being far shorter in the AMD processor than in the Intel one... that, and the work done per clock.
 

Dothan

Banned
Jun 5, 2005
90
0
0
everyone knows AMD uses this ridiculous + to try and confuse people

Intel has the MHz and GHz advantage !!!

A 2.4GHz AMD is not equal to 3GHz of P4 power !!!
 

MDme

Senior member
Aug 27, 2004
297
0
0
yes, I agree with dothan 100%: a 2.4Ghz A64 is not equal to a 3.0Ghz P4. It's actually closer to a 3.4-3.6Ghz P4.
 

MDme

Senior member
Aug 27, 2004
297
0
0
Originally posted by: Dothan
everyone knows AMD uses this ridiculous + to try and confuse people

Intel has the MHz and GHz advantage !!!

A 2.4GHz AMD is not equal to 3GHz of P4 power !!!

yes, I agree with dothan 100%: a 2.4Ghz A64 is not equal to a 3.0Ghz P4. It's actually closer to a 3.4-3.6Ghz P4.


 

Dothan

Banned
Jun 5, 2005
90
0
0
Originally posted by: MDme
yes, I agree with dothan 100% a 2.4Ghz A64 is not equal to a 3.0Ghz P4. It's actually closer to 3.4-3.6Ghz P4 actually.

double-posting little troll !!!

don't you wish !!! My Prescott-2 660 will DEMOLISH Athlon 64 !!!

AMD fanatics never admit the truth !!!

Let us have a benchmark duel to settle this if you wish !!!

 

sangyup81

Golden Member
Feb 22, 2005
1,082
1
81
Dothan, you have never posted a benchmark or a screenshot in any of your 25 posts so far. Why don't you take your Prescott and start a thread challenging everyone, instead of making claims every time anyone insults a Pentium 4?
 

JDCentral

Senior member
Jul 14, 2004
372
0
0
Originally posted by: Dothan
everyone knows AMD use this ridiculous + to try and confuse people

Intel has the MHz and GHz advantage !!!

A 2.4GHz AMD is not equal to 3GHz of P4 power !!!

are you... on crack?