Just how long are we going to keep playing the IPC game?

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
I'm sure Intel and, to a lesser extent, AMD realize that they can only extract so much instruction-level parallelism, and that increasing IPC beyond that point will no longer provide any meaningful performance gains.

IPC gains can come from wider SIMD extensions or just new ISA extensions in general, pipelining, out-of-order execution, and register renaming.

Out of that list, the most obvious way to increase IPC on future processors is extending the SIMD unit, and I find it highly unlikely that a lot of applications can benefit from a 16-wide execution unit, even high-performance workloads that primarily run on the CPU. This is a big issue, since a lot of programs that run on the CPU aren't friendly to vectorization, which leads to minimal or no gains from moving to a wider SIMD unit.

Pipelining and OoOE do not resolve data dependencies in execution. In other words, a program still needs to prevent race conditions to get deterministic results when an operation depends on a previous output.

Truth be told, I'm not very excited about the IPC gains we'll see from Skylake. My expectations are very low at the moment for the new microarchitecture.

A higher IPC translating to higher performance is starting to sound more and more like a charade as time goes on. Sooner or later I expect the "megahertz myth" to be replaced by the "IPC myth".
 

NTMBK

Lifer
Nov 14, 2011
10,461
5,845
136
IPC gains can come from wider SIMD extensions

Wrong. New instruction sets do not increase instructions per clock; they just change what those instructions are. Haswell's IPC when executing AVX2 code is the same as when executing SSE4.2 code. In fact it can be even lower, because the increased demands on the memory subsystem increase cache stalls. However, the total throughput is massively increased.
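
A toy calculation makes the distinction concrete. This is only a sketch: the `ipc_and_throughput` helper, the 1024-element workload and the idealized one-instruction-per-cycle core are all assumptions for illustration, not measurements of Haswell or any real chip.

```python
def ipc_and_throughput(elements, lanes, cycles_per_instruction=1.0):
    """Toy model: one instruction processes `lanes` elements.

    Returns (instructions, cycles, ipc, elements_per_cycle).
    The fixed cycles_per_instruction is a simplifying assumption.
    """
    instructions = -(-elements // lanes)  # ceiling division
    cycles = instructions * cycles_per_instruction
    ipc = instructions / cycles
    elements_per_cycle = elements / cycles
    return instructions, cycles, ipc, elements_per_cycle


scalar = ipc_and_throughput(elements=1024, lanes=1)  # scalar code
vector = ipc_and_throughput(elements=1024, lanes=8)  # 8-wide SIMD code

print("scalar:", scalar)
print("8-wide:", vector)
```

In this model both versions retire one instruction per cycle, so IPC is identical. The 8-wide version finishes in an eighth of the cycles only because each instruction does eight elements' worth of work.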
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,738
156
106
I personally miss the "moahr cores!" game ...
But regardless of how we get there, at the end of the day, more performance and perf/watt are all that matter.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
I'm sure Intel and to a lesser extent, AMD realize that they can only extract so much instruction level parallelism and increasing the IPC after that will no longer provide any meaningful performance gains.
If they can get more, it will. In AMD's case, the if is very real, given their financials over the last many years.

IPC gains can come from wider SIMD extensions or just new ISA extensions in general, pipelining, out-of-order execution, and register renaming.
SIMD offers 0, yes, zero, IPC gains. Nada. Zip. Zilch. Pipelining may or may not, depending on other factors (pipelining is so we can have 4GHz CPUs, instead of 400MHz ones).

Pipelining and OoOE do not resolve data dependencies in execution. In other words, a program still needs to prevent race conditions to get deterministic results when an operation depends on a previous output.
Nothing at this low a level has anything at all to do with race conditions, nor with any other way a program may be nondeterministic. Determinism is maintained by superscalar pipelined OoOE CPUs, just as if the code executed on a 1-wide in-order non-pipelined CPU, within specified limits (very few CPU ISAs require exact order in committing changes to memory, and x86 has never been one of them).
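
What data dependencies do limit is how many instructions can issue per cycle, not whether the result is correct. A rough sketch of that, using a made-up scheduler model (`schedule_cycles`, the 1-cycle latency and the issue widths are all assumptions, nothing like a real microarchitecture):

```python
def schedule_cycles(deps, width):
    """Cycles a toy out-of-order core needs to execute a program.

    deps[i] lists the EARLIER instructions whose results instruction i needs.
    Every instruction has 1-cycle latency; each cycle, up to `width`
    instructions whose inputs are already available issue, oldest first.
    """
    done = set()                      # instructions that have executed
    pending = list(range(len(deps)))  # program order
    cycle = 0
    while pending:
        cycle += 1
        # Ready = all producers finished in a previous cycle.
        ready = [i for i in pending if all(d in done for d in deps[i])]
        issued = ready[:width]
        done.update(issued)
        pending = [i for i in pending if i not in done]
    return cycle


# Eight instructions, each consuming the previous one's result.
chain = [[]] + [[i - 1] for i in range(1, 8)]
# Eight instructions with no dependencies at all.
independent = [[] for _ in range(8)]

for name, prog in (("chain", chain), ("independent", independent)):
    for width in (1, 4):
        cycles = schedule_cycles(prog, width)
        print(name, "width", width, "cycles", cycles, "IPC", len(prog) / cycles)
```

The dependent chain takes the same number of cycles at any width, while the independent instructions scale with width. In both cases every instruction only ever sees inputs that were produced before it, which is why the answer matches a 1-wide in-order machine.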

A higher IPC translating to higher performance is starting to sound more and more like a charade as time goes on. Sooner or later I expect the "megahertz myth" to be replaced by the "IPC myth".
Just one problem: we have benchmarks galore showing otherwise, going back many years. Also, it's a theoretical impossibility, so long as the clock speeds are the same, that higher IPC does not translate into higher performance, because that's literally what higher IPC means.
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
It is a constant, fine-grained process. There is always more IPC that can be extracted. If a pair of instructions takes 5 clock cycles to execute, and your profiler says you will hit those instructions often enough to make it worth it, then they will spend a million transistors to shave 1 clock cycle off that sequence of instructions. If we could stop time, we could profile and optimize through hundreds of iterations. After all that, we would have a CPU that has twice the IPC of current CPUs, even on the exact same process.
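
How much one of those tweaks buys depends on how often the sequence shows up. A back-of-the-envelope sketch using Amdahl's law (the `overall_speedup` helper and the 2% figure are assumptions for illustration; only the 5-to-4-cycle example comes from above):

```python
def overall_speedup(fraction_of_time, old_cycles=5, new_cycles=4):
    """Amdahl's law for speeding up one instruction sequence.

    fraction_of_time: share of total runtime spent in that sequence
                      before the change (hypothetical input).
    """
    local_speedup = old_cycles / new_cycles
    return 1.0 / ((1.0 - fraction_of_time) + fraction_of_time / local_speedup)


# If the pair accounts for 2% of runtime, the whole program gains well under 1%.
print(overall_speedup(0.02))
# Only if the program were nothing but that pair would it gain the full 25%.
print(overall_speedup(1.0))
```

Each individual fix is tiny, which is why it takes many such iterations to add up to a large IPC gain.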
 

blake0812

Senior member
Feb 6, 2014
788
4
81
I'm sure Intel and to a lesser extent, AMD realize that they can only extract so much instruction level parallelism and increasing the IPC after that will no longer provide any meaningful performance gains.

IPC gains can come from wider SIMD extensions or just new ISA extensions in general, pipelining, out-of-order execution, and register renaming.

Out of that list, the most obvious way to increase IPC on future processors is extending the SIMD unit, and I find it highly unlikely that a lot of applications can benefit from a 16-wide execution unit, even high-performance workloads that primarily run on the CPU. This is a big issue, since a lot of programs that run on the CPU aren't friendly to vectorization, which leads to minimal or no gains from moving to a wider SIMD unit.

Pipelining and OoOE do not resolve data dependencies in execution. In other words, a program still needs to prevent race conditions to get deterministic results when an operation depends on a previous output.

Truth be told, I'm not very excited about the IPC gains we'll see from Skylake. My expectations are very low at the moment for the new microarchitecture.

A higher IPC translating to higher performance is starting to sound more and more like a charade as time goes on. Sooner or later I expect the "megahertz myth" to be replaced by the "IPC myth".

I could understand it better if I knew what all those terms meant. Why isn't there a glossary sticky? :confused:

Anyway, hopefully AMD will catch up to Intel's level in the IPC department.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.

It increases performance/clock, and that's what we all mean by IPC, don't we?
 

videogames101

Diamond Member
Aug 24, 2005
6,783
27
91
I'll be releasing a new CPU arch later this year with 100x the IPC of the newest i7.

It will also be running at 20MHz.




It's important to remember that the instructions per clock (IPC) of a CPU can very easily be increased by lengthening the clock cycle and having deeper logic trees, and that higher IPC is not always the best way to increase performance, because it often comes at the expense of clock speed. It's not as simple as just increasing IPC; you have to do it while keeping your cycle times low.
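
The arithmetic behind the joke, as a quick sketch. The i7's IPC is normalized to 1.0 and its clock is assumed to be 4 GHz; both are placeholder numbers, not specs of any particular part.

```python
def instructions_per_second(ipc, clock_hz):
    """Throughput is simply IPC multiplied by clock frequency."""
    return ipc * clock_hz


i7 = instructions_per_second(ipc=1.0, clock_hz=4.0e9)     # assumed baseline
joke = instructions_per_second(ipc=100.0, clock_hz=20e6)  # 100x IPC at 20 MHz

print(joke / i7)  # relative performance of the "100x IPC" design
```

Under those assumptions the 100x-IPC chip delivers half the throughput of the baseline, because the clock dropped by a factor of 200.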

Fortunately, we still have Uncle Moore donating smaller gate delays every 2 years. Even if it seems like clock speeds are stalled, they aren't really. Intel/AMD are just choosing to keep the same clock speeds and increase IPC, which is probably the right choice from a power perspective.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,811
1,290
136
Efficient, high IPS (instructions per second) is what consumers want.

80,000 at arb. 50 watt.

Is better than
100,000 at arb. 100 watt
or
60,000 at arb. 40 watt.
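
Reading that ranking as instructions per second per watt (my interpretation; the metric isn't spelled out above), the numbers work out like this:

```python
# (IPS, watts) pairs taken from the three examples above.
designs = {
    "80,000 @ 50 W": (80_000, 50),
    "100,000 @ 100 W": (100_000, 100),
    "60,000 @ 40 W": (60_000, 40),
}

# Efficiency = instructions per second per watt.
efficiency = {name: ips / watts for name, (ips, watts) in designs.items()}
best = max(efficiency, key=efficiency.get)

for name, value in efficiency.items():
    print(f"{name}: {value:.0f} IPS/W")
print("most efficient:", best)
```

That gives 1600, 1000 and 1500 IPS per watt respectively, so the 80,000 @ 50 W design comes out ahead of both the faster, hungrier one and the slower, leaner one.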
 

blake0812

Senior member
Feb 6, 2014
788
4
81
I personally think they should be coming up with more improved instructions, not just the amount of them to be executed in a second.
 

NTMBK

Lifer
Nov 14, 2011
10,461
5,845
136
It increases performance/clock, and that's what we all mean by IPC, don't we?

No. IPC is a technical term with a very specific meaning. Misusing IPC like that is literally as bad as saying that new SIMD extensions let us increase frequency.

If people want to use technical terms, use them correctly. Or don't use them at all.
 
Mar 10, 2006
11,715
2,012
126
No. IPC is a technical term with a very specific meaning. Misusing IPC like that is literally as bad as saying that new SIMD extensions let us increase frequency.

If people want to use technical terms, use them correctly. Or don't use them at all.

It's also funny to see people compare ARM and X86 "IPC" -- the "I" are not comparable.

What people really mean is "singlethreaded performance per clock."
 

NTMBK

Lifer
Nov 14, 2011
10,461
5,845
136
It's also funny to see people compare ARM and X86 "IPC" -- the "I" are not comparable.

What people really mean is "singlethreaded performance per clock."

Yup, it's like comparing a Renault which can do 160kph and a Ford which can do 100mph and concluding that the Renault is obviously superior. Units matter, people!
 

III-V

Senior member
Oct 12, 2014
678
1
41
Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.
Well, that's basically what the term IPC has devolved into. It's now a term for per-clock performance. I don't think there's any going back, given its widespread misuse.
Truth be told, I'm not very excited about the IPC gains we'll see from Skylake. My expectations are very low at the moment for the new microarchitecture.
I think everyone's set you straight on the IPC thing, so I'll point out that 14nm desktop processors should see a return in clock scaling.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Well, that's basically what the term IPC has devolved into. It's now a term for per-clock performance. I don't think there's any going back, given its widespread misuse.

I think everyone's set you straight on the IPC thing, so I'll point out that 14nm desktop processors should see a return in clock scaling.

That begs the question... should we just let it go or keep educating people...

... ;) :p
 

cytg111

Lifer
Mar 17, 2008
26,254
15,665
136
That begs the question... should we just let it go or keep educating people...

... ;) :p

Well, if you have the deep desire to smack *wrong* in someone's face while the intended meaning was still quite easily deducible, then I'd say you should probably leave the teaching job to someone else. But that's just me :).
 

NTMBK

Lifer
Nov 14, 2011
10,461
5,845
136
This is technology, not politics. You can't redefine terms to fit your arguments.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
This is technology, not politics. You can't redefine terms to fit your arguments.

It's actually semantics (and democracy). If people don't find IPC useful in its strictest form and the majority wants to use it for an adjacent meaning instead, it will change meaning.
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.

Let's not argue about the semantics for a moment here ...


As far as most people are concerned, an operation is for the most part equivalent to an instruction, and vector instructions are, in effect, like executing multiple identical scalar instructions.


Arguably, the bigger issue with AVX and AVX2 is the lack of, or inefficient implementation of, vector addressing operations, and that irregular memory access patterns will erode the benefits of an SPMD execution paradigm.
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
Nothing at this low a level has anything at all to do with race conditions, nor any other way a program may be nondeterministic. Determinism is maintained by superscalar pipelined OOOE CPUs, just as if they executed on a 1-wide in-order non-pipelined CPU, within specified limits (very few CPU ISAs require exact order in committing changes to memory, and x86 has never been one of them).

This doesn't even begin to make any sense ...


Your second line here is entirely false, since that issue is handled by memory fences or by the hardware, and a lot of programmers describe the x86 memory model as strong.


Just one problem: we have benchmarks galore showing otherwise, going back many years. Also, it's a theoretical impossibility, so long as the clock speeds are the same, that higher IPC does not translate into higher performance, because that's literally what higher IPC means.


Increasing performance per clock wasn't much of an issue in the past, when we were very far from hitting the ILP wall, but now that we've practically exhausted every innovation, I don't think increasing IPC will do much in the long run ...
 

NTMBK

Lifer
Nov 14, 2011
10,461
5,845
136
Let's not argue about the semantics for a moment here ...


As far as most people are concerned an operation is equivalent to an instruction for the most part and plus vector instructions in hindsight are like executing multiple identical scalar instructions.


Arguably, the bigger issue with AVX and AVX2 is the lack of or inefficient implementations of vector addressing operations and that irregular memory access patterns will harm the benefits of a SPMD execution paradigm.

It is a fundamentally different concept. Improving IPC improves performance on existing code, which new ISA extensions do not.