
Just how long are we going to keep playing the IPC game?

ThatBuzzkiller

Golden Member
I'm sure Intel, and to a lesser extent AMD, realize that they can only extract so much instruction-level parallelism, and increasing the IPC after that will no longer provide any meaningful performance gains.

IPC gains can come from wider SIMD extensions or just new ISA extensions in general, pipelining, out-of-order execution, and register renaming.

Out of that list, the most obvious way to increase IPC on future processors is extending the SIMD unit, and I find it highly unlikely that a lot of applications can benefit from a 16-wide execution unit, even for high-performance workloads that primarily run on the CPU. This is a big issue, seeing as a lot of programs that run on the CPU aren't necessarily friendly to vectorization, which leads to minimal or no gains in the end from moving toward a wider SIMD unit.

Pipelining and OoOE do not resolve data dependencies in execution. In other words, a program still needs to prevent race conditions to get deterministic results when an operation depends on a previous output.

Truth be told, I'm not very excited about the IPC gains we'll see from Skylake. My expectations are very low at the moment for the new microarchitecture.

A higher IPC translating to higher performance is starting to sound more and more like a charade as time goes on. Sooner or later I expect the "megahertz myth" to be replaced by the "IPC myth".
 
IPC gains can come from wider SIMD extensions

Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.
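A back-of-the-envelope sketch (all numbers made up for illustration) makes the distinction concrete: whether a loop uses 128-bit SSE or 256-bit AVX2, the retired instructions per clock can stay exactly the same while the data throughput doubles.

```python
# Toy model, not measurements: data throughput is
# IPC x elements processed per instruction x clock frequency.
def throughput(ipc, elems_per_insn, freq_ghz):
    """Elements processed per second, in billions."""
    return ipc * elems_per_insn * freq_ghz

# Same IPC in both cases; only the width of each instruction changes.
sse_128bit = throughput(ipc=2.0, elems_per_insn=4, freq_ghz=3.5)   # 4 floats/insn
avx2_256bit = throughput(ipc=2.0, elems_per_insn=8, freq_ghz=3.5)  # 8 floats/insn

print(sse_128bit)    # 28.0 billion floats/s
print(avx2_256bit)   # 56.0 billion floats/s
```

IPC is identical in both lines; the gain comes entirely from each instruction doing more work.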
 
I personally miss the "moahr cores!" game ...
But regardless of how we get there, at the end of the day, more performance and perf/watt is all that matters
 
I'm sure Intel, and to a lesser extent AMD, realize that they can only extract so much instruction-level parallelism, and increasing the IPC after that will no longer provide any meaningful performance gains.
If they can get more, it will. In AMD's case, the if is very real, given their financials over the last many years.

IPC gains can come from wider SIMD extensions or just new ISA extensions in general, pipelining, out-of-order execution, and register renaming.
SIMD offers 0, yes, zero, IPC gains. Nada. Zip. Zilch. Pipelining may or may not, depending on other factors (pipelining is so we can have 4GHz CPUs, instead of 400MHz ones).

Pipelining and OoOE do not resolve data dependencies in execution. In other words, a program still needs to prevent race conditions to get deterministic results when an operation depends on a previous output.
Nothing at this low a level has anything at all to do with race conditions, nor any other way a program may be nondeterministic. Determinism is maintained by superscalar pipelined OOOE CPUs, just as if they executed on a 1-wide in-order non-pipelined CPU, within specified limits (very few CPU ISAs require exact order in committing changes to memory, and x86 has never been one of them).

A higher IPC translating to higher performance is starting to sound more and more like a charade as time goes on. Sooner or later I expect the "megahertz myth" to be replaced by the "IPC myth".
Just one problem: we have benchmarks galore showing otherwise, going back many years. Also, it's a theoretical impossibility, so long as the clock speeds are the same, that higher IPC does not translate into higher performance, because that's literally what higher IPC means.
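The arithmetic behind that last point can be made explicit with hypothetical numbers: at a fixed clock, performance is just IPC times frequency, so an IPC uplift is a performance uplift by definition.

```python
# Performance at a fixed clock: instructions/s = IPC * frequency (Hz).
# The figures below are hypothetical, purely for illustration.
freq_hz = 4.0e9

old_ipc, new_ipc = 2.0, 2.2   # a 10% IPC uplift
old_perf = old_ipc * freq_hz
new_perf = new_ipc * freq_hz

print(round(new_perf / old_perf, 2))  # 1.1 -- a 10% gain at the same clock
```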
 
It is a constant fine-grained process. There is always more IPC that can be extracted. If a pair of instructions takes 5 clock cycles to execute, and your profiler says you will face those instructions often enough to make it worth it, then they will spend a million transistors to shave off 1 clock cycle from that sequence of instructions. If we could stop time, we could profile and optimize through hundreds of iterations. After all that, we would have a CPU that has twice the IPC of current CPUs, even on the exact same process.
 
I'm sure Intel, and to a lesser extent AMD, realize that they can only extract so much instruction-level parallelism, and increasing the IPC after that will no longer provide any meaningful performance gains.

IPC gains can come from wider SIMD extensions or just new ISA extensions in general, pipelining, out-of-order execution, and register renaming.

Out of that list, the most obvious way to increase IPC on future processors is extending the SIMD unit, and I find it highly unlikely that a lot of applications can benefit from a 16-wide execution unit, even for high-performance workloads that primarily run on the CPU. This is a big issue, seeing as a lot of programs that run on the CPU aren't necessarily friendly to vectorization, which leads to minimal or no gains in the end from moving toward a wider SIMD unit.

Pipelining and OoOE do not resolve data dependencies in execution. In other words, a program still needs to prevent race conditions to get deterministic results when an operation depends on a previous output.

Truth be told, I'm not very excited about the IPC gains we'll see from Skylake. My expectations are very low at the moment for the new microarchitecture.

A higher IPC translating to higher performance is starting to sound more and more like a charade as time goes on. Sooner or later I expect the "megahertz myth" to be replaced by the "IPC myth".

I could understand it better if I knew what all those terms meant. Why isn't there a glossary sticky? 😕

Anyway, hopefully AMD will catch up to Intel's level in the IPC department.
 
Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.

It increases performance/clock, and that's what we all mean by IPC, don't we?
 
I'll be releasing a new CPU arch later this year with 100x the IPC of the newest i7.

It will also be running at 20MHz.
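The joke works because IPC alone says nothing; multiply it out. Assuming a purely illustrative i7 baseline of 4 instructions/clock at 4 GHz:

```python
# Hypothetical baseline: an i7 sustaining 4 instructions/clock at 4 GHz.
i7_ipc, i7_freq = 4, 4.0e9
i7_throughput = i7_ipc * i7_freq       # 1.6e10 instructions/s

# The "100x IPC" chip, running at 20 MHz:
new_ipc, new_freq = 100 * i7_ipc, 20e6
new_throughput = new_ipc * new_freq    # 8.0e9 instructions/s

print(new_throughput < i7_throughput)  # True: half the speed of the i7
```

A hundredfold IPC advantage still loses to a two-hundredfold clock deficit.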




It's important to remember that the instructions per clock (IPC) of a CPU can very easily be increased by lengthening the clock cycle and using deeper logic trees, and that higher IPC is not always the best way to increase performance, because it often comes at the expense of clock speed. It's not as simple as just increasing IPC; you have to do it while keeping your cycle times low.

Fortunately, we still have Uncle Moore donating smaller gate delays every 2 years. Even if it seems like clock speeds are stalled, they aren't really. Intel/AMD are just choosing to keep the same clock speeds and increase IPC, which is probably the right choice from a power perspective.
 
Efficient, large IPS (instructions per second) is what consumers want.

80,000 at an arbitrary 50 watts

is better than

100,000 at an arbitrary 100 watts
or
60,000 at an arbitrary 40 watts.
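Plugging the numbers in makes the point: the 50-watt option wins on instructions per second per watt (all figures arbitrary, as in the post above).

```python
# Instructions per second per watt for the three hypothetical chips.
options = {
    "80,000 IPS @ 50 W":   80_000 / 50,
    "100,000 IPS @ 100 W": 100_000 / 100,
    "60,000 IPS @ 40 W":   60_000 / 40,
}
for name, ips_per_watt in options.items():
    print(name, ips_per_watt)
# 80,000 IPS @ 50 W   1600.0  <- best efficiency
# 100,000 IPS @ 100 W 1000.0
# 60,000 IPS @ 40 W   1500.0
```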
 
I personally think they should be coming up with more improved instructions, not just the amount of them to be executed in a second.
 
It increases performance/clock, and that's what we all mean by IPC, don't we?

No. IPC is a technical term with a very specific meaning. Misusing IPC like that is literally as bad as saying that new SIMD extensions let us increase frequency.

If people want to use technical terms, use them correctly. Or don't use them at all.
 
No. IPC is a technical term with a very specific meaning. Misusing IPC like that is literally as bad as saying that new SIMD extensions let us increase frequency.

If people want to use technical terms, use them correctly. Or don't use them at all.

It's also funny to see people compare ARM and x86 "IPC" -- the "I" are not comparable.

What people really mean is "singlethreaded performance per clock."
 
It's also funny to see people compare ARM and x86 "IPC" -- the "I" are not comparable.

What people really mean is "singlethreaded performance per clock."

Yup, it's like comparing a Renault which can do 160 km/h and a Ford which can do 100 mph and concluding that the Renault is obviously superior. Units matter, people!
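Doing the conversion shows why the comparison is meaningless as stated: the two cars are nearly identical once both speeds are in the same unit.

```python
# 1 mile = 1.609344 km, so convert both top speeds to km/h before comparing.
MPH_TO_KPH = 1.609344

renault_kph = 160.0
ford_kph = 100.0 * MPH_TO_KPH   # the Ford's 100 mph in km/h

print(round(ford_kph, 1))       # 160.9 -- the Ford is marginally faster
```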
 
Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.
Well, that's basically what the term IPC has devolved into. It's now a term for per-clock performance. I don't think there's any going back, given its widespread misuse.
Truth be told, I'm not very excited about the IPC gains we'll see from Skylake. My expectations are very low at the moment for the new microarchitecture.
I think everyone's set you straight on the IPC thing, so I'll point out that 14nm desktop processors should see a return in clock scaling.
 
Well, that's basically what the term IPC has devolved into. It's now a term for per-clock performance. I don't think there's any going back, given its widespread misuse.

I think everyone's set you straight on the IPC thing, so I'll point out that 14nm desktop processors should see a return in clock scaling.

That begs the question... should we just let it go or keep educating people...

... 😉 😛
 
That begs the question... should we just let it go or keep educating people...

... 😉 😛

Well, if you have the deep desire to smack *wrong* in someone's face while the intended meaning was still quite easily deducible, then I'd say you should probably leave the teaching job to someone else. But that's just me 🙂.
 
This is technology, not politics. You can't redefine terms to fit your arguments.

It's actually semantics (and democracy). If people don't find IPC useful in its strictest form and the majority wants to use it for an adjacent meaning instead, it will change meaning.
 
Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.

Let's not argue about the semantics for a moment here ...


As far as most people are concerned, an operation is roughly equivalent to an instruction, and vector instructions are, in effect, like executing multiple identical scalar instructions.


Arguably, the bigger issue with AVX and AVX2 is the lack of, or inefficient implementations of, vector addressing operations, and that irregular memory access patterns will erode the benefits of an SPMD execution paradigm.
 
Nothing at this low a level has anything at all to do with race conditions, nor any other way a program may be nondetermistic. Determinism is maintained by superscalar pipelined OOOE CPUs, just as if they executed on a 1-wide in-order non-pipelined CPU, within specified limits (very few CPU ISAs require exact order in committing changes to memory, and x86 has never been one of them).

This doesn't even begin to make any sense ...


Your second line here is entirely false, since that issue is handled by memory fences or the hardware, and a lot of programmers describe the x86 memory model as being strong.


Just one problem: we have benchmarks galore showing otherwise, going back many years. Also, it's a theoretical impossibility, so long as the clock speeds are the same, that higher IPC does not translate into higher performance, because that's literally what higher IPC means.


Increasing performance per clock in the past wasn't much of an issue when you consider that we were very far away from hitting the ILP wall, but now that we've practically exhausted every innovation, I don't think increasing the IPC will do much in the long run ...
 
Let's not argue about the semantics for a moment here ...


As far as most people are concerned, an operation is roughly equivalent to an instruction, and vector instructions are, in effect, like executing multiple identical scalar instructions.


Arguably, the bigger issue with AVX and AVX2 is the lack of, or inefficient implementations of, vector addressing operations, and that irregular memory access patterns will erode the benefits of an SPMD execution paradigm.

It is a fundamentally different concept. Improving IPC improves performance on existing code, which new ISA extensions do not.
 