
Just how long are we going to keep playing the IPC game?

ThatBuzzkiller

Golden Member
I'm sure Intel, and to a lesser extent AMD, realize that they can only extract so much instruction-level parallelism, and increasing the IPC after that will no longer provide any meaningful performance gains.

IPC gains can come from wider SIMD extensions or just new ISA extensions in general, pipelining, out-of-order execution, and register renaming.

Out of that list, the most obvious way to increase IPC on future processors is extending the SIMD unit, and I find it highly unlikely that a lot of applications can benefit from a 16-wide execution unit, even for high-performance workloads that primarily run on the CPU. This is a big issue, seeing as a lot of programs that run on the CPU aren't necessarily friendly to vectorization, which leads to minimal or no gains in the end from moving toward a wider SIMD unit.

Pipelining and OoOE do not resolve data dependencies in execution. In other words, a program still needs to prevent race conditions to get deterministic results when an operation depends on a previous output.

Truth be told, I'm not very excited about the IPC gains we'll see from Skylake. My expectations are very low at the moment for the new microarchitecture.

A higher IPC translating to higher performance is starting to sound more and more like a charade as time goes on. Sooner or later I expect the "megahertz myth" to be replaced by the "IPC myth".
 
IPC gains can come from wider SIMD extensions

Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.
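A back-of-the-envelope sketch (all numbers made up for illustration) makes the distinction concrete: whether a loop uses 128-bit SSE or 256-bit AVX2, the retired instructions per clock can stay exactly the same while the data throughput doubles.

```python
# Toy model, not measurements: data throughput is
# IPC x elements processed per instruction x clock frequency.
def throughput(ipc, elems_per_insn, freq_ghz):
    """Elements processed per second, in billions."""
    return ipc * elems_per_insn * freq_ghz

# Same IPC in both cases; only the width of each instruction changes.
sse_128bit = throughput(ipc=2.0, elems_per_insn=4, freq_ghz=3.5)   # 4 floats/insn
avx2_256bit = throughput(ipc=2.0, elems_per_insn=8, freq_ghz=3.5)  # 8 floats/insn

print(sse_128bit)    # 28.0 billion floats/s
print(avx2_256bit)   # 56.0 billion floats/s
```

IPC is identical in both lines; the gain comes entirely from each instruction doing more work.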
 
I personally miss the "moahr cores!" game ...
But regardless of how we get there, at the end of the day, more performance and perf/watt is all that matters
 
I'm sure Intel, and to a lesser extent AMD, realize that they can only extract so much instruction-level parallelism, and increasing the IPC after that will no longer provide any meaningful performance gains.
If they can get more, it will. In AMD's case, the if is very real, given their financials over the last many years.

IPC gains can come from wider SIMD extensions or just new ISA extensions in general, pipelining, out-of-order execution, and register renaming.
SIMD offers 0, yes, zero, IPC gains. Nada. Zip. Zilch. Pipelining may or may not, depending on other factors (pipelining is so we can have 4GHz CPUs, instead of 400MHz ones).

Pipelining and OoOE do not resolve data dependencies in execution. In other words, a program still needs to prevent race conditions to get deterministic results when an operation depends on a previous output.
Nothing at this low a level has anything at all to do with race conditions, nor any other way a program may be nondeterministic. Determinism is maintained by superscalar pipelined OOOE CPUs, just as if they executed on a 1-wide in-order non-pipelined CPU, within specified limits (very few CPU ISAs require exact order in committing changes to memory, and x86 has never been one of them).

A higher IPC translating to higher performance is starting to sound more and more like a charade as time goes on. Sooner or later I expect the "megahertz myth" to be replaced by the "IPC myth".
Just one problem: we have benchmarks galore showing otherwise, going back many years. Also, it's a theoretical impossibility, so long as the clock speeds are the same, that higher IPC does not translate into higher performance, because that's literally what higher IPC means.
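The arithmetic behind that last point can be made explicit with hypothetical numbers: at a fixed clock, performance is just IPC times frequency, so an IPC uplift is a performance uplift by definition.

```python
# Performance at a fixed clock: instructions/s = IPC * frequency (Hz).
# The figures below are hypothetical, purely for illustration.
freq_hz = 4.0e9

old_ipc, new_ipc = 2.0, 2.2   # a 10% IPC uplift
old_perf = old_ipc * freq_hz
new_perf = new_ipc * freq_hz

print(round(new_perf / old_perf, 2))  # 1.1 -- a 10% gain at the same clock
```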
 
It is a constant fine-grained process. There is always more IPC that can be extracted. If a pair of instructions takes 5 clock cycles to execute, and your profiler says you will face those instructions often enough to make it worth it, then they will spend a million transistors to shave off 1 clock cycle from that sequence of instructions. If we could stop time, we could profile and optimize through hundreds of iterations. After all that, we would have a CPU that has twice the IPC of current CPUs, even on the exact same process.
 
I'm sure Intel, and to a lesser extent AMD, realize that they can only extract so much instruction-level parallelism, and increasing the IPC after that will no longer provide any meaningful performance gains.

IPC gains can come from wider SIMD extensions or just new ISA extensions in general, pipelining, out-of-order execution, and register renaming.

Out of that list, the most obvious way to increase IPC on future processors is extending the SIMD unit, and I find it highly unlikely that a lot of applications can benefit from a 16-wide execution unit, even for high-performance workloads that primarily run on the CPU. This is a big issue, seeing as a lot of programs that run on the CPU aren't necessarily friendly to vectorization, which leads to minimal or no gains in the end from moving toward a wider SIMD unit.

Pipelining and OoOE do not resolve data dependencies in execution. In other words, a program still needs to prevent race conditions to get deterministic results when an operation depends on a previous output.

Truth be told, I'm not very excited about the IPC gains we'll see from Skylake. My expectations are very low at the moment for the new microarchitecture.

A higher IPC translating to higher performance is starting to sound more and more like a charade as time goes on. Sooner or later I expect the "megahertz myth" to be replaced by the "IPC myth".

I could understand it better if I knew what all those terms meant. Why isn't there a glossary sticky? 😕

Anyway, hopefully AMD will catch up to Intel's level in the IPC department.
 
Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.

It increases performance/clock, and that's what we all mean by IPC, don't we?
 
I'll be releasing a new CPU arch later this year with 100x the IPC of the newest i7.

It will also be running at 20MHz.
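The joke works because IPC alone says nothing; multiply it out. Assuming a purely illustrative i7 baseline of 4 instructions/clock at 4 GHz:

```python
# Hypothetical baseline: an i7 sustaining 4 instructions/clock at 4 GHz.
i7_ipc, i7_freq = 4, 4.0e9
i7_throughput = i7_ipc * i7_freq       # 1.6e10 instructions/s

# The "100x IPC" chip, running at 20 MHz:
new_ipc, new_freq = 100 * i7_ipc, 20e6
new_throughput = new_ipc * new_freq    # 8.0e9 instructions/s

print(new_throughput < i7_throughput)  # True: half the speed of the i7
```

A hundredfold IPC advantage still loses to a two-hundredfold clock deficit.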




It's important to remember that the instructions per clock (IPC) of a CPU can very easily be increased by lengthening the clock cycle and using deeper logic trees, and that higher IPC is not always the best way to increase performance, because it often comes at the expense of clock speed. It's not as simple as just increasing IPC; you have to do it while keeping your cycle times low.

Fortunately, we still have Uncle Moore donating smaller gate delays every 2 years. Even if it seems like clock speeds are stalled, they aren't really. Intel/AMD are just choosing to keep the same clock speeds and increase IPC, which is probably the right choice from a power perspective.
 
Efficient, large IPS (instructions per second) is what consumers want.

80,000 at an arbitrary 50 watts

is better than

100,000 at an arbitrary 100 watts
or
60,000 at an arbitrary 40 watts.
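Plugging the numbers in makes the point: the 50-watt option wins on instructions per second per watt (all figures arbitrary, as in the post above).

```python
# Instructions per second per watt for the three hypothetical chips.
options = {
    "80,000 IPS @ 50 W":   80_000 / 50,
    "100,000 IPS @ 100 W": 100_000 / 100,
    "60,000 IPS @ 40 W":   60_000 / 40,
}
for name, ips_per_watt in options.items():
    print(name, ips_per_watt)
# 80,000 IPS @ 50 W   1600.0  <- best efficiency
# 100,000 IPS @ 100 W 1000.0
# 60,000 IPS @ 40 W   1500.0
```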
 
I personally think they should be coming up with more improved instructions, not just the amount of them to be executed in a second.
 
It increases performance/clock, and that's what we all mean by IPC, don't we?

No. IPC is a technical term with a very specific meaning. Misusing IPC like that is literally as bad as saying that new SIMD extensions let us increase frequency.

If people want to use technical terms, use them correctly. Or don't use them at all.
 
No. IPC is a technical term with a very specific meaning. Misusing IPC like that is literally as bad as saying that new SIMD extensions let us increase frequency.

If people want to use technical terms, use them correctly. Or don't use them at all.

It's also funny to see people compare ARM and x86 "IPC" -- the "I" are not comparable.

What people really mean is "singlethreaded performance per clock."
 
It's also funny to see people compare ARM and x86 "IPC" -- the "I" are not comparable.

What people really mean is "singlethreaded performance per clock."

Yup, it's like comparing a Renault which can do 160 km/h and a Ford which can do 100 mph and concluding that the Renault is obviously superior. Units matter, people!
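Doing the conversion shows why the comparison is meaningless as stated: the two cars are nearly identical once both speeds are in the same unit.

```python
# 1 mile = 1.609344 km, so convert both top speeds to km/h before comparing.
MPH_TO_KPH = 1.609344

renault_kph = 160.0
ford_kph = 100.0 * MPH_TO_KPH   # the Ford's 100 mph in km/h

print(round(ford_kph, 1))       # 160.9 -- the Ford is marginally faster
```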
 
Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.
Well, that's basically what the term IPC has devolved into. It's now a term for per-clock performance. I don't think there's any going back, given its widespread misuse.
Truth be told, I'm not very excited about the IPC gains we'll see from Skylake. My expectations are very low at the moment for the new microarchitecture.
I think everyone's set you straight on the IPC thing, so I'll point out that 14nm desktop processors should see a return in clock scaling.
 
Well, that's basically what the term IPC has devolved into. It's now a term for per-clock performance. I don't think there's any going back, given its widespread misuse.

I think everyone's set you straight on the IPC thing, so I'll point out that 14nm desktop processors should see a return in clock scaling.

That begs the question... should we just let it go or keep educating people...

... 😉 😛
 
That begs the question... should we just let it go or keep educating people...

... 😉 😛

Well, if you have the deep desire to smack *wrong* in someone's face while the intended meaning was still quite easily deducible, then I'd say you should probably leave the teaching job to someone else. But that's just me 🙂.
 
This is technology, not politics. You can't redefine terms to fit your arguments.

It's actually semantics (and democracy). If people don't find IPC useful in its strictest form and the majority wants to use it for an adjacent meaning instead, it will change meaning.
 
Wrong. New instruction sets do not increase Instructions per Clock, it just changes what those instructions are. Haswell's IPC when executing AVX2 code is the same as it is executing SSE4.2 code- in fact it can be even lower, due to the increased demands on the memory subsystem increasing cache stalls. However, the total throughput is massively increased.

Let's not argue about the semantics for a moment here ...


As far as most people are concerned, an operation is roughly equivalent to an instruction, and vector instructions are, in effect, like executing multiple identical scalar instructions.


Arguably, the bigger issue with AVX and AVX2 is the lack of, or inefficient implementations of, vector addressing operations, and that irregular memory access patterns will erode the benefits of an SPMD execution paradigm.
 
Nothing at this low a level has anything at all to do with race conditions, nor any other way a program may be nondetermistic. Determinism is maintained by superscalar pipelined OOOE CPUs, just as if they executed on a 1-wide in-order non-pipelined CPU, within specified limits (very few CPU ISAs require exact order in committing changes to memory, and x86 has never been one of them).

This doesn't even begin to make any sense ...


Your second line here is entirely false, since that issue is handled by memory fences or the hardware, and a lot of programmers describe the x86 memory model as being strong.


Just one problem: we have benchmarks galore showing otherwise, going back many years. Also, it's a theoretical impossibility, so long as the clock speeds are the same, that higher IPC does not translate into higher performance, because that's literally what higher IPC means.


Increasing performance per clock in the past wasn't much of an issue when you consider that we were very far away from hitting the ILP wall, but now that we've practically exhausted every innovation, I don't think increasing the IPC will do much in the long run ...
 
Let's not argue about the semantics for a moment here ...


As far as most people are concerned, an operation is roughly equivalent to an instruction, and vector instructions are, in effect, like executing multiple identical scalar instructions.


Arguably, the bigger issue with AVX and AVX2 is the lack of, or inefficient implementations of, vector addressing operations, and that irregular memory access patterns will erode the benefits of an SPMD execution paradigm.

It is a fundamentally different concept. Improving IPC improves performance on existing code, which new ISA extensions do not.
 