
Two misconceptions about IPC and GPUs

renderstate

Senior member
The term IPC (Instructions Per Cycle) is often misunderstood and misused when applied to GPUs.

It originated in the CPU world, where it indicates the average number of instructions executed per clock, per core, for a given workload. While the complexity of a modern CPU architecture cannot be captured by a single number, IPC remains a succinct and fairly intuitive way to track how capable a core is. It's not always very useful for comparing cores, especially if they don't use the same ISA.
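As a toy illustration (the counts below are invented, not measured from any real chip), IPC for a workload is simply retired instructions divided by elapsed core cycles:

```python
# IPC = instructions retired / core cycles elapsed, for one core.
# The counts are invented for illustration; on real hardware they
# would come from performance counters read by a profiler.
instructions_retired = 3_200_000_000
core_cycles = 2_000_000_000

ipc = instructions_retired / core_cycles
print(f"IPC = {ipc:.2f}")  # IPC = 1.60
```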

The first problem with applying it to GPUs comes from the fact that, unlike in CPUs, there are many other (often independent) units that work together to keep the GPU busy. A typical example is shadow map rendering, where the GPU cores are idle most of the time while other subsystems, such as rasterizers and ROPs, can be fully loaded. In such a scenario IPC tells us close to nothing about our GPU.

To make things worse, a decreased IPC in a new architecture might signal an improvement of the cores while the rest of the system has not been improved. In the aforementioned shadow map rendering case the cores could be idling even more, but it would be an error to consider this an issue or a sign of a poorer-performing product. This is the first misconception about IPC, and it would be preferable to come up with a new GPU-only metric of work done per unit time. That goes beyond the scope of this post, so from now on I'll assume that IPC applied to GPUs is such a metric. I suspect most are already using the term IPC this way anyway, without giving it much thought, although it is important to understand what it means and where it comes from.

The second and far worse misconception is about IPC and GPU frequency. I see a lot of posts where IPC is computed as GPU performance per clock in a given workload (the first misconception; that's acceptable, no big deal) and compared across different architectures running at completely different frequencies. Invariably the higher-clocked GPU shows lower "IPC" and people boldly claim "IPC went down.. This is bad, company X sucks, etc". *Too bad this comparison is completely and utterly meaningless, because IPC is almost always inversely proportional to frequency.* This is really straightforward to understand: memories don't scale up like cores do, so as we increase core frequency the likelihood of our cores (and other GPU units) starving for data and stalling increases (ergo IPC goes down).

Let me repeat it: IPC is a function of frequency. If frequency goes up, IPC will likely go down, especially at higher frequencies. If you think a GPU architecture is worse because, in this context, IPC goes down, you are swapping cause and effect and you are sooo wrong 🙂
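The mechanism can be sketched with a toy model (every number below, the issue width, the miss rate, and the memory latency, is made up for illustration and belongs to no real GPU): memory latency is fixed in nanoseconds, so at a higher core clock the same miss costs more core cycles, and effective IPC falls.

```python
# Toy model: a core that can issue `width` instructions per cycle but
# stalls on one memory access every `insts_per_miss` instructions.
# Memory latency is fixed in nanoseconds, so raising the core clock
# turns the same latency into more stall cycles.

def effective_ipc(freq_ghz, width=4, insts_per_miss=100, mem_latency_ns=80):
    compute_cycles = insts_per_miss / width      # cycles spent doing useful work
    stall_cycles = mem_latency_ns * freq_ghz     # same ns cost, more cycles at high clocks
    return insts_per_miss / (compute_cycles + stall_cycles)

for f in (1.0, 1.5, 2.0):
    print(f"{f:.1f} GHz -> effective IPC ~ {effective_ipc(f):.2f}")
```

With these made-up numbers the printout falls from about 0.95 at 1 GHz to about 0.54 at 2 GHz. The model also anticipates an objection raised later in the thread: as frequency approaches zero, IPC approaches the issue width (4 here), not infinity, so the relationship is a monotonic decrease rather than strict inverse proportionality.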

Of course GPU architects can modify a design to lower the cores' IPC in order to scale up frequency, but you can't prove this happened by testing at different frequencies with different memory types and memory bandwidths. To demonstrate that IPC was lowered "on purpose", you ideally have to compare the different GPU cores running the same workload, at the same frequency, with the same memory bandwidth (and memory type too). This is not always possible.

I don't think what I wrote here is going to change the way people misuse IPC all the time but when it happens I (and you) can point them to this post 🙂
 
So basically Pascal made sacrifices for higher clock speeds, which everyone else acknowledged and was generally okay with because performance is performance regardless of how you earn it, but there's now a way to rationalize it so it's a good thing?

No one except hardcore AMD fans looking to rib Nvidia fans really cared. What was annoying to everyone was the Nvidia fans who couldn't bear to admit the truth.
 
this, i have already said many times, ipc regression or even the same ipc isn't a bad thing as long as you are increasing #cores and/or frequency. the end result is what matters.


and to op, typing IPC is way easier than typing performance per teraflop or per-core performance etc.
 

Like I said, no one really cared except fans of both camps. AMD fans were just looking for some mud to sling, and Nvidia fans would rather admit to being closet pedophiles than admit Nvidia did anything that could be construed as bad in some manner.

Now that the proper mental contortions have been done, we can all just put it to rest.
 
hahahahaa 😀
 

😀
 
Great post renderstate!


Thanks, but obviously the usual suspects haven't understood a word of it. Perhaps they are pretending not to understand it, but either way they haven't wasted a moment before twisting what I said in their favor. Frankly, I am not even surprised anymore by people who choose to live their lives like that.

The open-minded ones are welcome to discuss and disagree with what I wrote, of course, but I am not wasting time with trolls anymore.
 
this, i have already said many times, ipc regression or even the same ipc isn't a bad thing as long as you are increasing #cores and/or frequency. the end result is what matters.
Agreed, as long as we also factor in power consumption, and the cooler by extension.

While Pascal does very well with lean power requirements, the reference cooler is relatively poor given it has so little heat to dissipate.
 
People get so hung up on paper specs, it's laughably ridiculous. If megahertz gets your e-peen going strong, then focus only on that. I personally like to see that FPS bar on review graphs. It doesn't matter to me how big the die, how high the transistor count, the IPC, or how wide the bus. All that matters in the end is how many FPS it produces.

If people are hung up over whether or not Pascal has improved IPC over Maxwell, those people are wasting their own time.
 
The only thing that really matters is the end result: performance, and to a lesser extent power consumption. If a higher clock speed product overcomes lower IPC and has similar performance and power consumption to a higher IPC, lower clock speed product, it doesn't matter. There could be some issue for future products based on that choice, like some kind of hard clock speed ceiling (something AMD's CPUs currently suffer from), but until that happens it is a non-issue.
 
so a good, objective, game-independent measure of a GPU is purely flops? or flops/W?
maybe the old pixel fill rate or triangles per second?

we can't be reduced to comparing GPUs solely on in-game performance, because some games run on proprietary software that performs better/worse on one company's GPUs.
 
Decent post @ op, but you lost me at
... IPC is almost always inversely proportional to frequency ...
That is not true. Inverse proportionality would imply that you have infinite IPC at 0 Hz.

There is a drop-off in IPC if your GPU is starved for memory bandwidth, but there is no linear correlation, because modern GPUs have a decent amount of L1/L2 cache, among other factors.

Also, measuring two architectures at equal frequency may give you a value for IPC, but it doesn't actually represent the performance you see when using the chip. The decrease in IPC at higher clocks is not something that you as a customer can compensate for in any meaningful way; it is part of the product that you buy.
 
Bulldozer did some serious damage. I guess that since Bulldozer, people associate minimal-to-zero IPC gains with bad products.
 
so a good, objective, game-independent measure of a GPU is purely flops? or flops/W?
maybe the old pixel fill rate or triangles per second?

we can't be reduced to comparing GPUs solely on in-game performance, because some games run on proprietary software that performs better/worse on one company's GPUs.

You still need a lot of pixel fill rate and triangle throughput for modern games.
If you want a GPU for gaming, comparing GPUs solely on performance in games is the only way to go.
If you need GPGPU, then sure, go for flops or whatever else rocks your boat.

Nvidia saw that games now tend to use more shader cores, which don't each need to be as fast as before, so they tuned their new cards accordingly, and you can still o/c to push the cores higher if you need to.
 
Decent post @ op, but you lost me at



That is not true. Inverse proportionality would imply that you have infinite IPC at 0 Hz.
The IPC curve is not going to be inversely proportional to frequency at low frequencies, where the effect of memory is negligible. That's pretty straightforward. It's also a completely uninteresting case.



There is a drop-off in IPC if your GPU is starved for memory bandwidth, but there is no linear correlation, because modern GPUs have a decent amount of L1/L2 cache, among other factors.
Agreed, and no one wrote that there is a linear correlation.


Also, measuring two architectures at equal frequency may give you a value for IPC, but it doesn't actually represent the performance you see when using the chip. The decrease in IPC at higher clocks is not something that you as a customer can compensate for in any meaningful way; it is part of the product that you buy.

I agree, but I wasn't trying to say we should compensate for different frequency designs. My point was simply to highlight that people use IPC incorrectly on this forum.
 
so a good, objective, game-independent measure of a GPU is purely flops? or flops/W?
maybe the old pixel fill rate or triangles per second?

we can't be reduced to comparing GPUs solely on in-game performance, because some games run on proprietary software that performs better/worse on one company's GPUs.


All that matters is the final experience. A wide benchmark suite made of lots of games gives a much better overall picture than synthetic tests. Synthetic tests are still interesting for measuring architectural details.
 
People get so hung up on paper specs, it's laughably ridiculous. If megahertz gets your e-peen going strong, then focus only on that. I personally like to see that FPS bar on review graphs. It doesn't matter to me how big the die, how high the transistor count, the IPC, or how wide the bus. All that matters in the end is how many FPS it produces.

If people are hung up over whether or not Pascal has improved IPC over Maxwell, those people are wasting their own time.

:thumbsup:
 
so a good, objective, game-independent measure of a GPU is purely flops? or flops/W?
maybe the old pixel fill rate or triangles per second?

we can't be reduced to comparing GPUs solely on in-game performance, because some games run on proprietary software that performs better/worse on one company's GPUs.

Yep, I agree about the proprietary part. Although I am not an expert in microprocessors, I would still like to understand where my GPU hardware's performance comes from, rather than accepting it as some kind of black box that can be easily manipulated through proprietary software. I am sure that from a marketing perspective it's easier to just keep pitching "performance", but digging deeper may give us a clue about the longevity of the "performance" being rented. I would prefer that hardware companies stay out of meddling with "performance" through proprietary APIs and drivers, but that's a different discussion... Lol
 
Pascal scales worse than Maxwell for a similar MHz bump in OC. Pascal OCs worse than Maxwell.

The whole point people were trying to make is that this isn't the same as the Maxwell generation, where Nvidia's OC performance was honestly glorious. OCing doesn't take you as far as it used to, going from Maxwell to Pascal.
 
Am I missing something? Why should IPC change with frequency? It's called "instructions per clock" for a reason.

Yes, you have bottlenecks and underutilisation problems, but that's why IPC is normally measured over a range of frequencies.
 