That's a funny statement to make. IPC in isolation is useless: a poorly optimized application might run many useless instructions and that might artificially increase IPC.
Nevertheless, when you run the application you do execute those instructions, so they should be counted as well: they affect performance, and thus the time the application takes to complete its task.
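To put numbers on both points, here is a minimal sketch in Python. The two builds, their counter values, and the 1 GHz clock are all made up for illustration; nothing here was measured. It shows that the IPC counts every retired instruction, useful or not, and that the run time follows from the cycle count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    name: str
    instructions: int  # retired instructions (hypothetical value)
    cycles: int        # core clock cycles (hypothetical value)

    @property
    def ipc(self) -> float:
        # Average retired instructions per cycle over the whole run.
        return self.instructions / self.cycles

    def seconds(self, clock_hz: float) -> float:
        # Wall time, assuming a fixed clock frequency.
        return self.cycles / clock_hz


CLOCK_HZ = 1e9  # assumed 1 GHz clock

lean = Run("lean", 1_000_000_000, 1_000_000_000)
bloated = Run("bloated", 3_000_000_000, 2_000_000_000)

for run in (lean, bloated):
    print(f"{run.name}: IPC={run.ipc:.2f}, time={run.seconds(CLOCK_HZ):.2f}s")
```

With these invented numbers the bloated build has the higher IPC (1.50 against 1.00) and still takes twice as long, which is why the IPC only says something when the program being run is held fixed.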
One place where IPC can be considered useful is when comparing two different CPUs running the same program (or when tuning a micro-architecture).
That too. But it remains a metric for measuring the performance of the application.
With a clear distinction it should be possible to use IPC for 1T on a 2T core.
Where is the definition of IPC that excludes its application to parts of programs, or to different scenarios on an SMT machine? This would just cut that metric's usability. In fact I've read papers showing the actual IPC plotted over time for different applications.
Nobody stops you from doing it: IPC is a generic measure, so you can apply it to portions of an application, as you reported. You only need to measure the number of cycles taken by a certain number of executed (retired) instructions, and that's it.
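As a rough sketch of what "IPC over a portion" means in practice: assume you have periodic readings of two cumulative counters, cycles and retired instructions (the sample values below are invented, and how you read the counters is left out). The IPC of each interval is just the ratio of the two deltas.

```python
def interval_ipc(samples):
    """Per-interval IPC from cumulative (cycles, instructions) readings."""
    result = []
    for (c0, i0), (c1, i1) in zip(samples, samples[1:]):
        delta_cycles = c1 - c0
        if delta_cycles <= 0:
            raise ValueError("cycle counter must increase between samples")
        result.append((i1 - i0) / delta_cycles)
    return result


# Hypothetical cumulative readings: (cycles, retired instructions).
samples = [(0, 0), (1000, 1500), (2000, 1900), (3000, 4300)]

print(interval_ipc(samples))  # [1.5, 0.4, 2.4]
```

Plotting that list against time gives the kind of IPC-over-time graph mentioned above; the whole-run figure is still total instructions over total cycles (4300 / 3000 here).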
If the quote above is a definition of sorts, then I'm an interstellar rocket.
In fact it isn't a definition. The definition was in the link that I gave:
"Instructions Retired per Cycle, or IPC shows average number of retired instructions per cycle."
I quoted the other sentence to show that the IPC is about measuring the performance of an application.
Based on your post, would you agree that IPC is utterly useless in describing ST performance of a SMT capable CPU core?
Maybe because IPC is related to the overall performance of an application, and not to theoretical numbers which mean nothing by themselves?
As I reported above, the IPC is... just the average number of instructions per cycle achieved while... running an application. Whether the application is ST or MT doesn't matter when looking at the IPC.
It's clear that, in the MT case on an SMT-capable core, the IPC is affected by the contributions of both hardware threads, which run concurrently and share the available resources.
You got me to check the literature.
And there I found this:
Which shows that IPC isn't related to single-threaded (because it's in contrast with "the case for single-threaded workloads").
BTW, IPC as the reciprocal of CPI (part of the "Iron Law of Performance") is described in the chapter "Single-threaded Workloads".
The fact that the IPC is described in a chapter with such a name doesn't mean that the IPC is a single-thread measure. That's simple logic.
In fact, the sentence that you quoted from the text states the exact opposite.
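For reference, the "Iron Law of Performance" mentioned above is usually written as below. The symbols are my own labels, not taken from the cited text: T is execution time, N the number of executed instructions, t_clk the clock period and f = 1/t_clk the clock frequency.

```latex
T = N \times \mathrm{CPI} \times t_{\mathrm{clk}}
  = \frac{N}{\mathrm{IPC} \times f},
\qquad \mathrm{IPC} = \frac{1}{\mathrm{CPI}}
```

Nothing in the formula says how many threads produced the N instructions; it only relates instructions, cycles and time.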
EDIT: I found the cited paper leading to this remark.
Source: A. R. Alameldeen (Intel) and D. A. Wood (University of Wisconsin-Madison). IPC considered harmful for multiprocessor workloads. IEEE Micro, 26(4):8–17, July 2006.
EDIT#2: IBM researchers distinguished between single-threaded IPC and SMT IPC here
http://pharm.ece.wisc.edu/wddd/2002/final/squillante.pdf
From p.5:
"The primary performance measures presented are the average number of instructions executed per cycle (IPC) and the miss ratios of all caches. We compute IPC and other statistics for SMT simulations and compare to singlethreaded performance as follows."
Here IPC isn't tied to ST performance, since that result is compared with ST performance.
It's also not true that the researchers distinguish between ST-IPC and SMT-IPC: this is just the artificial work of splitting the instruction execution that they did to have numbers for both ST and MT while running an SMT application.
Again from p.5:
"In SMT mode our simulator halts when one trace runs out of instructions. Thus our SMT measurements reflect only multi-threaded performance. We record the position in the trace of the second thread, the one that did not run to completion, and extract the statistics for that initial portion of the trace from a single threaded run. The combined single-threaded IPC of the two traces is then computed by adding the total number of instructions executed and dividing by the total number of cycles on the two single-threaded runs (one complete and one partial). Thus we are comparing single-threaded and multi-threaded performance on exactly the same set of instructions."
From p.7:
"Selfishness is the relative speed of an application when running in SMT mode as measured by its IPC as a percentage of its single-threaded IPC."
Here an MT application's IPC is, again, related to... SMT. Which is obvious.
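The two computations quoted from p.5 and p.7 are simple enough to sketch. This is only my reading of the quoted text, with invented counter values and hypothetical function names: the combined single-threaded IPC sums instructions and cycles over the two ST runs before dividing, and selfishness is an application's SMT-mode IPC as a percentage of its ST IPC.

```python
def combined_ipc(runs):
    """Total instructions over total cycles for a list of (instructions, cycles)."""
    total_instructions = sum(instr for instr, _ in runs)
    total_cycles = sum(cycles for _, cycles in runs)
    return total_instructions / total_cycles


def selfishness(smt_ipc, st_ipc):
    """SMT-mode IPC of one application as a percentage of its ST IPC."""
    return 100.0 * smt_ipc / st_ipc


# Hypothetical ST runs: one complete trace, one partial trace.
st_runs = [(1_000_000, 800_000), (600_000, 1_200_000)]
print(combined_ipc(st_runs))     # 0.8

# Hypothetical per-application IPC values.
print(selfishness(0.75, 1.5))    # 50.0
```

Note that the combined figure (0.8) is not the plain mean of the two per-run IPCs (1.25 and 0.5, mean 0.875): summing first weights each run by the cycles it took.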
From p.9:
"A partial explanation for this difference may be their 8-thread SMT vs our 2-thread: IPC results were only given for 8 threads, and miss rates for 8 threads show a much greater difference between COLOR and BINHOP than for 2 threads."
Why weren't IPC results given for fewer threads? 8 threads were available, and, guess what... they reported the results with all 8.
With the IPC definition which is:
"Average number of useful instructions executed per cycle"
Wow! Incredible.
And another interesting thing on p.10:
"We prefer the use of two metrics, one for fairness and one for throughput (IPC)."
I was merely attempting a little reduction to absurdity, but I guess more solid conventional knowledge will also do the trick, with less OT to boot.
The conventional knowledge doesn't seem to agree with you. See above what I've quoted from the same sources.
Now I'm waiting for your reduction to absurdity, but preferably without just empty words, eh!