[Tom's] Normalized single-core CPU performance at 3GHz


magomago

Lifer
Sep 28, 2002
10,973
14
76
It is not relevant to any purchasing decisions any human should make.

When people talk about how we've hit diminishing returns in per-core IPC, this just shows that we indeed have, and how much designing CPUs with more cores working together in mind affects performance, today.

Thank you.


As the Athlon vs. Phenom example shows perfectly well, most of those CPUs aren't hampered by too little cache. A smaller, faster cache may get you better results here, while for multi-threaded programs this may be different, but for most programs the difference will be small enough - and there's not much one can do about that anyway.

Also, they list the times for every single program, so I don't see your problem. Sure, adding the times together has some obvious flaws, but it's not as if you didn't have the raw numbers to normalize them and make a more useful summary - and it's not as if there were any surprising outlier that would seriously distort the data.
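To make the normalization point concrete, here is a minimal sketch of the difference between adding times up and normalizing per benchmark first. The numbers and the names `summed_speedup` and `geomean_speedup` are made up for illustration; they are not Tom's data or method.

```python
from math import prod

# Hypothetical per-benchmark run times in seconds (lower is better).
# These numbers are invented for illustration; they are not Tom's results.
times = {
    "cpu_a": {"zip": 100.0, "encode": 400.0, "encrypt": 10.0},
    "cpu_b": {"zip": 80.0, "encode": 420.0, "encrypt": 5.0},
}


def summed_speedup(results, baseline):
    """Speedup implied by comparing total run times; long benchmarks dominate."""
    return sum(baseline.values()) / sum(results.values())


def geomean_speedup(results, baseline):
    """Geometric mean of per-benchmark speedups; every benchmark counts equally."""
    ratios = [baseline[name] / results[name] for name in baseline]
    return prod(ratios) ** (1.0 / len(ratios))


by_sum = summed_speedup(times["cpu_b"], times["cpu_a"])
by_geomean = geomean_speedup(times["cpu_b"], times["cpu_a"])
print(f"summed times: {by_sum:.2f}x, geometric mean: {by_geomean:.2f}x")
```

With these invented numbers, cpu_b looks about 1% faster by summed time but roughly a third faster by geometric mean, because the long encode run swamps the other two in the sum. Whether the real data behaves like that is exactly what the raw per-program numbers would tell you.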

And why is IPC important? Well, because apart from some classes of problems (encoding, raytracing, ... - basically anything for which CUDA programs exist by now :p ) many, many algorithms don't scale well to more than maybe a dozen or two threads. Not to forget that even for perfectly parallel problems, there are many real-world reasons why you still need a good baseline performance and not hundreds of extremely weak cores. There are enough papers from Google and MS on that topic. Also, neither AMD nor Intel has any idea how to scale beyond a few dozen cores at best with their current architectures, which will be a much more interesting problem than who has two cores more or less right now (the advance of NUMA for PCs? That'll give developers some headaches).

Also if you look at the usual game you'll see that even for only four cores the work isn't equally distributed and that'll only get worse with more and more cores.


They do? In all tests I looked at the difference is minimal and easily below the margin of error.


See this is where I still disagree.

Let us assume there is a worthy endeavor in measuring single core performance.

Explain how this test does a good job of that. How does measuring the times of a bunch of synthetic tests, and adding them up, give us actual insight into the performance? Of course you'll get something out of it - new processors have faster single core performance...but trying to draw anything more than that vague conclusion is fraught with problems because the data isn't meaningful in any way. How should we interpret something that is X seconds faster than another? There isn't any real academic endeavor in this because it would be ripped to shreds by any analysis of what these pooled values mean.

I didn't say IPC was important, I didn't say that everything was a highly threaded environment at all...hell by your own admission you can make big gains up to "a dozen or two threads" (which I have no idea honestly).

All I'm saying is this: the metrics mean little to nothing. Even if there are no outliers or anything else that skews the results, it is still pretty meaningless.

A better test would be to see how each computationally intensive program performs, and to think through the logic of each program used for the test and what it stresses, in order to explain the performance differences...and then I'd take a step back and think about whether it made more sense to really just test the entire CPU rather than a crippled piece of the whole pie. Sometimes it makes more sense to do system level testing...and in this sense, taking the whole processor as the system ultimately does make more sense.

Anyways we'll probably agree to disagree, so cheerio!
 

lol123

Member
May 18, 2011
162
0
0
That makes no sense at all. The fact is games use more than one core and many can completely max two cores and beyond. I don't care how much IPC you have, a single core of any CPU is not enough to run most modern games properly.
I hate to be rude, but it would make sense to you if you knew more about game programming. As I said, games now make use of more than one thread (mostly for features that have been laid on top of how game engines used to be made, like advanced physics) but the performance of the application is still limited by how fast the main thread can be executed. That means that a 4 core processor running at 3.2 GHz will be faster for gaming than a 16 core processor of the same architecture running at 1.6 GHz, and it will probably always remain so.
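For anyone who wants to play with the numbers, here is a rough Amdahl's law sketch of that comparison. It assumes performance scales linearly with clock speed and that a fixed fraction of the work (`parallel_fraction`, a made-up parameter) splits perfectly across cores. Real games are messier than this, so treat it as a toy model and not a benchmark.

```python
def throughput(clock_ghz, cores, parallel_fraction):
    """Relative throughput under Amdahl's law, scaled by clock speed.

    Assumes the serial part runs on one core and the parallel part is
    divided evenly across all cores with no overhead.
    """
    serial_fraction = 1.0 - parallel_fraction
    return clock_ghz / (serial_fraction + parallel_fraction / cores)


for p in (0.5, 0.8, 0.95):
    few_fast = throughput(3.2, 4, p)
    many_slow = throughput(1.6, 16, p)
    winner = "4 x 3.2 GHz" if few_fast > many_slow else "16 x 1.6 GHz"
    print(f"parallel fraction {p:.2f}: {few_fast:.2f} vs {many_slow:.2f} -> {winner}")
```

In this model the 16 slow cores only pull ahead once more than 8/9 (about 89%) of the work is perfectly parallel, which is a much higher bar than a main thread plus a few helper threads can clear.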
 

mhahnheuser

Member
Dec 25, 2005
81
0
0
...IMO it is not possible to compare previous architectures as the software is developed to take advantage of the hardware features provided by the manufacturers. Little point in looking at single core throughput and very misleading as there could be very negative impact on shared cache memory.

Core 2's cache is fully shared between the two cores, so one core has access to the same cache as when both cores are utilised, while the AMD CPUs would have a core disabled and no access to the cache belonging to the disabled core.

The real point has been missed...and that is that the Phenom/Athlon 64s, in all variants, already have the on-board memory controller and true cores, and handle 64-bit instructions. As software now goes in this direction, the AMD stable will be able to benefit from software written for the i-series, SNB and IB, whereas P4 and Core 2 will simply get slower and hit dust bins at an increasingly faster rate.

...certainly explains the uber-excitement of Intel for SNB...as for me...I am running out of excuses to upgrade, so I hope that Bulldozer doesn't force my hand by being too superior to Deneb/Thuban....but of course if it offers advancements and additional refinements over and above these processors, and not simply speed, I guess I'll be lining up again.

Remember this...Intel is the reason for Win 7.0, AMD was ready and awaiting Vista 64.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
I said, games now make use of more than one thread (mostly for features that have been laid on top of how game engines used to be made, like advanced physics) but the performance of the application is still limited by how fast the main thread can be executed.

That's where I was going, that's why I mentioned three heavily utilized threads. Which went right over some heads.

Sure, the main thread is heavily utilized, and maybe an AI thread or a physics thread (which would be frame rate limited by the main thread anyway). But I don't see sound, mouse / keyboard polling etc. being CPU intensive threads.
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
How does measuring the times of a bunch of synthetic tests, and adding them up, give us actual insight into the performance?
Huh? How is compressing a file a synthetic benchmark? Video/audio encoding? Encryption? Transcoding? The pure synthetic benchmarks are quite in the minority and can safely be ignored if you think they don't add anything valuable (which I quite agree with).

Sure there are some things that skew the results a bit, but they can be accounted for and nobody is saying that those tests have to be 100% correct. They just give strong indicators about the IPCs of the different µarchs and that is important - or does anyone here think we'll see games that fully utilize 6 or even 8 cores in the near future? Note that I'm saying fully utilize - there's a big difference between having 8 threads and those 8 threads doing approximately the same amount of work.
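A quick way to put a number on "having 8 threads" versus "8 threads doing the same amount of work": if a frame is only done when the busiest thread is done, the speedup over one core is total work divided by the largest single share. The per-thread loads below are hypothetical, and the model ignores synchronization costs entirely.

```python
def effective_speedup(work_per_thread):
    """Speedup over a single core when the busiest thread sets the pace.

    Assumes one thread per core and that all threads must finish before
    the next frame can start.
    """
    return sum(work_per_thread) / max(work_per_thread)


# Hypothetical per-thread work shares, in arbitrary units.
lopsided_game = [1.0, 0.4, 0.3, 0.15]   # one heavy main thread plus helpers
balanced_encoder = [1.0] * 8            # eight equal worker threads

print(f"lopsided, 4 threads: {effective_speedup(lopsided_game):.2f}x")
print(f"balanced, 8 threads: {effective_speedup(balanced_encoder):.2f}x")
```

The lopsided case gets well under 2x out of four cores, while the balanced case gets the full 8x, which is the gap between using cores and fully utilizing them.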

mhahnheuser said:
Core 2's cache is fully shared between the two cores, so one core has access to the same cache as when both cores are utilised, while the AMD CPUs would have a core disabled and no access to the cache belonging to the disabled core.
Uh what? The L1 cache of Core 2s isn't shared between cores, and the L3 cache of Phenoms is shared between all cores as well (and therefore resembles the L2 cache of the Core 2s much more). Sure, the Phenom's cache behavior was optimized for its 4 or 6 cores, so that skews the results a bit (a smaller, faster cache has its advantages in this case) - but then the same is true for the newer Intel architectures and it doesn't seem to matter much.
 

Termie

Diamond Member
Aug 17, 2005
7,949
48
91
www.techbuyersguru.com
I thought I'd lighten up this discussion with a brief musical interlude courtesy of Weird Al: It's All About the Pentiums - http://www.youtube.com/watch?v=qpMvS1Q1sos

A time machine back to a day when the Pentium was king. Weird Al nails it, as usual...I wonder if he rolls his own.

Best lines:

"I got me a hundred gigabytes of RAM
I never feed trolls and I don't read spam"

(he was way ahead of his time with that RAM!)

"Upgrade my system at least twice a day
I'm strictly plug-and-play, I ain't afraid of Y2K"

"You're usin' a 286? Don't make me laugh
Your Windows boots up in what, a day and a half?
You could back up your whole hard drive on a floppy diskette
You're the biggest joke on the Internet"

And best of all:

"My new computer's got the clocks, it rocks
But it was obsolete before I opened the box
You say you've had your desktop for over a week?
Throw that junk away, man, it's an antique"

An anthem for generations of AnandTech forum-goers.
 

Lorne

Senior member
Feb 5, 2001
874
1
76
I couldn't find the page that explains what they did to zero out the bias from memory differences and from the bus between memory and the built-in MMU, not to mention how they compensated for the loss of data bandwidth on the P4 through the Core 2, since with the NB, memory and gfx share the bus, therefore skewing actual core results.

Did they actually set all buses down to the lowest common speed, and do the same with memory?
 

jaguare

Junior Member
Jul 5, 2011
14
0
0
Tbh that's not really true anymore...technology has sloooowed down.

I still got my Core 2 Duo and it's just fine for me
 

Termie

Diamond Member
Aug 17, 2005
7,949
48
91
www.techbuyersguru.com
I couldn't find the page that explains what they did to zero out the bias from memory differences and from the bus between memory and the built-in MMU, not to mention how they compensated for the loss of data bandwidth on the P4 through the Core 2, since with the NB, memory and gfx share the bus, therefore skewing actual core results.

Did they actually set all buses down to the lowest common speed, and do the same with memory?

On Core 2 at least, I'm sure all they did was drop the multiplier to get to 3.0 GHz. I assume they could do the same on P4 (although mine had a locked Dell BIOS, so I really wouldn't know).
 

Lorne

Senior member
Feb 5, 2001
874
1
76
It's a bogus test. Too many variables for a clock-per-clock test. It's like comparing apples, oranges and peaches, with all the motherboard differences.
 

podspi

Golden Member
Jan 11, 2011
1,965
71
91
It's a bogus test. Too many variables for a clock-per-clock test. It's like comparing apples, oranges and peaches, with all the motherboard differences.

I don't disagree, but to get an idea of overall performance difference, I think it is interesting and useful. Kudos to Tom's for doing it. :thumbsup:
 

Makaveli

Diamond Member
Feb 8, 2002
4,718
1,054
136
If it's unquantifiable, then it falls under the heading of "faith". So you AMD true believers get together, and keep the faith alive.

I agree Larry,

This may have been true in the K8 days when AMD was first to have an IMC.

My 939 system is now my HTPC, so I've felt this smoothness you speak of, but that was all lost when Nehalem came out, which also has an improved IMC.
 

Concillian

Diamond Member
May 26, 2004
3,751
8
81
LMAO! All games from 2009 to now make use of at least three threads and enterprise applications are almost always the first to take advantage of multi-threading since they'll get workloads done quicker, and more efficiently.

Have you been living under a rock or what? Single-threaded applications are now pretty much limited to only audio encoding.

Here is a quote I've kept around for explaining this to people. This is from a curious newbie:

I am running a i5-750 processor and when I'm playing games such as BF:BC2 or Dragon Age i observe my windows 7 CPU Usage gauge and cores 1-3 are all between 15-40% usage fluctuating back and forth, but core 4 is always hovering around 95-100% usage. is this normal or do i appear to have a faulty processor?

Even in games that use multiple cores, the performance of ONE core is the primary CPU limitation. Single-threaded performance is still very important in gaming, even in multithreaded games.

You ever look at your core usage while gaming? one core pegged, other threads are generally easily handled with half a core.

It would be interesting if you could overclock just one core of a 2500k to see how much different gaming benchmarks would look with 3 cores stock and one core at 4.5 GHz vs. all 4 at 4.5 GHz.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
If it's unquantifiable, then it falls under the heading of "faith". So you AMD true believers get together, and keep the faith alive.
Not necessarily. See BFS flamefests, and some mysterious events after (miraculously, some long-standing freezing bugs, which affected one of my PCs, got fixed in the mainline scheduler, FI, after years of feet-dragging :)), for recent examples.

Some things are difficult to quantitatively measure, and user responsiveness is one of those things. One major problem with measuring it is that you can't tie yourself down to single variables. MS has specific tweaks for classes of CPUs in Windows, too, so it is quite possible that there could be different behaviors between CPUs (IE, if said differences exist, I'd bet it's not the hardware). I can't say that I've noticed any such thing, though, as changing OSes, and in Linux, kernel versions and variants, far overshadow any minor differences there may be, or have been. Just playing devil's advocate, though, because I know from experience that it can happen.

You ever look at your core usage while gaming? one core pegged, other threads are generally easily handled with half a core.
With only two, I commonly see both pegged :\.

Even so, 15-40% for each other core would still need in excess of a 50% boost from a single core to match adding cores, assuming those threads' CPU times can be correlated to performance gains. Currently, turbo will not get close to that sort of gain, so more cores still makes sense.
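Spelling that arithmetic out as a small sketch, using the usage figures from the quoted i5-750 post (one core pegged, three others at 15-40%) and assuming CPU time maps directly to work. `required_boost` is just a name picked for this example.

```python
def required_boost(other_core_loads):
    """Fractional speedup one core needs to absorb work running on other cores.

    Assumes the pegged core is at 100% and that CPU time translates
    directly into work, with no overhead from merging the threads.
    """
    total_work = 1.0 + sum(other_core_loads)  # in units of one fully busy core
    return total_work - 1.0


low = required_boost([0.15] * 3)    # three helper cores at 15% each
high = required_boost([0.40] * 3)   # three helper cores at 40% each
print(f"needed single-core boost: {low:.0%} to {high:.0%}")
```

Under those assumptions the single core would have to be somewhere between about 45% and 120% faster, which is roughly consistent with "in excess of 50%" for all but the very lowest readings.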

When we are at the point of regularly using 8+ cores, they won't be at 100%. Just like using wide-issue CPUs to get ~1 typical IPC, the benefit is and will be having free resources to use while others are busy, and linear performance scaling, or even remotely close, is going to remain very rare. Once software catches up, and performance increases are much better (see Dirt 3, as a recent interesting example), it will plateau, and then we'll be back to squeezing out minor gains bit by bit, again (the advantage will be that programs not written for performance first will be able to take advantage of many cores easier than today, just as it's easier today than it was 5 years ago).

It would be interesting if you could overclock just one core of a 2500k to see how much different gaming benchmarks would look with 3 cores stock and one core at 4.5 GHz vs. all 4 at 4.5 GHz.
Yes, it would.