Why are more slower cores better than one fast core?


cytg111

Lifer
Mar 17, 2008
25,652
15,155
136
...
Now if you want me to throw a wrench in the works... let's talk about Hyper-Threading. :p
Because those aren't real cores, even though they act like real cores.

- Right!! How the hell is a scheduler supposed to know which is which? This is a real core, this one is not, this one is but is being hyperthreaded from another angle so expect lesser performance? I think I would be very confused if I were a scheduler :)
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
- Right!! How the hell is a scheduler supposed to know which is which? This is a real core, this one is not, this one is but is being hyperthreaded from another angle so expect lesser performance? I think I would be very confused if I were a scheduler :)

The CPU reports its topology to the OS (on x86 through CPUID), and the scheduler uses that to prefer putting threads on separate physical cores before it puts them on the same physical core. I don't think it's that elaborate.
 

2is

Diamond Member
Apr 8, 2012
4,281
131
106
But then you have the problem that an i7 is not just a quad core and an FX-8350 is not quite an octa core, making it hardly a cut and dry comparison.

i7 is indeed just a quad core. Hyperthreading does not change the number of cores.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
i7 is indeed just a quad core. Hyperthreading does not change the number of cores.

That's just semantics. Hyper threading changes the performance of running more threads. It's not as large as the performance boost you get from traditional separate physical cores. Neither is the CMT on Piledriver, which is also not like traditional separate physical cores.

If you want to compare them you have to take that into consideration and not just what their respective marketing calls a core.
 

2is

Diamond Member
Apr 8, 2012
4,281
131
106
I'm not referring to marketing, I'm referring to the processor's physical nature. i7 is a quad core. Not due to semantics or marketing, but physics.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Arguing over the definition of core (which is what you're doing) is semantics. Arguing over the definition of anything is semantics. Calling it physics sounds pretty strange to me.

You could spend all day arguing over when something stops being one core and starts being two cores. Maybe it's two cores as soon as you have separate register files. Maybe when you have separate L1 dcaches, separate ALUs, separate schedulers, separate decoders, separate fetch, etc. Or maybe they don't qualify as separate cores unless their entire cache hierarchy is separate, in which case neither AMD nor Intel has truly separate cores at all.

But this argument isn't that productive. What matters is how heavily multithreaded software runs.
 

2is

Diamond Member
Apr 8, 2012
4,281
131
106
[Image: Intel Core i7 block diagram]


You can call the argument anything you want, but saying i7 isn't really a quad core is nothing short of wrong.

Threads != Cores
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
You can call the argument anything you want, but saying i7 isn't really a quad core is nothing short of wrong.

Threads != Cores

It's still semantics. Neither Intel nor AMD have authority over what the word core means. I wouldn't call an i7 octa core, but ONLY saying that it's quad core leaves out important information when it comes to its multiprocessing capabilities. Same as ONLY calling an FX-8350 octa core, which AMD is perfectly willing to do but not everyone agrees with. Like I said already, I don't really care what people decide to call it, arguing the definition of a core is a waste of time.
 

2is

Diamond Member
Apr 8, 2012
4,281
131
106
I think AMD blurs the lines a bit, which is why I'm not debating Exophase's remarks on that front. But I don't see how HT, which as I understand it is essentially a way of letting unused portions of a core work on another thread where applicable, makes an i7 "not really" a quad core. It allows more work to be completed per core, but the number of physical cores in an i7 is no different than in an i5 (not counting Intel's temporary lapse in judgement when they introduced dual-core i5s a while back).
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Even HT requires some duplication of resources beyond what a non-MT core can handle. The point is, if you have no definition for how little or how much duplication is required then you can't say what is and isn't a core. HT blurs the line just like CMT does, even if it doesn't blur it as much. The ideas for CMT actually started as a progression from HT.

What I said was that HT makes it not JUST a quad core, as opposed to "not really a quad core." Those two things come off kind of differently to me...
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Even HT requires some duplication of resources beyond what a non-MT core can handle. The point is, if you have no definition for how little or how much duplication is required then you can't say what is and isn't a core. HT blurs the line just like CMT does, even if it doesn't blur it as much. The ideas for CMT actually started as a progression from HT.

What I said was that HT makes it not JUST a quad core, as opposed to "not really a quad core." Those two things come off kind of differently to me...

I think you could make the argument that HT is the extreme case of dual-core CMT in which you have created a CMT design where nearly every resource is shared.
 

postmortemIA

Diamond Member
Jul 11, 2006
7,721
40
91
Have any of you written threaded software? It is almost impossible to get the same utilization on multiple cores if the tasks given to them are not exactly the same.

The usual multithreading in desktop apps is one thread for the UI plus worker threads that do the GPU/CPU-intensive work. Guess what happens when you have only one worker thread? The UI thread is idle and the worker loads a single core. Now, if that work can be divided into two independent parts, great, you can utilize one more core. But if it can't, your other cores are idle. And that's the catch. Usually you have to do sequential work, as the output of one step is the input to the next step.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Have any of you written threaded software? It is almost impossible to get the same utilization on multiple cores if the tasks given to them are not exactly the same.

The usual multithreading in desktop apps is one thread for the UI plus worker threads that do the GPU/CPU-intensive work. Guess what happens when you have only one worker thread? The UI thread is idle and the worker loads a single core. Now, if that work can be divided into two independent parts, great, you can utilize one more core. But if it can't, your other cores are idle. And that's the catch. Usually you have to do sequential work, as the output of one step is the input to the next step.

Amdahl's law, serial code is the Achilles heel. This is what turbo-core/boost is supposed to help with. If only they got more aggressive with the turbo clockspeeds so those serial steps can get done all the faster.
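To put rough numbers on the Amdahl's law point, here is a minimal sketch. The function name, the 0.5x per-core speed and the parallel fractions are illustrative assumptions, not measurements of any real CPU. It compares several slower cores against one fast core for a workload where only part of the work can run in parallel.

```python
def effective_speedup(parallel_fraction: float, cores: int,
                      per_core_speed: float = 1.0) -> float:
    """Speedup relative to ONE core of speed 1.0 (Amdahl's law).

    parallel_fraction: share of the work that scales across cores (0..1).
    per_core_speed:    speed of each core relative to the fast core
                       (hypothetical knob for the slow-vs-fast comparison).
    """
    serial_fraction = 1.0 - parallel_fraction
    amdahl = 1.0 / (serial_fraction + parallel_fraction / cores)
    return per_core_speed * amdahl


if __name__ == "__main__":
    # Eight cores at half the speed of the single fast core (assumed numbers).
    for p in (0.50, 0.90, 0.95):
        s = effective_speedup(p, cores=8, per_core_speed=0.5)
        verdict = "faster" if s > 1.0 else "slower"
        print(f"parallel fraction {p:.2f}: {s:.2f}x ({verdict} than one fast core)")
```

Under these assumed numbers, the eight slow cores lose to the single fast core when only half the work is parallel and win comfortably once 90% or more of it is, which is why the serial steps are where higher turbo clocks pay off.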
 

cytg111

Lifer
Mar 17, 2008
25,652
15,155
136
The CPU reports its topology to the OS (on x86 through CPUID), and the scheduler uses that to prefer putting threads on separate physical cores before it puts them on the same physical core. I don't think it's that elaborate.

Yes but then what?

Thread1 -> core1 main thread
Thread2 -> core2 main thread
Thread3 -> core3 main thread
Thread4 -> core4 main thread
Thread5 -> core1 hyper thread

Now it just so happens that Thread5 is the most important thread of them all, and now it's chugging away at 20% ..
 

mrle

Member
Mar 27, 2009
33
0
0
Yes but then what?

Thread1 -> core1 main thread
Thread2 -> core2 main thread
Thread3 -> core3 main thread
Thread4 -> core4 main thread
Thread5 -> core1 hyper thread

Now it just so happens that Thread5 is the most important thread of them all, and now it's chugging away at 20% ..

Then you would probably set Thread5 priority higher than the others to instruct the scheduler that it has to give it preferential treatment when assigning threads to cores.
 

cytg111

Lifer
Mar 17, 2008
25,652
15,155
136
Then you would probably set Thread5 priority higher than the others to instruct the scheduler that it has to give it preferential treatment when assigning threads to cores.

Yes but again that means that I as a programmer need to consider what kind of core my app/thread will be running on .. real thread, hyper thread, CMT ..
 

bronxzv

Senior member
Jun 13, 2011
460
0
71
Yes but then what?

Thread1 -> core1 main thread
Thread2 -> core2 main thread
Thread3 -> core3 main thread
Thread4 -> core4 main thread
Thread5 -> core1 hyper thread

Now it just so happens that Thread5 is the most important thread of them all, and now it's chugging away at 20% ..

I don't get your 20%, 20% of what ?
 

postmortemIA

Diamond Member
Jul 11, 2006
7,721
40
91
Yes but again that means that I as a programmer need to consider what kind of core my app/thread will be running on .. real thread, hyper thread, CMT ..

If thread 1 is not utilizing 100% CPU, I see no problem. Thread 5 will. Although there are priorities in Windows, they are more like suggestions. RTOSes are designed to solve scheduling problems.
 

bronxzv

Senior member
Jun 13, 2011
460
0
71
Then you would probably set Thread5 priority higher than the others to instruct the scheduler that it has to give it preferential treatment when assigning threads to cores.

With hyperthreading on current CPUs (which lack any notion of hardware thread priority), specifying thread priorities will not change anything if there are fewer ready threads than hardware contexts (as in the example at hand, with 5 threads and 8 hardware contexts): every ready thread is already running, so the scheduler has nothing to prioritize.

To ensure you have a full core for a thread you must use processor affinity masks.
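As a rough illustration of affinity masks, here is a minimal sketch using Python's `os.sched_setaffinity`, which is only available on Linux. The helper name `pin_caller` is hypothetical. Note that pinning one thread to a logical CPU is only half the job: to really get a full core, other work also has to be kept off the sibling logical CPU of that core.

```python
import os


def pin_caller(cpus):
    """Restrict the caller to the given logical CPUs.

    Returns the resulting CPU set, or None when the platform has no
    sched_setaffinity (non-Linux) or none of the requested CPUs is allowed.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. Windows or macOS: a different API is needed there
    allowed = os.sched_getaffinity(0)   # 0 means "the caller"
    wanted = set(cpus) & allowed        # never ask for CPUs we may not use
    if not wanted:
        return None
    os.sched_setaffinity(0, wanted)
    return os.sched_getaffinity(0)
```

For example, `pin_caller([0])` keeps the important thread on logical CPU 0. On Windows the comparable mechanism is the thread affinity mask set through the Win32 API, which this sketch does not cover.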
 

cytg111

Lifer
Mar 17, 2008
25,652
15,155
136
20% I pulled out of my .... but supposed to be the performance level you can expect from a hyperthread ..

I know we have some apps/games that actually hurt from HT .. and I am wondering if this is the cause?
 

Mark R

Diamond Member
Oct 9, 1999
8,513
16
81
Yes but then what?
Now it just so happens that Thread5 is the most important thread of them all, and now it's chugging away at 20% ..

Well, it would be more like Thread 1 - 60% of dedicated, Thread 5 - 60% of dedicated.

In reality, modern OS schedulers periodically shuffle threads between cores, so that all threads should end up with roughly balanced time on shared or dedicated cores.

Further, the latest OSs do recognise which logical cores relate to which physical core, which cache (some CPUs share caches between cores), which RAM (e.g. in multi-socket CPU systems) and can optimize the scheduling of threads, so that a thread which primarily uses RAM on CPU socket B, spends most of its time running on CPU socket B.

A similar technique can be used to bias threads towards unused physical cores (for performance) or towards partially loaded physical cores (to save power). Microsoft call the latter "core parking": if all the running threads can be placed onto 1 physical core without lagging, then all the other cores can be shut down completely, saving power until they are required again.
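To make the logical-to-physical mapping concrete, here is a minimal sketch, assuming Linux, where each logical CPU lists its siblings in `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`. The helper names are hypothetical, and the placement rule is a deliberate simplification of what a real scheduler does: fill one logical CPU per physical core first, then fall back to the siblings.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,4" or "0-1" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def physical_cores(sibling_lists):
    """Collapse per-CPU sibling lists into one sorted tuple per physical core."""
    return sorted({tuple(sorted(parse_cpu_list(s))) for s in sibling_lists})


def placement_order(cores):
    """Logical CPUs in fill order: one per physical core first, siblings last."""
    order = []
    depth = max((len(core) for core in cores), default=0)
    for i in range(depth):
        order.extend(core[i] for core in cores if i < len(core))
    return order


def read_sibling_lists(root="/sys/devices/system/cpu"):
    """Linux only: read every CPU's sibling list (empty list elsewhere)."""
    paths = Path(root).glob("cpu[0-9]*/topology/thread_siblings_list")
    return [p.read_text() for p in paths]


if __name__ == "__main__":
    # Hypothetical 4-core, 8-thread layout; real numbering varies by system.
    cores = physical_cores(["0,4", "1,5", "2,6", "3,7"])
    print(placement_order(cores))
```

With that hypothetical layout the first four threads each get their own physical core and the fifth lands on logical CPU 4, the sibling of CPU 0, which is the situation described earlier in the thread.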
 

R0H1T

Platinum Member
Jan 12, 2013
2,582
163
106
20% I pulled out of my .... but supposed to be the performance level you can expect from a hyperthread ..

I know we have some apps/games that actually hurt from HT .. and I am wondering if this is the cause?

HT works better when the real cores aren't stressed to 100%, so in most apps/games where the CPU is below that level, HT usually delivers a 10% or greater absolute gain, depending on the app or game in question! On a fully utilized core, however, it causes thrashing and is counterproductive. This is why the original P4 didn't yield gains comparable to modern Intel processors with HT: on those, the overall load can be spread more evenly across cores and threads, unlike the P4, which had a single core with HT.
 

bronxzv

Senior member
Jun 13, 2011
460
0
71
20% I pulled out of my .... but supposed to be the performance level you can expect from a hyperthread ..

Both threads get exactly the same treatment with hyperthreading, so it will be more like 50%. It can be less than 50% if the threads fight heavily for some resources (typically the L1 data cache and the L2 cache), but it is generally more than 50%, since otherwise-unused execution slots get filled and branch mispredictions and cache misses have less impact. In other words, the IPC of each thread is better than half the IPC of a single thread running alone.
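The arithmetic behind those percentages can be sketched as follows, assuming two identical threads that share one core and are treated equally. The function name and the gain values are illustrative assumptions, not measurements.

```python
def per_thread_share(ht_throughput_gain: float) -> float:
    """Speed of each of two sibling threads, relative to a dedicated core.

    ht_throughput_gain: combined throughput gain of the core when both
    hardware threads are busy (0.2 means 20% more total work; a negative
    value models threads that fight over caches and lose throughput).
    """
    return (1.0 + ht_throughput_gain) / 2.0


if __name__ == "__main__":
    for gain in (-0.1, 0.0, 0.2):
        print(f"gain {gain:+.0%}: each thread at {per_thread_share(gain):.0%} of a dedicated core")
```

A 20% combined gain gives each thread 60% of a dedicated core, which matches the "60% of dedicated" estimate given earlier in the thread, and any positive gain puts each thread above the 50% baseline.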