Why are more slower cores better than one fast core?


billyevans

Member
Apr 7, 2013
48
0
0
As I understand it, single cores hitting the same clocks as multi-cores would be better in multiple aspects, but would be difficult (impossible?) to currently produce.
 

R0H1T

Platinum Member
Jan 12, 2013
2,582
163
106
As I understand it, single cores hitting the same clocks as multi-cores would be better in multiple aspects, but would be difficult (impossible?) to currently produce.
It's not so much that they'd be difficult or impossible to produce, but they'd be highly inefficient at the multitude of tasks that modern multi-core processors handle with ease!
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Yes but then what?

Thread1 -> core1 main thread
Thread2 -> core2 main thread
Thread3 -> core3 main thread
Thread4 -> core4 main thread
Thread5 -> core1 hyper thread

Now it just so happens that Thread5 is my most important thread of them all, and now it's chugging away at 20%...

When I said "before" I didn't mean chronologically but I meant based on how much CPU load the thread needs. OS schedulers are good at tracking that. There's a little bit of time where things aren't in the right place while a thread ramps up because it can't predict the future, but its past record is a pretty good approximation on scales where users start to notice things.
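To make the "past record" idea concrete, here is a minimal sketch in Python. It is not how any real OS scheduler is written; the `ThreadStats` class, the `place` function, and the smoothing factor `alpha` are all made up for illustration. It only shows the principle: keep a running average of each thread's CPU demand, and let the heaviest threads have a core to themselves before anything doubles up.

```python
from dataclasses import dataclass


@dataclass
class ThreadStats:
    name: str
    load: float = 0.0  # smoothed CPU demand, 0.0 (idle) to 1.0 (always busy)

    def observe(self, busy_fraction: float, alpha: float = 0.5) -> None:
        # Exponential moving average: recent behaviour counts most,
        # which is why a thread that just ramped up takes a few
        # samples before it looks "heavy".
        self.load = alpha * busy_fraction + (1 - alpha) * self.load


def place(threads, physical_cores):
    """Map thread name -> core index, heaviest threads first (hypothetical policy)."""
    ranked = sorted(threads, key=lambda t: t.load, reverse=True)
    placement = {}
    for i, t in enumerate(ranked):
        if i < physical_cores:
            core = i  # each of the heaviest threads gets its own core
        else:
            # Leftover (lighter) threads double up, starting with the core
            # that holds the lightest thread of the first batch.
            core = physical_cores - 1 - (i - physical_cores) % physical_cores
        placement[t.name] = core
    return placement
```

With five threads on four cores, a Thread5 that starts at zero load ends up sharing a core at first. After a few `observe(1.0)` samples its average climbs, and the next call to `place` gives it a core of its own and pushes the lightest thread into the shared spot instead.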
 

aigomorla

CPU, Cases&Cooling Mod PC Gaming Mod Elite Member
Super Moderator
Sep 28, 2005
21,044
3,524
126
Well, HT is really funny if you look at the mechanics.

It's not as fast as a real core. I'd say it's close to 40% of the net core speed.
However, when you have HT loaded and the main core doing work, I'd say your main core probably drops to about 80%, giving you a net processing speed of 120% (give or take; these are rough numbers to give you the concept) compared with a single non-HT core running a multi-threaded app.

However, HT can also hurt you on the flip side: if you're doing fetch work on HT and you need the main core at its full speed, you're not going to get it while it's hyper-threading.
 

cytg111

Lifer
Mar 17, 2008
25,663
15,162
136
When I said "before" I didn't mean chronologically but I meant based on how much CPU load the thread needs. OS schedulers are good at tracking that. There's a little bit of time where things aren't in the right place while a thread ramps up because it can't predict the future, but its past record is a pretty good approximation on scales where users start to notice things.

Sounds good! But then, why do we have apps that suffer from HT? Genuinely asking because I don't know.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Sounds good! But then, why do we have apps that suffer from HT? Genuinely asking because I don't know.

AFAIK it's not nearly as big a problem as it was back on Pentium 4, but the reason it can happen is that there are shared data structures that benefit from temporal locality of reference: mainly the data and instruction caches, but also TLBs and branch prediction buffers (BTB, GHBs, etc.).

Consider the following scenario: I'm running two threads on a conventional CPU without HT. Call them A and B. Both threads have high-activity working sets that fit the L1 dcache size exactly, meaning that while they run they access the entire L1 dcache over and over again. This means that every time the OS switches from A to B and vice versa, the whole cache has to transition to a different data set. If this switching happens once every microsecond instead of once every millisecond, it's much more expensive. HT acts like it's switching between the two threads constantly. So if there's not enough benefit from HT itself (fewer dependencies in the code stream, a more diverse mix of instructions), the increased competition for the shared buffers can cause a performance degradation. In the worst case you have one thread constantly pushing out the other thread's data, and the cache miss rate skyrockets. This is called thrashing.
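The scenario above can be tried out with a toy model. The Python below is only a sketch under simplifying assumptions that are mine, not measurements of any real CPU: a fully associative cache with LRU replacement, two threads A and B that each walk a working set exactly the size of the cache, and a made-up `slice_len` parameter for how many accesses one thread makes before the other takes over. `slice_len=1` stands in for HT-style constant interleaving.

```python
from collections import OrderedDict


def miss_rate(cache_lines, slice_len, total_accesses):
    """Fraction of accesses that miss in a toy LRU cache shared by threads A and B."""
    cache = OrderedDict()          # keys are cache lines, oldest (LRU) first
    pos = {"A": 0, "B": 0}         # where each thread is in its working set
    misses = 0
    for n in range(total_accesses):
        # Threads take turns, slice_len accesses at a time.
        thread = "A" if (n // slice_len) % 2 == 0 else "B"
        line = (thread, pos[thread])
        pos[thread] = (pos[thread] + 1) % cache_lines  # walk the set cyclically
        if line in cache:
            cache.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses / total_accesses


coarse = miss_rate(cache_lines=64, slice_len=6400, total_accesses=128000)
fine = miss_rate(cache_lines=64, slice_len=1, total_accesses=128000)
print(f"long time slices: {coarse:.2%} misses, constant interleaving: {fine:.2%} misses")
```

In this model, long slices only pay for refilling the cache once per switch, while constant interleaving makes every access a miss because the two working sets together are twice the cache size. Real caches are set-associative and real threads rarely fit this badly, so treat it as the worst case, not a prediction.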

The situation was worse on Pentium 4 because it had a small L1 dcache, optimized for low latency. This was even more true before Prescott, where it was only 8KB, which may have been why HT wasn't enabled on the earliest chips. This led CPU architect Andy Glew to want a separate L1 dcache for each thread, but what he found was that if you split them up you also had to split the ALUs and part of the scheduling, since they were in the critical path. This line of reasoning was what led to AMD's CMT strategy, although it's probably more extreme than what Andy Glew envisioned.
 

parvadomus

Senior member
Dec 11, 2012
685
14
81
Well, HT is really funny if you look at the mechanics.

It's not as fast as a real core. I'd say it's close to 40% of the net core speed.
However, when you have HT loaded and the main core doing work, I'd say your main core probably drops to about 80%, giving you a net processing speed of 120% (give or take; these are rough numbers to give you the concept) compared with a single non-HT core running a multi-threaded app.

However, HT can also hurt you on the flip side: if you're doing fetch work on HT and you need the main core at its full speed, you're not going to get it while it's hyper-threading.

What is "close to 40% of core speed" supposed to mean?
When HT is on, there is only one core tied to two threads. There is not a physical core and a logical core. It's just one physical core that services both threads without needing the OS to context switch. It can do this because it keeps both threads' contexts in a duplicate set of registers.

HT helps avoid bubbles in the processor pipeline: if one of the two threads is not using a given resource (for example, one ALU) and the other thread needs it, then the other thread will use it. It's as simple as that.
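Here is a toy model of that bubble filling, written in Python. It is a sketch, not a description of any real pipeline: there is a single ALU, each thread is a list of `"alu"` and `"stall"` steps (a stall stands for a cycle spent waiting on something like memory), and the round-robin priority in the hypothetical `run_smt` function is just one simple way to pick which thread gets the ALU.

```python
def run_alone(stream):
    """Cycles for one thread by itself: every step, ALU or stall, costs one cycle."""
    return len(stream)


def run_smt(streams):
    """Cycles for several threads sharing one ALU, SMT style (toy model)."""
    pcs = [0] * len(streams)   # next step for each thread
    n = len(streams)
    first = 0                  # thread with first pick of the ALU this cycle
    cycles = 0
    while any(pc < len(s) for pc, s in zip(pcs, streams)):
        cycles += 1
        alu_free = True
        for k in range(n):
            t = (first + k) % n
            if pcs[t] >= len(streams[t]):
                continue               # this thread has finished
            if streams[t][pcs[t]] == "stall":
                pcs[t] += 1            # waiting uses up a cycle but not the ALU
            elif alu_free:
                pcs[t] += 1            # issue to the single ALU
                alu_free = False
            # otherwise the ALU is taken; this thread retries next cycle
        first = (first + 1) % n
    return cycles


stally = ["alu", "stall"] * 4          # half of this thread's cycles are bubbles
dense = ["alu"] * 8                    # no bubbles at all

print("two stally threads, back to back:", 2 * run_alone(stally))
print("two stally threads, SMT:         ", run_smt([stally, stally]))
print("two dense threads, back to back: ", 2 * run_alone(dense))
print("two dense threads, SMT:          ", run_smt([dense, dense]))
```

When both threads leave bubbles, the second thread slots into them and the pair finishes in little more than the time of one. When both threads already keep the ALU busy every cycle, there is nothing to fill and sharing the core gains nothing.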
 

Lorne

Senior member
Feb 5, 2001
873
1
76
HT can be both good and bad. Again, it depends on how things are programmed. If you look at your scores in games, you see little improvement. Look at something like Photoshop and there is a big improvement, but with something like LinX or another hard number cruncher, HT can hurt scores by skewing the numbers or oversaturating bandwidth to the point of starving the cores.

Tests can be run with all the programs mentioned above by using affinity control and clocking the CPU to get results.
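For anyone who wants to script the affinity part, here is a minimal Python sketch that pins the current process to one logical CPU and times a workload. It assumes Linux, since `os.sched_setaffinity` is not available on other platforms, and the `burn` workload and `time_on_cpu` helper are placeholders of my own, not a real benchmark.

```python
import os
import time


def burn(n=200_000):
    """Placeholder CPU-bound workload; swap in whatever you want to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_on_cpu(cpu, work=burn):
    """Pin this process to one logical CPU, time the workload, restore affinity."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is only available on Linux")
    previous = os.sched_getaffinity(0)   # remember the CPUs we were allowed on
    os.sched_setaffinity(0, {cpu})       # restrict this process to a single CPU
    try:
        start = time.perf_counter()
        work()
        return time.perf_counter() - start
    finally:
        os.sched_setaffinity(0, previous)
```

To test HT itself, run two copies at the same time: once pinned to two logical CPUs that share a physical core, and once pinned to two that don't, then compare the times. Which CPU numbers are siblings differs from machine to machine, so check your own topology before trusting the numbers.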

BTW, whoever mentioned 2x12GHz CPUs on a server board: you're missing the point of the thread.

How about multi-HT for a single high-frequency core? Have the OS create threads as necessary and keep itself separate from the other threading, so that it doesn't choke if latency on a single thread skyrockets.