Hyper-Threading: What's the point?

JBAdams101

Junior Member
Dec 15, 2005
9
0
0
My processor is a 3.4 GHz Pentium IV Prescott (800 FSB) with HT available. When I purchased the computer, I rather arbitrarily bought a chipset with HT technology. I still don't even know what that is.

When my motherboard died last week and my hard drive crashed yesterday, I fiddled around in BIOS for a bit and came across a "switch" by which I can alternately turn HT on or off.

Does running a system with HT on generally provide for better performance? What does HT help? What does it hinder?

Also, I am interested in finding out exactly what my processor's speed is with HT on and off. I'd also like to learn about overclocking. Please send me links to articles on those subjects if you have any handy.

Thanks so much.

JA
 

stevty2889

Diamond Member
Dec 13, 2003
7,036
8
81
HT has been around for like 3 years now, but anyway, it's not as good as a dual core. For multitasking and SMP-aware applications it can give a boost, which usually maxes out at around 20%. It's a way to help keep the CPU fed and keep those long pipelines flowing. With HT off, if you get a stall and have to flush the pipelines, you start all over. If you have a stall with HT on, some things can keep going through. That's basically the easiest way I can think to describe it. Your CPU will be running at the same frequency whether HT is on or off. 3 GHz is 3 GHz. HT just gives a little performance boost in some cases and makes the system feel a bit smoother.

If you look at the top of the CPU forum, there is a good stickied article on overclocking.
 

dmens

Platinum Member
Mar 18, 2005
2,275
965
136
HT is Intel-speak for simultaneous multithreading (SMT), which allows multiple threads (two, in the case of HT) to execute at the same time by either partitioning or duplicating resources and functional units on the CPU. To the OS, an SMT core looks like two separate logical cores.

Nitpick: stalls don't flush pipelines, and longer pipelines do not necessarily benefit more from SMT.
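
The "two logical cores" point can be checked from software. Below is a minimal sketch that asks the OS how many logical processors it sees. On a single-core P4 this would be expected to report 2 with HT enabled and 1 with it disabled, though that depends on the OS and BIOS settings. The helper name `logical_cpu_count` is made up for this example.

```python
import os

def logical_cpu_count():
    """Return the number of logical processors the OS reports (at least 1)."""
    # os.cpu_count() can return None if the count cannot be determined.
    return os.cpu_count() or 1

if __name__ == "__main__":
    print("Logical processors seen by the OS:", logical_cpu_count())
```

Note that this counts logical processors only; it does not say how many of them are physical cores.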
 

BrownTown

Diamond Member
Dec 1, 2005
5,314
1
0
Actually I'd say a longer pipeline should get more benefit from SMT, since it issues instructions quicker but has the same (or often longer) instruction time. So you are likely to have more stall cycles, since a stall of equal length waiting for data to become available will cost you more clock cycles.
 

SexyK

Golden Member
Jul 30, 2001
1,343
4
76
Originally posted by: dmens
Nitpick: stalls don't flush pipelines, and longer pipelines do not necessarily benefit more from SMT.

Are you sure about the longer pipeline comment? Everything I've ever read has said that HT was implemented on the Netburst cores to help mitigate the deep pipelines, while it wasn't on Banias/Dothan because the shorter pipes wouldn't benefit as greatly, if at all. I'll try to find at least one link...
 

Gamingphreek

Lifer
Mar 31, 2003
11,679
0
81
Hopefully I can make this a bit easier for you to understand, OP. (All these guys are right though.)

Intel, as you know, has a long instruction pipeline: the series of stages an instruction must go through to execute. While in this pipeline, you can experience cache misses and a bunch of other problems that stop an instruction and resend it. Additionally, since it is SO long, one instruction is going through while the rest of the pipeline is doing absolutely nothing. HT solves this. It sends 2 threads through the pipeline at a staggered interval, so when Packet A reaches a certain point, Packet B is sent through. This way the pipeline is working as close to its theoretical maximum as possible.

-Kevin
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
Think of it like having a dual core processor with one core being a 3.4 GHz P4 and the other being a 500 MHz P4. This *second core* isn't fast enough to do CPU-intensive stuff but is great for giving you a more responsive system or allowing you to do a little background processing while you work. On dual-core-friendly games it can give a little boost (e.g. with the dual-core-friendly Quake 4 patch, HT might give you another 10% performance). However, for single-core-oriented games it can actually lose you a percent or two, because your CPU's cache (local memory on the processor) has to be shared between the main process (the game) and the HT process.
 

Gamingphreek

Lifer
Mar 31, 2003
11,679
0
81
Originally posted by: Dribble
Think of it like having a dual core processor with one core being a 3.4 GHz P4 and the other being a 500 MHz P4. This *second core* isn't fast enough to do CPU-intensive stuff but is great for giving you a more responsive system or allowing you to do a little background processing while you work. On dual-core-friendly games it can give a little boost (e.g. with the dual-core-friendly Quake 4 patch, HT might give you another 10% performance). However, for single-core-oriented games it can actually lose you a percent or two, because your CPU's cache (local memory on the processor) has to be shared between the main process (the game) and the HT process.

I'm not sure I agree with that analogy.
 

dguy6789

Diamond Member
Dec 9, 2002
8,558
3
76
Originally posted by: dmens
HT is Intel-speak for simultaneous multithreading (SMT), which allows multiple threads (two, in the case of HT) to execute at the same time by either partitioning or duplicating resources and functional units on the CPU. To the OS, an SMT core looks like two separate logical cores.

Nitpick: stalls don't flush pipelines, and longer pipelines do not necessarily benefit more from SMT.

Data goes through a pipeline. It goes through various stages to get to the end. If a branch misprediction occurs at any stage in the pipeline, the entire pipeline is emptied and the data has to start all over again. The longer the pipeline, the larger the performance hit when a branch misprediction occurs, because the data has a longer distance to travel to make it through the pipeline. HT helps alleviate this by attempting to keep the pipeline full at all times, sending data from two threads into the pipes alternately. When one thread mispredicts a branch, the other data set can keep going, which helps keep the data moving and the pipeline fully fed. Longer pipelines do necessarily benefit more than short ones. The only reason AMD does not use a variant of HT is that their CPUs have much shorter pipelines and, on top of that, very rarely make branch mispredictions compared to Intel processors. Thus, the benefit of HT on an AMD processor would not be worth the money and time to implement it.
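
The pipeline-length argument above can be put into rough numbers. The sketch below assumes the misprediction penalty is about equal to the pipeline depth in cycles, and it uses made-up branch and misprediction rates. None of these figures are measurements of any real CPU, and the function name is hypothetical.

```python
def mispredict_cost_per_instruction(branch_fraction, mispredict_rate, penalty_cycles):
    """Average extra cycles per instruction lost to branch mispredictions."""
    return branch_fraction * mispredict_rate * penalty_cycles

if __name__ == "__main__":
    # All values below are illustrative assumptions, not measured data.
    branch_fraction = 0.20   # assume 1 in 5 instructions is a branch
    mispredict_rate = 0.05   # assume 5% of branches are mispredicted
    for depth in (10, 20, 30):  # hypothetical pipeline depths, in stages
        cost = mispredict_cost_per_instruction(branch_fraction, mispredict_rate, depth)
        print(f"depth {depth}: about {cost:.2f} extra cycles per instruction")
```

In this simple model the cost scales linearly with the penalty, which is the core of the "longer pipelines hurt more" claim. How much of that cost SMT can actually hide is a separate question.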
 

dguy6789

Diamond Member
Dec 9, 2002
8,558
3
76
Originally posted by: Gamingphreek
Originally posted by: Dribble
Think of it like having a dual core processor with one core being a 3.4 GHz P4 and the other being a 500 MHz P4. This *second core* isn't fast enough to do CPU-intensive stuff but is great for giving you a more responsive system or allowing you to do a little background processing while you work. On dual-core-friendly games it can give a little boost (e.g. with the dual-core-friendly Quake 4 patch, HT might give you another 10% performance). However, for single-core-oriented games it can actually lose you a percent or two, because your CPU's cache (local memory on the processor) has to be shared between the main process (the game) and the HT process.

I'm not sure I agree with that analogy.

Performance-wise, his argument is spot on. A 3.4 GHz P4 with HT is like having an extra 500 MHz P4 helping out. But the argument is technically incorrect.
 

BrownTown

Diamond Member
Dec 1, 2005
5,314
1
0
Originally posted by: dguy6789
Data goes through a pipeline. It goes through various stages to get to the end. If a branch misprediction occurs at any stage in the pipeline, the entire pipeline is emptied and the data has to start all over again. The longer the pipeline, the larger the performance hit when a branch misprediction occurs, because the data has a longer distance to travel to make it through the pipeline. HT helps alleviate this by attempting to keep the pipeline full at all times, sending data from two threads into the pipes alternately. When one thread mispredicts a branch, the other data set can keep going, which helps keep the data moving and the pipeline fully fed. Longer pipelines do necessarily benefit more than short ones. The only reason AMD does not use a variant of HT is that their CPUs have much shorter pipelines and, on top of that, very rarely make branch mispredictions compared to Intel processors. Thus, the benefit of HT on an AMD processor would not be worth the money and time to implement it.

No, a pipeline stall doesn't flush everything out of the pipeline; even a branch misprediction will still leave everything that is ahead of it alone. Also, all you are considering here is branch misprediction, but both the P4 and A64 predict correctly ~99% of the time. There are a whole lot of other hazards that are more likely to occur and slow down your pipe. Data dependencies and cache misses are much more frequent occurrences to deal with. When the pipeline is stalled waiting for data from another instruction or from the cache, the pipeline isn't flushed; it stalls at that point and a bubble forms. However, when you have SMT (or fine/coarse-grained context switching), what you can do is issue instructions from the other thread. But it's important to note this doesn't require you to flush everything out from the previous thread; instructions from different threads can even be issued in the same clock cycle if one thread isn't using all the issue slots.

At least that's how it's supposed to work. For all I know Intel screwed it up :p
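
The bubble-filling idea can be illustrated with a toy model. This is only a sketch under heavy assumptions: one issue slot per cycle (so it is closer to fine-grained switching than to true SMT, which can issue from both threads in the same cycle), no real pipeline stages, and made-up stall lengths. The function `simulate` and the thread lists are hypothetical.

```python
def simulate(threads):
    """Issue at most one instruction per cycle from any ready thread.

    Each thread is a list of stall lengths: after issuing instruction i,
    that thread cannot issue again for threads[t][i] extra cycles
    (for example, while waiting on a cache miss).
    Returns (cycles_until_last_issue, bubble_cycles).
    """
    next_index = [0] * len(threads)   # next instruction to issue, per thread
    ready_at = [0] * len(threads)     # first cycle each thread may issue again
    cycle = 0
    bubbles = 0
    while any(next_index[t] < len(threads[t]) for t in range(len(threads))):
        for t in range(len(threads)):
            if next_index[t] < len(threads[t]) and ready_at[t] <= cycle:
                stall = threads[t][next_index[t]]
                next_index[t] += 1
                ready_at[t] = cycle + 1 + stall
                break
        else:
            bubbles += 1              # no thread was ready: the slot is wasted
        cycle += 1
    return cycle, bubbles

if __name__ == "__main__":
    # Illustrative workloads: 0 = no stall, 3 = a 3-cycle stall (assumed values).
    thread_a = [0, 3, 0, 0]
    thread_b = [0, 0, 3, 0]
    print("A alone:      ", simulate([thread_a]))
    print("B alone:      ", simulate([thread_b]))
    print("A and B mixed:", simulate([thread_a, thread_b]))
```

With these made-up inputs, running the two threads together takes fewer cycles than running them one after the other, because one thread's instructions fill most of the other's bubbles. Nothing is flushed in this model; a stalled thread simply waits while the other one issues.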
 

carlosd

Senior member
Aug 3, 2004
782
0
0
Originally posted by: dmens
and longer pipelines do not necessarily benefit more from SMT.

Overall, these results are consistent with the conventional wisdom that SMT architectures perform better in deeper pipelines. The larger number of pipeline stages allows interleaving multiple simultaneous threads at a finer granularity than those afforded by shallower pipelines.

http://www.eecs.harvard.edu/~dbrooks/lee2005-wced-pipe.pdf

Researchers from Harvard University's Division of Engineering and Applied Sciences have more credibility than anything else.
 

thecoolnessrune

Diamond Member
Jun 8, 2005
9,673
583
126
I see a huge performance boost:

Computer 1: 2.533 GHz, 533 FSB P4, no HT

Computer 2: 2.8 GHz, 800 FSB P4, HT

While I realise the 2.8 has 300 more MHz and 300 MHz more FSB, the HT really does help. My bro can encode and play Runescape (a small online RPG) at the same time, and it plays just fine. Encoding on the 2.533 GHz system takes the whole processor, and I can't play Runescape at all, let alone do anything else. So yes, as little as people on these forums like to admit it, before dual core, HT was a welcome boost to the processor world.