Hyperthreading questions


cockeyed

Senior member
Dec 8, 2000
777
0
0
Originally posted by: IntelUser2000
--snip---

And cockeyed, 2-4% is enough to disable hyperthreading? It seems that Prescott's performance and its reputation for heat have given the ENTIRE line of Pentium 4s a bad name (more than before), so people see it more negatively than they did before.
---snip----
Most of you disable HT because of 2-4% on select applications? That is just plain ridiculous.

My issue is not with the HT technology, it is with the extra heat generated when it is enabled in Prescott CPUs. I had a P4-2.6c that had HT enabled and it ran nice and cool. My 3.0E idles at 38c without HT enabled and at 48c with HT enabled. The thing is, I like a quiet machine, and the extra heat from HT causes my temp-controlled fans to run faster, making more noise. Considering that in my use I don't get any benefit from HT, I'm better off not using it.
 

LTC8K6

Lifer
Mar 10, 2004
28,520
1,576
126
My 3.0E shows little to no temp difference with HT on or off. 1 or 2 degrees C.

HT should not raise your temp that much. HT doesn't do anything at idle anyway. I would investigate your cooler and thermal compound installation or maybe even your BIOS.

When you have HT on, do you see 2 processors in device manager?
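
If you'd rather check from a script than from Device Manager, here is a minimal Python sketch. It assumes Python is installed on the machine; `os.cpu_count()` returns the number of logical CPUs the OS reports, so a single-core P4 would be expected to show 2 with HT on and 1 with HT off. The helper name `logical_cpus` is just something made up for illustration.

```python
import os


def logical_cpus():
    """Return the number of logical CPUs the OS reports (None if unknown)."""
    return os.cpu_count()


if __name__ == "__main__":
    # With HT enabled the OS counts each hardware thread as a logical CPU.
    print("Logical CPUs reported by the OS:", logical_cpus())
```

If the count does not change after toggling HT in the BIOS, the setting probably is not taking effect.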
 

n7

Elite Member
Jan 4, 2004
21,281
4
81
Originally posted by: n7
Hyperthreading is a whole lotta hype.

Very few people will ever notice the difference between an HT CPU and a non-HT CPU, since they won't be using applications that really tax the CPU.

The only time i would like to have HT is when i am encoding video, since it tends to suck up 100% of my CPU & working on other things at the same time could interfere with the encoding process.

/my $0.02

Since some of you are freaking out over what i said, i will repeat it: aside from seriously heavy users like us, very few people could tell the difference.

I guess i was more referring to my typical customers who come in to buy a P4 with HT because they want to play music & surf the internet at the same time, & they know they need HT to do that.
Sadly, those people are buying based on hype, not actual knowledge, since HT will never help them out with their minimal needs.

I am well aware that HT can make a significant difference, but only to the people who are seriously utilizing their CPU, something 90% of the population won't be.

Here on Anandtech, most people will get their money's worth outta HT, but we aren't exactly the typical users...

 

cockeyed

Senior member
Dec 8, 2000
777
0
0
Originally posted by: LTC8K6
My 3.0E shows little to no temp difference with HT on or off. 1 or 2 degrees C.

HT should not raise your temp that much. HT doesn't do anything at idle anyway. I would investigate your cooler and thermal compound installation or maybe even your BIOS.

When you have HT on, do you see 2 processors in device manager?

I thought it was odd that HT raised the temp that much at idle. Under load, the CPU is about the same temp with HT on or off. With HT on, I do see 2 CPUs in Device Manager, and 1 with HT off. I changed the stock HSF to a Zalman 7000B AlCu (nice HSF) and I got lower temps, but the idle temp still increases when HT is enabled.

I had run Prime95 in dual instance mode to see how hot the CPU would get under load and I was starting to think that Prime95 caused a problem in the HT idle circuits. I don't recall seeing high idle temps with HT on before I used Prime95. I tried cleaning out all traces of Prime95 from the files and registry but it didn't help. Any ideas would be welcome.

The highest temp reached in Prime95 was 52c in dual instance mode (100% cpu usage)
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
32,095
32,641
146
Whether hyperthreading is mostly hype is a consistent source of controversy here. In the end it seems to be more a boon than a burden to the NetBurst architecture, but it only really shines in situational multitasking or when software is optimized for it. I recently built a P4C-Intel D865GBFL-160 8mb SATA-DDR400 dual channel system for someone, and with the Extreme Graphics 2 IGP in use the system did not impress me at all. The responsiveness/snappiness was lacking compared to Barton+nF2 IGP, and the IGP performance was comparable to nF2 IGP in single channel: just horrible. Running 1 instance of P95 blended TT+3DM'01 resulted in momentary freezing every few seconds during the 3DM tests; turn off P95 and 3DM runs smooth but sloooow (16mb frame buffer, 64mb aperture, and burn-in performance mode). Being out of the Intel scene for so long, I was uncertain how to enable PAT, so that may have had an impact as well; others will have to comment on that.

 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
If that was an Intel mobo with an i865 chipset, then I don't think that there is a way to enable PAT, period, since Intel wanted it forcibly disabled on i865s. Some vendors found a workaround by fudging with the clocks and the "NBStrap" settings on the chipset; apparently Intel's lockout wasn't 100%. But I doubt that an Intel-made mobo would include either BIOS or jumper settings to fudge with the NBStrap settings.

As far as HT-caused CPU resource-contention goes, think of it this way. Take the example of a 1Ghz Pentium-III chip with 256KB L2. Now, let's assume that you can interleave two tasks on the same chip, on alternate operations, with each task having the entire chip's resources (including the L2) available to it exclusively while it runs. Of course, this isn't possible in the real world, but go with me, it's an illustration. That would give each task theoretical performance equivalent to running on its own CPU with the full 256KB of L2, but with each CPU running at half the clockspeed. Meaning, two 500Mhz P3s with 256KB L2, each running one task. Ignoring real-world SMP issues, those two would yield a hypothetical performance equivalent to one 1Ghz CPU running two tasks on alternate opcodes. This is the absolute best-case scenario, but impossible in the real world, which is why SMP systems don't "add the Mhz".

Now let's take the case of a P4 with HT enabled. Not only do both tasks have to share the same CPU Mhz, they also have to share certain limited resources that I mentioned above, which is similar to splitting and sharing the L2 cache in our hypothetical P3 example.

So in reality, it would be analogous to two tasks, each running on a 500Mhz P3 Celeron, with only 128KB L2 for each task to use. Some tasks perform largely the same, whether running on a real P3 with 256KB L2, vs. on a P3 Celly with 128KB L2. Other tasks, absolutely tank in performance, because they start to thrash the L2 cache.

So for those sorts of tasks, running on a P4 system with HT enabled, there can indeed be a significant loss in performance, over running on a non-HT-enabled P4.
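
To put numbers on the analogy, here is a rough Python sketch. It only does the arithmetic from the P3 example above (1Ghz, 256KB L2, two tasks); the function name `per_task_resources` and the even 50/50 split are assumptions made for illustration, since a real HT core shares its resources dynamically rather than cutting them cleanly in half.

```python
def per_task_resources(clock_mhz, l2_kb, tasks=2, shared_cache=True):
    """Split one CPU's clock (and optionally its L2) evenly among tasks.

    The even split is an assumption for illustration, not how HT
    actually arbitrates resources.
    """
    clock_each = clock_mhz / tasks
    l2_each = l2_kb / tasks if shared_cache else l2_kb
    return clock_each, l2_each


# Best case from the analogy: each task keeps the full L2.
ideal = per_task_resources(1000, 256, shared_cache=False)
# HT-like case from the analogy: the L2 is split as well.
ht_like = per_task_resources(1000, 256, shared_cache=True)

print("Ideal interleave, per task: %d Mhz, %d KB L2" % ideal)
print("HT-like sharing, per task: %d Mhz, %d KB L2" % ht_like)
```

The first case corresponds to the two 500Mhz P3s with full L2, the second to the two 500Mhz Celerons with 128KB each.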

For those that would claim that there is essentially never a performance hit for enabling HT on the P4, would you also make the claim that there is essentially no performance difference, replacing a 1Ghz P3 with a 1Ghz P3 Celeron with half the L2? Or would you be willing to admit that some programs will indeed suffer in performance with that switch?
 

Accord99

Platinum Member
Jul 2, 2001
2,259
172
106
Originally posted by: VirtualLarry
So in reality, it would be analogous to two tasks, each running on a 500Mhz P3 Celeron, with only 128KB L2 for each task to use. Some tasks perform largely the same, whether running on a real P3 with 256KB L2, vs. on a P3 Celly with 128KB L2. Other tasks, absolutely tank in performance, because they start to thrash the L2 cache.
Northwood has 512KB of L2 cache and performs OK with 256KB. The hit rate probably drops from the high 90s to the mid 90s (percent) between the two cache sizes. The throughput gained from better utilization of execution resources more than makes up for the increase in cache misses. And when running only one CPU-intensive application, the CPU no longer needs to share.

For those that would claim that there is essentially never a performance hit for enabling HT on the P4, would you also make the claim that there is essentially no performance difference, replacing a 1Ghz P3 with a 1Ghz P3 Celeron with half the L2? Or would you be willing to admit that some programs will indeed suffer in performance with that switch?
No, but there are very few that do. In fact the only real app that I know of is dnet's RC5-72 distributed application, and that has more to do with its reliance on certain key instructions than with cache thrashing. And you can avoid it by running only one instance. Nearly all commonly used CPU-intensive applications, including Lame, SuperPI, PiFast, Seti, F@H, ScienceMark, zipping applications, etc., see gains in the 15-30% range when two or more instances are running simultaneously.
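
As a rough illustration of what a 15-30% throughput gain means in wall-clock time, here is a small Python sketch. The 60-minute job length is a made-up figure and `batch_time` is a hypothetical helper; only the 15-30% range comes from the numbers above.

```python
def batch_time(job_minutes, jobs, ht_gain):
    """Wall time to finish `jobs` equal jobs on one CPU when running them
    together raises total throughput by `ht_gain` (0.15 means 15%)."""
    back_to_back = job_minutes * jobs
    return back_to_back / (1.0 + ht_gain)


JOB_MINUTES = 60  # hypothetical length of one encode or crunching job

for gain in (0.15, 0.30):
    minutes = batch_time(JOB_MINUTES, 2, gain)
    print("HT gain %.0f%%: two jobs finish in about %.1f minutes"
          % (gain * 100, minutes))
```

With no gain at all, the two jobs simply take as long as running them back to back.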
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
Originally posted by: Accord99
No, but there are very few that do. In fact the only real app that I know of is dnet's RC5-72 distributed application, and that has more to do with its reliance on certain key instructions than with cache thrashing. And you can avoid it by running only one instance. Nearly all commonly used CPU-intensive applications, including Lame, SuperPI, PiFast, Seti, F@H, ScienceMark, zipping applications, etc., see gains in the 15-30% range when two or more instances are running simultaneously.
As I suggested earlier, those sorts of applications are nearly the only ones that show any significant advantage with HT enabled. Those gains are highly atypical for common, ordinary apps.

My analogy earlier was simply to point out that with HT enabled, not only do both threads have to share the clock-speed, but they have to share the CPU-resources too, which results in each individual thread getting even less than half of the CPU's resources, compared to running as a single task with HT disabled. Some trivial office-type apps are not much affected; those are the ones that would probably run fine on a cut-down Celeron CPU too. But other, more CPU-intensive apps, like games, that are not in the same category as the distributed-computing-style apps mentioned above, will actually suffer in performance with HT enabled, as was documented in another post.

So, IMHO, HT is not always a "sure win". I usually disable it, since I do tend to run many of those sorts of CPU-intensive apps that it performs poorly with. I'm not denying that there are some scenarios in which it does offer a measurable performance increase, but they tend to be rather specialized, just as was the case with Intel's MMX. For most apps, MMX did absolutely squat, yet Intel hyped it to no end as if it would accelerate everything, which was quite far from the truth outside of a very limited set of specific MMX/SIMD-optimized programs. So to attempt to paint a broad performance-enhancing picture of HT, in the same way that Intel's marketing literature does, is hype.