
Does win2kpro Support Hyperthreading?

Because to "update" WIN2K to support HT, they would have to replace the umty-squat zillion DLLs, applications, etc that make up the OS. Yeah sure, there's a kernel, but all the various services, processes, etc. would have to be made HT-aware as well. Not worth it, especially when they are looking to sell more operating systems.

The programming staff has better things to do (like fix, patch, bug-fight ...)

.02

Scott
(Dual Xeon, XP Pro)
 
Because to "update" WIN2K to support HT, they would have to replace the umty-squat zillion DLLs, applications, etc that make up the OS.

Not really, all they should have to change is the kernel's process scheduler. Userland apps and things shouldn't care at all what kind of scheduler is there.
 
Originally posted by: Nothinman
Because to "update" WIN2K to support HT, they would have to replace the umty-squat zillion DLLs, applications, etc that make up the OS.

Not really, all they should have to change is the kernel's process scheduler. Userland apps and things shouldn't care at all what kind of scheduler is there.

Actually, there are additional api's added to support HT so applications can determine physical/virtual proc's. This was needed to deal with license situations (many apps license per cpu, and the general trend is to mean physical cpu...)

Bill
 
Actually, there are additional api's added to support HT so applications can determine physical/virtual proc's. This was needed to deal with license situations (many apps license per cpu, and the general trend is to mean physical cpu...)

But those are minor, apps can run fine without them. I don't know the APIs to determine the number of CPUs on a regular SMP machine, but I would assume that they would still return the number of physical CPUs and only those apps that actually look for HT Virtual CPUs would know they exist.
 
That certainly could be true; I haven't been into the guts of an OS for some time. My assumptions were based on the difference between TMPGenc that was not "HT Enabled" versus the version that is (waaaaaaay faster on the HT version).

I figured that if the utilities and services were not specifically HT enabled, then they are just being scheduled as on any other multiprocessor system (the utility or service isn't itself taking advantage of the HT, it just gets assigned to a "processor," real or virtual) to distribute the load.

Again, my (possibly flawed) understanding is that an app must be coded to properly multi-thread / HyperThread, and the WIN2K utility and service processes (assumption) probably are not. They rely on the multi-processor kernel to assign/dispatch/schedule them to the least-loaded CPU.

For my system (dual Xeon 2G), WINXP, with or without HT enabled in BIOS, got better performance than WIN2K (generally / overall) ... and HT-enabled TMPGenc running on the system with HT enabled was MUCH faster (I didn't try HT-Enabled TMPGenc on WIN2K). The only drawback was the SCSI issue with WINXP, and that has been improved since SP1 (I hear ... I haven't actually checked it).

So, that's why I said what I said. If that's incorrect, then I'll defer to the folks that are paying a little closer attention to the issue.

FWIW

Scott
 
That certainly could be true; I haven't been into the guts of an OS for some time. My assumptions were based on the difference between TMPGenc that was not "HT Enabled" versus the version that is (waaaaaaay faster on the HT version).

TMPGenc is a special case and I'm sure there are a number of cases where a HT-special version of a program will be necessary to take advantage of it, but in the general case the process should work exactly the same on UP, HT and real SMP systems and let the OS handle the scheduling.

I figured that if the utilities and services were not specifically HT enabled, then they are just being scheduled as on any other multiprocessor system (the utility or service isn't itself taking advantage of the HT, it just gets assigned to a "processor," real or virtual) to distribute the load.

That's right, but the problem comes in with HT because the second processor isn't a full processor, so anything scheduled on it must be 'special' in that it can only do certain things, otherwise it ends up fighting for the real CPU's resources. And if you schedule 2 general purpose processes on the same CPU they just slow each other down with resource contention.

I personally think HT is a band-aid. Intel made the pipelines on the P4 super long so that they could up the clock rate a ton and say they're faster because they're 3Ghz while AMD is only 2Ghz, when really the performance difference is negligible and in some cases the AMD chips are faster. So they came up with HT as a way to fill that super long pipeline and hopefully regain some of the performance they lost going for marketing over technology.

I personally would never pay for a P4 or Xeon, they're not worth it to me. But I can't wait to get my hands on a SMP Opteron board with an AGP slot =)
 
I personally think HT is a band-aid. Intel made the pipelines on the P4 super long so that they could up the clock rate a ton and say they're faster because they're 3Ghz while AMD is only 2Ghz, when really the performance difference is negligible and in some cases the AMD chips are faster. So they came up with HT as a way to fill that super long pipeline and hopefully regain some of the performance they lost going for marketing over technology.
I would not really tend to believe it to be a "band-aid" per se, but I do agree that HT really does a good job of making up for the caveats that the loooong P4 pipeline creates. I'm running several P4's and Athlons, and with the exception of artificial benchmarks the Athlon PR rating tends to do a decent job of matching the two up (i.e. my P4 running at 2.2Ghz performs about as well as my Athlon 2200+). Of course with any CPU comparison there are advantages/disadvantages to every piece of HW (Memory Bandwidth, Cache, Operations per Clock, etc.) but for daily usage they are fairly comparable. HT may very well give the 3.06Ghz P4 a noteworthy advantage over an Athlon 3000+ if the software is written correctly, that is what interests me about it.

Of course all of this is just my 2 cents and new information could easily change this in the not too distant future.
But I can't wait to get my hands on a SMP Opteron board with an AGP slot =)
I would love to build a dual Opteron system also, however I don't have a production need for such a box and can't justify the cost of building a desktop as such 🙁
I guess my dual 1800+ box will have to last me for a while yet in the SMP desktop arena


-Spy
 
I guess my dual 1800+ box will have to last me for a while yet in the SMP desktop arena

I currently have a dual 1.2Ghz box, but if I can get a dual Opteron board and CPUs for a half decent price I'm jumping on it.

HT may very well give the 3.06Ghz P4 a noteworthy advantage over an Athlon 3000+ if the software is written correctly, that is what interests me about it.

But that's the thing, can you name one piece of software besides TMPGenc that has been optimized for HT?
 
I currently have a dual 1.2Ghz box, but if I can get a dual Opteron board and CPUs for a half decent price I'm jumping on it.
The problem with a dual Opteron box for my desktop would be that I would end up spending over $800 on just the board and CPUs alone. And of course if I were to go all out on that I would want good storage to go along with it (enter HW RAID-0 SCSI), and of course I would need a nice dual-monitor setup (19+" matching monitors of course), so all of a sudden I find myself spending $2500-$3000 on my desktop, which just isn't reasonable to spend, sighhhh

But that's the thing, can you name one piece of software besides TMPGenc that has been optimized for HT?
Right now? The OS and that's about it. I'm sure as HT becomes more common vendors will add in HT optimizations. Hey it's kind of like *cough*64bit*cough* applications, just because there aren't many 64bit applications now doesn't mean that I wouldn't want an Opteron (vendors will build the applications eventually) 😀
Besides I foresee HT support long before a switch to 64bit for the vast majority of vendors.

-Spy
 
That's right, but the problem comes in with HT because the second processor isn't a full processor, so anything scheduled on it must be 'special' in that it can only do certain things, otherwise it ends up fighting for the real CPU's resources. And if you schedule 2 general purpose processes on the same CPU they just slow each other down with resource contention.

Anything scheduled on it doesn't have to be special. Any two random threads scheduled on the HT chip will generally be about 20% faster overall than if HT isn't enabled. The problem comes in on dual (or more) systems where you wind up over-scheduling a proc and its HT virtual proc instead of spreading the load correctly across the multiple physical procs. That's the underlying issue: in that case, while the performance is 20% better with HT enabled, it should be 100% better if properly scheduled.
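
To put rough numbers on the over-scheduling problem described above, here is a toy Python sketch. It is not how the Windows scheduler works; the CPU numbering (logical CPUs 0 and 1 on one physical CPU, 2 and 3 on the other) and the function names are assumptions made up for illustration, and the 100/120 figures are just the approximate percentages from the post.

```python
# Toy model of thread placement on a dual-CPU box with HT enabled.
# Assumptions (not from any real scheduler): logical CPUs 0 and 1 share
# physical CPU 0, logical CPUs 2 and 3 share physical CPU 1, and the
# 100 / 120 figures are the rough percentages quoted in the post.
PACKAGE_OF = {0: 0, 1: 0, 2: 1, 3: 1}
SOLO_PERCENT = 100      # one runnable thread alone on a physical CPU
HT_PAIR_PERCENT = 120   # two runnable threads sharing one physical CPU


def naive_pick(busy):
    """HT-unaware: first idle logical CPU, topology ignored."""
    for cpu in sorted(PACKAGE_OF):
        if cpu not in busy:
            return cpu
    return None


def ht_aware_pick(busy):
    """HT-aware: prefer a logical CPU on a fully idle physical CPU."""
    busy_packages = {PACKAGE_OF[cpu] for cpu in busy}
    idle = [cpu for cpu in sorted(PACKAGE_OF) if cpu not in busy]
    for cpu in idle:
        if PACKAGE_OF[cpu] not in busy_packages:
            return cpu
    return idle[0] if idle else None


def place(thread_count, pick):
    """Place threads one at a time and return the set of busy logical CPUs."""
    busy = set()
    for _ in range(thread_count):
        cpu = pick(busy)
        if cpu is not None:
            busy.add(cpu)
    return busy


def total_percent(busy):
    """Add up throughput per physical CPU under the toy model."""
    threads_per_package = {}
    for cpu in busy:
        package = PACKAGE_OF[cpu]
        threads_per_package[package] = threads_per_package.get(package, 0) + 1
    return sum(SOLO_PERCENT if n == 1 else HT_PAIR_PERCENT
               for n in threads_per_package.values())


naive = total_percent(place(2, naive_pick))     # both threads land on physical CPU 0
aware = total_percent(place(2, ht_aware_pick))  # one thread per physical CPU
print(naive, aware)
```

With two runnable threads, the naive picker stacks both on one physical CPU (about 120% of a single CPU in this model), while the topology-aware picker spreads them out (about 200%). Real systems depend on how the BIOS enumerates the logical CPUs, so treat the numbering here as a worst case.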

Also, as to the API question, since 2K doesn't know anything about HT (point of the thread) it returns the number of logical cpu's. The api's added (such as GetLogicalProcessorInformation) return (surprise surprise) info on which logical cpu's exist and what physical cpu's they are bound to (today HT suggests a one to one relationship, but that can change in the future).

Bill
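
For the licensing case mentioned in this post, the idea is just to group logical CPUs by the physical CPU they belong to and count the groups. Here is a minimal Python sketch of that counting step. It does not call GetLogicalProcessorInformation or any other OS API; the topology table and the function names are hypothetical, hand-written to look like a dual Xeon with HT on.

```python
from collections import defaultdict


def group_by_physical(topology):
    """Map each physical CPU id to the sorted logical CPU ids bound to it.

    `topology` is a list of (logical_id, physical_id) pairs. On a real
    system this would come from the OS; here it is supplied by hand.
    """
    groups = defaultdict(list)
    for logical_id, physical_id in topology:
        groups[physical_id].append(logical_id)
    return {physical: sorted(logicals) for physical, logicals in groups.items()}


def physical_cpu_count(topology):
    """Number of physical CPUs, which is what a per-CPU license would count."""
    return len(group_by_physical(topology))


# Hypothetical dual Xeon with HT enabled: four logical CPUs, two physical.
dual_xeon_ht = [(0, 0), (1, 0), (2, 1), (3, 1)]

print(len(dual_xeon_ht))                 # what an HT-unaware OS or app would count
print(physical_cpu_count(dual_xeon_ht))  # what a per-physical-CPU license cares about
print(group_by_physical(dual_xeon_ht))
```

An app that only asks "how many CPUs?" sees four here, while one that looks at the grouping sees two physical CPUs with two logical CPUs each.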
 
The problem with a dual Opteron box for my desktop would be that I would end up spending over $800 on just the board and CPUs alone. And of course if I were to go all out on that I would want good storage to go along with it (enter HW RAID-0 SCSI), and of course I would need a nice dual-monitor setup (19+" matching monitors of course), so all of a sudden I find myself spending $2500-$3000 on my desktop, which just isn't reasonable to spend, sighhhh

Well I already have 4 SCSI160 drives (no RAID-0 though) and all I need for dual monitors (I already own a 21" and a 20") is a bigger desk...

Hey it's kind of like *cough*64bit*cough* applications, just because there aren't many 64bit applications now doesn't mean that I wouldn't want an Opteron (vendors will build the applications eventually)

But I can just rebuild all of my apps and they'll be 64-bit, sure most of them won't take advantage of the added Virtual Memory space, but the added registers and cache would surely help. And since Linux has been on Alpha and other 64-bit systems for years the majority of my apps are already 64-bit clean =)

Besides I foresee HT support long before a switch to 64bit for the vast majority of vendors.

Just like all the highly multi-threaded apps we have now that SMP machines are so easy to get?
 
Anything scheduled on it doesn't have to be special.

I was under the impression that the CPU's resources were shared, so that if one process running on the CPU was using a register, the process running on the HT logical CPU couldn't use that register. So if two processes got scheduled and they both want the same register (say they both try to use SSE2 or something), the second one will either block waiting for it or just get rescheduled.

Also, as to the API question, since 2K doesn't know anything about HT (point of the thread) it returns the number of logical cpu's. The api's added (such as GetLogicalProcessorInformation) return (surprise surprise) info on which logical cpu's exist and what physical cpu's they are bound to (today HT suggests a one to one relationship, but that can change in the future).

That's true, Win2K sees the HT virtual CPUs as physical so it would return 2 instead of 1, sorry it was late =)
 
Just like all the highly multi-threaded apps we have now that SMP machines are so easy to get?
Well, considering it will really only make a big difference on SMP systems anyway, you're right in that it probably won't be that common, since it doesn't much matter for single threaded apps (although as HT becomes more common I wouldn't be surprised to see more multi-threaded apps out there).
In either case I would expect that Photoshop 8 and Premiere 7 will have support for HT CPUs 😀

-Spy
 
I was under the impression that the CPU's resources were shared, so that if one process running on the CPU was using a register, the process running on the HT logical CPU couldn't use that register. So if two processes got scheduled and they both want the same register (say they both try to use SSE2 or something), the second one will either block waiting for it or just get rescheduled.

They are shared and that does introduce delays. That is why you only see a ~20% improvement with two schedulable running threads vs a ~100% improvement you'd see with two physical cpu's.

As a rule, 'general' purpose apps work pretty well in this model (most of the time their threads aren't schedulable anyhow). Where this really breaks is applications like IIS and other apps with thread pools, where there are multiple scheduled running threads per cpu. Many server apps, for example, allocate threads based on the number of cpu's in the system. There tends to be a nice performance bell curve when you plot out threads vs performance. The app may default to 2-3 threads per cpu (as an example). When they are not HT aware they tend to overallocate, since they see (currently) twice as many cpu's. Now while the combined speed of the threads on one physical cpu (as an example) may still reach 120% of a single cpu, that means each of two threads is running at about 60%. In this example the app may have 4-6 threads on one physical cpu and may actually lower overall performance by having too many schedulable threads (this is why IIS has problems with HT on 2k).

Bill
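
The thread pool arithmetic in this post can be worked through in a few lines of Python. This is only a sketch of the numbers given above: the function names are made up, the 120% figure is the rough estimate from the post, and the model ignores context-switch overhead entirely, so real results would be worse for the oversized pool.

```python
HT_PAIR_PERCENT = 120  # rough combined speed of one physical CPU with HT busy


def threads_allocated(cpus_seen, threads_per_cpu):
    """Pool size for an app that sizes its pool from the CPU count it sees."""
    return cpus_seen * threads_per_cpu


def percent_per_thread(runnable_threads_on_one_physical_cpu):
    """Share of a single non-HT CPU that each runnable thread gets."""
    n = runnable_threads_on_one_physical_cpu
    if n < 1:
        raise ValueError("need at least one runnable thread")
    capacity = 100 if n == 1 else HT_PAIR_PERCENT
    return capacity / n


# One physical CPU with HT on, app default of 3 threads per CPU (assumed).
ht_unaware_pool = threads_allocated(cpus_seen=2, threads_per_cpu=3)  # sees 2 CPUs
ht_aware_pool = threads_allocated(cpus_seen=1, threads_per_cpu=3)    # counts 1 CPU

print(ht_unaware_pool, percent_per_thread(ht_unaware_pool))
print(ht_aware_pool, percent_per_thread(ht_aware_pool))
print(percent_per_thread(2))
```

The HT-unaware app ends up with six threads on one physical CPU at about 20% each, where the HT-aware sizing gives three threads at about 40% each. Two runnable threads get the roughly 60% each mentioned in the post.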
 
Just like all the highly multi-threaded apps we have now that SMP machines are so easy to get?

Actually, that is one of the (intentional, IMHO) side benefits of HT. With HT, all applications are now running in an SMP environment (at least from the testing/development point of view). So companies who say things like 'we haven't tested this on an SMP system' will no longer be able to really sell into the market, as everyone will have such a machine.

As an example, when I worked on the GoBack purchase, Roxio had not yet formally tested the product on SMP systems. HT made SMP testing a required item even for consumer products; they had to go back, verify their locking, and certify that they worked before I would sign off on the technical side.

Bill


 
Well, considering it will really only make a big difference on SMP systems anyway, you're right in that it probably won't be that common, since it doesn't much matter for single threaded apps (although as HT becomes more common I wouldn't be surprised to see more multi-threaded apps out there).
In either case I would expect that Photoshop 8 and Premiere 7 will have support for HT CPUs

Is HT enabled in regular P4s yet? I know the capability bit is there, but I thought it was disabled, or maybe I'm just thinking of SMP.

As an example, when I worked on the GoBack purchase, Roxio had not yet formally tested the product on SMP systems. HT made SMP testing a required item even for consumer products; they had to go back, verify their locking, and certify that they worked before I would sign off on the technical side.

I would bet that most companies will end up 'verifying their locking' by just having one big spinlock.
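
For anyone unfamiliar with the "one big lock" approach, here is a minimal Python sketch of the idea. It uses threading.Lock, which is a blocking mutex rather than a true spinlock, and the class and function names are invented for illustration. The point is only that every access to shared state goes through the same single lock, which is easy to get right but serializes all the threads.

```python
import threading


class BigLockStore:
    """Shared counters where every operation takes the same single lock."""

    def __init__(self):
        self._big_lock = threading.Lock()  # one lock guards all shared state
        self._counts = {}

    def bump(self, key):
        with self._big_lock:
            self._counts[key] = self._counts.get(key, 0) + 1

    def snapshot(self):
        with self._big_lock:
            return dict(self._counts)


def hammer(store, key, iterations):
    """Worker body: increment one counter many times."""
    for _ in range(iterations):
        store.bump(key)


def run_demo(workers=4, iterations=1000):
    """Run several workers against one store and return the final counts."""
    store = BigLockStore()
    threads = [
        threading.Thread(target=hammer, args=(store, "hits", iterations))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.snapshot()


print(run_demo())
```

The result is correct on UP, HT and SMP boxes alike, but only one thread can be inside the store at a time, so the extra logical or physical CPUs buy nothing for that part of the code.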
 
Is HT enabled in regular P4s yet? I know the capability bit is there, but I thought it was disabled, or maybe I'm just thinking of SMP.
Not most P4s. The only HT-enabled P4s you can buy right now are the 3.06Ghz ones, but Intel should be re-releasing some of the slower ones with HT as well (and also an 800Mhz FSB (4x200)) in the near future.

EDIT: I stand corrected: the 800Mhz FSB P4s with HT are already out. Newegg has a 2.4Ghz 800Mhz FSB HT P4 for less than $200.

-Spy
 