>Actually everyone uses stuff that HT will be helpful, like simultaneously burning
> a CDR on an IDE-based CDRW drive and playing a Dvix at the same time.
>Its at sub-maximum performance where this will matter. You don't have to
>run 100% to fubar the stream and create a coaster.
Is there some kind of test that shows HT will reduce the likelihood of a coaster? I can't see how it would. The CPU is loafing while you burn CDs; it has virtually nothing to do but wait. If another task contends for memory or the system bus, as another thread would, it is more likely those resources will be unavailable at the moment the burner program needs them, increasing the chance of a coaster. There is nothing hyperthreading can do to help that. The only thing HT provides is possible use of unused CPU resources, which is a different thing.
As for burning CDs while you view DivX, do people actually do this? It only takes 3 minutes to burn a CD. You'd barely get focused on the video before you'd be interrupted to take the written CD out. If you are going to burn several CDs, it would be even worse. If those DivXs are porn, you would get frustrated very fast.
When you run two programs simultaneously, at least one should not require your attention, I would think. If you run two heavy-duty programs simultaneously, you basically accept that it doesn't matter how long they take.
I find it difficult to believe that anyone genuinely concerned with how long it takes to encode to DivX would want to run another program at the same time and make the encoding take longer.
Interestingly, DivX encoding is one thing that benefits quite a bit from HT, according to Anandtech (a single program with multiple threads). One wonders why. Since the resources are totally controlled by the programmers, they could write their program to get results at least as good. All hyperthreading does is run instructions through the CPU, and the programmers could certainly do that, and do it better. Programmers should study the HT results and figure out where they flubbed.
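The "single program, multiple threads" pattern Anandtech is measuring looks roughly like the sketch below. Everything here is an invented stand-in (the frame list, the squaring "work"); no real codec code is involved, it just shows the shape of splitting one job across two threads:

```python
import threading

frames = list(range(8))            # stand-ins for video frames
encoded = [None] * len(frames)

def encode_range(lo, hi):
    # Each worker crunches its own slice independently;
    # the squaring is a placeholder for real encoding work.
    for i in range(lo, hi):
        encoded[i] = frames[i] * frames[i]

mid = len(frames) // 2
t1 = threading.Thread(target=encode_range, args=(0, mid))
t2 = threading.Thread(target=encode_range, args=(mid, len(frames)))
t1.start(); t2.start()
t1.join(); t2.join()
print(encoded)   # [0, 1, 4, 9, 16, 25, 36, 49]
```

With HT, the two threads share one physical CPU's execution units; whether that helps depends entirely on how many of those units each thread leaves idle.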
Games do not benefit from HT. Is that a surprise? No. Games are definitely optimized to use CPU resources as effectively as possible. If you find a game that benefits from HT, you know the programmers have let you down.
If a programmer cannot beat the results of HT technology, he is not worth his salary. What HT can do that a programmer can't is use resources he has no control over, such as when two separate programs run simultaneously.
Before people get too worked up about how AMD is going to respond, realize that in a sense AMD has already responded. The unused resources that HT makes use of are the exact same under-utilized resources that make the P4's average instructions per cycle so low compared to the Athlon's. Athlons make use of them better than P4s without even using HT; HT just makes P4s not quite as bad. By the same token, if AMD were to add HT, there would be fewer unused resources for it to exploit.
Anandtech says that resource utilization is low in the average application because of "the nature of most applications." What is that nature? Most parts of an application are not optimized because they are judged "fast enough" by the manufacturer. For the average application that is ALL the parts, simply because processors are so fast. In some classes of applications there are a few slow parts, and programmers concentrate their efforts on those few parts. Then there are programs, such as 3D games, that spend nearly all their time doing things that must be as fast as possible, and programmers sweat blood optimizing them. It is exactly those applications that don't need to run any faster (in the judgement of the manufacturers) that are the ones that benefit from HT.
Then we have some applications that perhaps do not conform to the theory. For media creation, either the manufacturers don't see speed as being as significant as some users do, or the programmers have oversights to correct.
According to Anandtech, HT hardware is present and identical in chips slower than 3GHz that have been available for quite a while. Some say the HT hardware has been there for a couple of years. Yet Intel did not enable HT because it was not ready until now. What was it that wasn't ready? The hardware was ready; it must be the software. In other words, HT is not exactly a hardware solution, as it has been portrayed, although it does require hardware. There must be a piece of software doing a very difficult part of making HT work acceptably, a program which has taken years to get right. You can see this is a difficult problem, because that program runs simultaneously with the programs that are hyper-threading, and therefore takes up resources itself.
In case people skipped over that part of the Anandtech article and didn't notice it, Windows since 95 has been multi-threaded. When you write a W95-style program, you can run multiple threads. Thus you can do disk access interleaved with spell-checking, for instance, if your program will benefit from that. What Intel's HT adds is hardware-assisted threading, which should be more efficient, only it is not totally hardware. Windows XP evidently can select HT by using its built-in multi-processor support. It is interesting that XP Home DOES have multi-processor support; it just won't use it unless it detects Intel's HT. Mighty neighborly of MS to allow this exception.
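That interleaving of disk access and spell-checking can be sketched in a few lines. This uses Python's threading module purely as an illustration of the idea; the "disk read" and the "spell check" are both simulated stand-ins:

```python
import threading, time

results = {}

def disk_read():
    # Stand-in for a blocking disk read: the thread sleeps,
    # and the CPU is free for other work during the wait.
    time.sleep(0.1)
    results["data"] = "file contents"

t = threading.Thread(target=disk_read)
t.start()

# Meanwhile the main thread keeps doing CPU work
# (a stand-in for spell-checking the document).
checked = sum(len(w) for w in ["helo", "wrold"] * 1000)

t.join()
print(results["data"], checked)
```

The point is that the wait for the disk costs no useful CPU time; HT aims to do the same kind of overlap at the level of execution units inside one core.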
People are thinking AMD's 64-bit capability will compensate for HT, in the sense that programs could run faster. As one excellent FAQ points out, the advantages of 64 bits are very limited, simply because programs so seldom do operations on 64-bit data. Operations on 8 bits or fewer suffice for 95% of program code, I'd guess.
An important aspect of x86 instruction op-codes is that they can compact the data and address parts down to 16 or 8 bits when that is sufficient. I imagine AMD's extensions will do likewise for 64 bits, so the FAQ is wrong that 64-bit instructions will slow things down by requiring more memory accesses for the same operations. Likewise there is no need to store 64 bits of data when you only need 8 bits, just because the processor is a 64-bit processor. 64 bits will offer direct addressing, not segmented, beyond the 4G which 32 bits provides. The need for that is not far off. (I can remember when we wondered what we would do with 64K of memory, and that much memory cost $500. 2G doesn't cost a lot more now.) And yes, over 4G will make some things run faster. Imagine a whole DVD in memory, and no disk access to slow down encoding. That's in a couple of years.
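To make the compaction concrete: x86 already encodes the same add two ways, and assemblers pick the short form when the immediate fits in 8 bits. The byte values below are the standard IA-32 encodings of `add eax, 5` as I recall them; check an opcode reference before relying on the exact bytes:

```python
# ADD r/m32, imm8 -- 8-bit immediate, sign-extended to 32 bits:
short_form = bytes([0x83, 0xC0, 0x05])
# ADD EAX, imm32 -- same operation with a full 32-bit immediate:
long_form  = bytes([0x05, 0x05, 0x00, 0x00, 0x00])

print(len(short_form), len(long_form))  # 3 5
```

The short form saves 2 bytes per instruction here; the same trick applied to 64-bit immediates and displacements is why "64-bit code" need not mean "twice the memory traffic."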
AMD's extensions provide more CPU registers. This will make a more convenient programming model. Currently the number of registers is so few that it is futile to attempt to keep things around in registers. It is not really true that register operations are faster than memory operations; if the data is in the cache, it is just as fast. Of course, with registers you do not have to worry whether the data is still in the cache or has been bumped out. But the x86 instruction set has lots of ops that do not treat memory the same as registers, and AMD's set is an extension of it. Beyond that, it seems to me, the CPU can do optimizations with registers that cannot be done with memory, if only because other things (video card, HD controller) can access memory besides the CPU. Current CPUs actually do have a lot of registers (for optimization use), while only a few are visible to the programmer. The new way of using registers should make things faster.
So what is the use of operations on 64 bits? Most of the operations will be done on addressing. Just as now, CPUs generally do more operations on addressing than they do operations that the instruction designates, and these operations will be 64 bits. In some ways, you get the 64 bit ALU operations as a tacked-on bonus just because you need to do 64 bit addressing. But SIMD and SSE2 provide almost all of the advantage of the P4 over the Athlon, when there is an advantage. What these instructions do is operate on multiple sets of data simultaneously. Instead of doing operations on two sets of 16 bit data one after another, it might put the sets side by side in a register to make 32 bit wide data sets, and do both operations simultaneously. Being able to do this with 64 bit registers instead of 32 should bring an advantage to Hammer, if in fact they do this. Being able to load 64 bits with one op instead of two is an advantage.
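A toy sketch of that packed idea, in Python rather than SSE2: two 16-bit lanes share one 32-bit word, one add services both lanes, and the masking keeps carries from leaking between them. (Real SIMD hardware does this lane isolation for free; the function name `paddw2` is made up, loosely after the PADDW instruction.)

```python
MASK16 = 0xFFFF

def pack2(hi, lo):
    """Place two 16-bit values side by side in one 32-bit word."""
    return ((hi & MASK16) << 16) | (lo & MASK16)

def paddw2(a, b):
    """Packed 16-bit add: each lane adds independently and
    wraps around, without carrying into its neighbor."""
    lo = ((a & MASK16) + (b & MASK16)) & MASK16
    hi = (((a >> 16) & MASK16) + ((b >> 16) & MASK16)) & MASK16
    return (hi << 16) | lo

r = paddw2(pack2(1000, 2000), pack2(3000, 4000))
print((r >> 16) & MASK16, r & MASK16)  # 4000 6000
```

With 64-bit registers the same trick fits four 16-bit lanes instead of two, which is the kind of gain Hammer could collect if AMD widens its packed operations.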
What being close to Intel's performance brings AMD is prestige. That prestige translates into a higher price for AMD's CPUs. In terms of sales, I guess we all know that a tiny part of the buying public buys top-of-the-line CPUs. Even the public knows that dropping down a few notches drops the price stupendously without dropping performance much. But without that prestige model, AMD looks second class, and they can't get the prices they need to sustain a business. If AMD "loses contact" with Intel, like a runner in a race, they are going to have to start over, or switch to a different business strategy. You can't fake prestige, not for long. Although plenty of companies have made large, sustained profits by being a classy second, I don't see AMD changing strategy. AMD will get Hammer out pretty soon. It should be competitive with P4s. It should "maintain contact". They are going to call Clawhammers Athlons, I believe. For quite a while after that, most of the sales are going to be what we now think of as Athlons. Does Intel have something ready to go that will "pull away" from Hammer? I wonder. They've done it before. Still, as Intel says, they have optimized x86 CPUs about as far as they can go.