If you've described the scenario accurately, I assume you are using a hyperthreading P4 and have hyperthreading enabled.
The thing is that the client is poorly programmed for the way you are using it, in several respects. The Athlon64 (like desktop CPUs before the hyperthreading P4s) does not do any threading in the CPU itself. Thread switching is done entirely by the Windows scheduler, and Windows only releases one thread to the CPU at a time.
This need not be a problem. The scheduler can switch threads every 20 milliseconds or so, and run two (or more) threads smoothly, seemingly concurrently. With two threads, each will run half as fast of course, but that shouldn't be much of a problem. Actually, 100 concurrent threads work fine too, but by then you would notice each one stalling intermittently.
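To make the time-slicing arithmetic concrete, here is a minimal sketch of a round-robin scheduler on a single CPU. It is a toy model and not how Windows actually implements it: the function name round_robin, the fixed 20 ms quantum and the 1000 ms workloads are all assumptions chosen for illustration.

```python
from collections import deque

def round_robin(work_ms, quantum_ms=20):
    """Simulate equal-priority threads sharing one CPU.

    work_ms maps a thread name to the CPU time it needs (in ms).
    Returns the wall-clock time (in ms) at which each thread finishes.
    """
    queue = deque(work_ms.items())
    clock = 0
    finished = {}
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum_ms, remaining)   # run for one time slice at most
        clock += ran
        remaining -= ran
        if remaining:
            queue.append((name, remaining))  # back to the end of the line
        else:
            finished[name] = clock
    return finished

alone = round_robin({"A": 1000})
shared = round_robin({"A": 1000, "B": 1000})
print(alone)   # {'A': 1000}
print(shared)  # {'A': 1980, 'B': 2000}
```

Thread A needs the same 1000 ms of CPU time in both runs, but takes about twice as long on the wall clock once it has to share the CPU with B.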
Sharing the CPU evenly like this only happens when you have several threads of equal priority that all want to run at the same time. That is not the normal multitasking case. The Windows scheduler works by computing a new priority for every thread each time, then running the one with the highest priority. One of the things that results in a higher priority is if the thread is a "Windows Message Queue" thread and its window happens to be the active one. This works fine for the typical multitasking scenario:
An interactive application typically uses very little CPU-time during work. Mostly it just waits for the user to press down a key or move the mouse. This means the unused CPU capacity can be used to perform a large chunk of computing, in the background, on spare CPU time.
On a well-behaved software setup, where the background task has lower priority than the interactive one, you will not notice the background threads in any way. The foreground, interactive app will be as responsive as if it were the only application running on the PC. Yet at the same time, the background job will in reality use up close to 100% of the CPU capacity. This is the setup you have if you're running something like seti@home at a sensible setting, like "idle". There is nothing magical about it. The scheduler runs SETI while it's waiting for you to press the next key. When you do, it instantly suspends SETI and processes your key input instead. That doesn't take long, and it then goes straight back to running SETI.
Finally, when you're ready to let the interactive application do a chunk of lengthy work, like compiling, it will do so, and this time SETI will be mostly shut down for the duration.
The OS shell should have even higher priority, though, so you can still start or switch to another interactive application. If Windows is configured to give priority to the foreground app, as it should be, the active window determines which app has the highest priority. Now it is this app that will stay snappy and responsive, while the compile becomes the background task and uses spare CPU time. And SETI still won't run much at all until the compile has finished.
If you work this way, you really don't have much use for hyperthreading.
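The editor / compile / SETI scenario above can be sketched as a strict priority scheduler on one CPU. This is only a toy model under stated assumptions: the function run_priority_scheduler, the priority numbers, and the tick pattern (a keypress every 10th tick, a compile that wants the CPU from tick 40 to 69) are made up for illustration and are not taken from Windows.

```python
def run_priority_scheduler(ticks, threads):
    """Each tick, the single CPU runs the highest-priority ready thread.

    threads is a list of (name, priority, is_ready), where
    is_ready(tick) says whether the thread wants the CPU on that tick.
    Returns how many ticks each thread actually got.
    """
    usage = {name: 0 for name, _, _ in threads}
    for tick in range(ticks):
        ready = [(priority, name)
                 for name, priority, is_ready in threads
                 if is_ready(tick)]
        if ready:
            _, winner = max(ready)   # highest priority wins the tick
            usage[winner] += 1
    return usage

threads = [
    ("editor", 3, lambda t: t % 10 == 0),    # wakes briefly for a keypress
    ("compile", 2, lambda t: 40 <= t < 70),  # wants the CPU for a while
    ("seti", 1, lambda t: True),             # always has work, lowest priority
]
usage = run_priority_scheduler(100, threads)
print(usage)  # {'editor': 10, 'compile': 27, 'seti': 63}
```

The editor gets every tick it asks for, the compile gets everything the editor leaves over while it is running, and SETI soaks up the rest. No tick is wasted, which is why a second (logical) CPU adds little here.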
So what about hyperthreading? Hyperthreading basically means the CPU itself can handle two threads internally. To the OS scheduler it looks as if you've got dual CPUs, so it sends a second thread to the CPU. In the above example, SETI will then also get to run. Unless, of course, our background task (compiling) is using 2 threads, or we have 2 background apps. Then the OS scheduler still won't run the one with the lowest priority much.
So basically, hyperthreading works as if you had 2 CPUs, for multitasking purposes. There has to be something 'wrong' with your software setup if you notice any benefit to responsiveness from HT. In an ideal world that shouldn't happen. But apps are far from ideal, so in practice you will see a real benefit from hyperthreading under some circumstances. Same as with two CPUs.
Much of Windows' unresponsiveness, however, is due to Windows blocking threads for event synchronization and resource contention handling. Hyperthreading will have no effect whatsoever on that.
Hyperthreading is, not to forget, also a way to get more work out of a P4. A P4's execution units spend a lot of time waiting, with nothing to do until more data has arrived. This time can be used to run a different thread instead. So hyperthreading, in a way, increases the performance of the P4, provided you are running more than one heavy thread concurrently. Again, same as a dual CPU setup, though lower performing. I'm not exactly sure how much additional work can be squeezed out of a P4 this way, but I think it's something like 15%-25%. 3D rendering software, like recent versions of 3ds max, can make very good use of this.
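As a rough back-of-the-envelope illustration of what that means per thread, the sketch below divides total throughput between two heavy threads. The 15% and 25% figures are just my guess from above, not measurements, and the 2.00 for two real CPUs assumes ideal scaling; the helper name per_thread_speed is made up for this example.

```python
def per_thread_speed(total_throughput, threads=2):
    """Speed of each thread, relative to one thread alone on a non-HT CPU."""
    return total_throughput / threads

# Total throughput relative to a single non-HT CPU (assumed values).
scenarios = {
    "one CPU, time slicing": 1.00,
    "HT, +15% (guess)": 1.15,
    "HT, +25% (guess)": 1.25,
    "two real CPUs (ideal)": 2.00,
}
for label, total in scenarios.items():
    print(f"{label:22s} per-thread speed: {per_thread_speed(total):.3f}")
```

Under these assumptions each of two heavy threads on a hyperthreading P4 runs somewhat faster than it would with plain time slicing, but well below the speed it would get with a real CPU to itself.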
Well now, back to your case. Why doesn't it work like it *should* with the Windows scheduler? The situation is not all that uncommon, but the cause is not clear either, and calls for some speculation. For some reason that has to do with the program model and structure of the app, the running thread sticks to a higher priority even though it's only waiting for something, probably a server response, and thus doesn't allow the other client to run.
On your P4C, on the other hand, the Windows scheduler sends 2 threads to the CPU, so it can run the second client as well, even when it has lower priority.
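My speculation can be modelled with a small simulation. This is a sketch under assumptions, not a diagnosis of the actual client: I assume client1 holds the higher priority and stays "ready" the whole time while waiting (a busy wait), and I compare that with a client1 that properly blocks and only needs the CPU every 4th tick. The function run and all the numbers are hypothetical.

```python
def run(ticks, threads, logical_cpus):
    """Each tick, the ready threads with the highest priority get a CPU.

    threads is a list of (name, priority, is_ready); logical_cpus is how
    many threads the scheduler can hand to the hardware at once.
    Returns how many ticks each thread got.
    """
    usage = {name: 0 for name, _, _ in threads}
    for tick in range(ticks):
        ready = sorted(
            ((priority, name)
             for name, priority, is_ready in threads
             if is_ready(tick)),
            reverse=True,
        )
        for _, name in ready[:logical_cpus]:
            usage[name] += 1
    return usage

# client1 never gives up the CPU while waiting (assumed busy wait).
busy_wait = [("client1", 2, lambda t: True), ("client2", 1, lambda t: True)]
# client1 blocks properly and only needs the CPU every 4th tick.
blocking = [("client1", 2, lambda t: t % 4 == 0), ("client2", 1, lambda t: True)]

print(run(100, busy_wait, 1))  # {'client1': 100, 'client2': 0}
print(run(100, busy_wait, 2))  # {'client1': 100, 'client2': 100}
print(run(100, blocking, 1))   # {'client1': 25, 'client2': 75}
```

With one CPU, the busy-waiting client1 starves client2 completely. With two logical CPUs client2 gets to run anyway, and a client1 that blocked properly would not have starved it in the first place. Keep in mind that a tick on a logical HT CPU does less work than a tick on a real one, so the second case overstates the throughput.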