If that was an Intel mobo with an i865 chipset, then I don't think there is a way to enable PAT, period, since Intel wanted it forcibly disabled on i865s. Some vendors found a workaround by fudging the clocks and the "NBStrap" settings on the chipset; apparently Intel's lockout wasn't 100%. But I doubt that an Intel-made mobo would include either BIOS or jumper settings to fudge the NBStrap settings.
As far as HT-caused CPU resource contention goes, think of it this way. Take the example of a 1GHz Pentium III chip with 256KB L2. Now, let's assume that you can interleave two tasks on the same chip, on alternate operations, and that whichever task is running at that moment has the entire chip's resources (including the L2) to itself. Of course, this isn't possible in the real world, but go with me, it's an illustration. The theoretical performance for each task would then be equivalent to that task running on its own CPU with the full 256KB of L2, but at half the clock speed. Meaning, two 500MHz P3s with 256KB L2, each running one task. Ignoring real-world SMP issues, those two CPUs would be hypothetically equivalent to one 1GHz CPU running two tasks on alternate opcodes. This is the absolute best-case scenario, and it is impossible in the real world, which is why SMP systems don't "add the MHz".
Now let's take the case of a P4 with HT enabled. Not only do both tasks have to share the same CPU MHz, they also have to share certain limited resources that I mentioned above, which is similar to splitting the L2 cache between the two tasks in our hypothetical P3 example.
So in reality, it would be analogous to two tasks, each running on a 500MHz P3 Celeron with only 128KB L2 for each task to use. Some tasks perform largely the same whether they run on a real P3 with 256KB L2 or on a P3 Celly with 128KB L2. Other tasks absolutely tank in performance, because they start to thrash the L2 cache.
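To put a rough number on the "thrash the L2" point, here is a toy sketch in Python. It is not a model of any real P3 or P4 cache: the fully associative LRU policy, the 64-byte line size, the looping access pattern, and the `miss_rate` helper are all assumptions made up for illustration. It only shows how a task whose working set fits in 256KB but not in 128KB goes from mostly hitting to missing on every access.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed cache-line size for this toy model


def miss_rate(cache_kb, working_set_kb, passes=4):
    """Miss rate of a fully associative LRU cache (an assumption, not real
    hardware) while a task loops over its working set `passes` times."""
    capacity = cache_kb * 1024 // LINE_BYTES      # lines the cache can hold
    lines = working_set_kb * 1024 // LINE_BYTES   # lines the task touches
    cache = OrderedDict()                         # oldest entry first
    misses = 0
    accesses = 0
    for _ in range(passes):
        for line in range(lines):
            accesses += 1
            if line in cache:
                cache.move_to_end(line)           # mark as most recently used
            else:
                misses += 1
                cache[line] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)     # evict least recently used
    return misses / accesses


# Small working set (96KB): fits either way, so halving the cache costs nothing.
print(miss_rate(256, 96), miss_rate(128, 96))     # 0.25 0.25

# Larger working set (192KB): fits in 256KB, thrashes in 128KB.
print(miss_rate(256, 192), miss_rate(128, 192))   # 0.25 1.0
```

The 0.25 figures are just the cold misses of the first of the four passes. The 1.0 is the thrashing case: with a looping access pattern and LRU eviction, each line gets pushed out just before it is needed again.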
So for those sorts of tasks, running on a P4 system with HT enabled can indeed cost a significant amount of performance compared with running on a P4 without HT enabled.
To those who would claim that there is essentially never a performance hit from enabling HT on the P4: would you also claim that there is essentially no performance difference when replacing a 1GHz P3 with a 1GHz P3 Celeron with half the L2? Or would you be willing to admit that some programs will indeed suffer in performance with that switch?