I was careful in my original statement. Optimizing for the Pentium pipeline would not affect compatibility with any older processor. This is different from using new instructions (like MMX or SSE), which would fail on hardware that lacks them.
Example:
Non-optimized code, written in "thought order":
MOV EAX,[1234]
ADD EAX,EBX
LEA EDX,[EBP+2222]
MOV [EDX],EAX
MOV EBX,[6789]
....
The last instruction is part of a new sequence from a different high-level language statement. This would be better as:
MOV EAX,[1234]
ADD EAX,EBX
LEA EDX,[EBP+2222]
MOV EBX,[6789]
MOV [EDX],EAX
....
because in the first example the pipeline stalls on MOV [EDX],EAX while it waits for the result of the LEA instruction (and maybe even the ADD instruction). Moving MOV EBX,[6789] up puts an independent instruction between the LEA and the store that uses its result, which avoids the stall without penalty on any CPU.
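To make the dependency argument concrete, here is a rough Python sketch that counts how often an instruction uses, inside a memory address, a register written by the instruction immediately before it. It is a toy model under my own assumptions, not a cycle-accurate Pentium simulation: the function names are made up for illustration, it only understands the simple "OP dest,src" form used above, and it ignores instruction pairing entirely.

```python
import re

# Toy model only: flags an instruction whose [address] expression uses a
# register that the immediately preceding instruction wrote. This is an
# illustrative assumption, not how any real CPU is documented to behave.

def written_register(instr):
    """Return the destination register, or None if the destination is memory."""
    _mnemonic, operands = instr.split(None, 1)
    dest = operands.split(",")[0].strip()
    return None if dest.startswith("[") else dest

def address_registers(instr):
    """Return the set of 32-bit registers used inside [...] expressions."""
    regs = set()
    for expr in re.findall(r"\[([^\]]*)\]", instr):
        regs.update(re.findall(r"\bE[A-Z]{2}\b", expr))
    return regs

def count_address_stalls(program):
    """Count adjacent pairs where the second needs the first's result as an address."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        dest = written_register(prev)
        if dest is not None and dest in address_registers(cur):
            stalls += 1
    return stalls

thought_order = [
    "MOV EAX,[1234]",
    "ADD EAX,EBX",
    "LEA EDX,[EBP+2222]",
    "MOV [EDX],EAX",
    "MOV EBX,[6789]",
]

reordered = [
    "MOV EAX,[1234]",
    "ADD EAX,EBX",
    "LEA EDX,[EBP+2222]",
    "MOV EBX,[6789]",
    "MOV [EDX],EAX",
]

print(count_address_stalls(thought_order))  # LEA EDX followed directly by [EDX]
print(count_address_stalls(reordered))      # independent MOV sits in between
```

Under this simplified model the first ordering has one such back-to-back dependency and the reordered one has none, which is the whole point of moving the unrelated load up.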
A friend suggested that possible problems with code optimization, perhaps only in obscure circumstances, could have led the OS development group to forgo optimization altogether.
To comment on other posts in the thread: this lack of optimization definitely hurts OS performance.