
HyperThreading - Decoder/Instruction Cache Being a Bottleneck?

YossI

Member
I have read the article about Intel's HyperThreading. Very interesting indeed.
However, it seems to me that there may be some limiting factors in utilizing HyperThreading on the current P4:

1. Decoder - I've heard that the x86 decoder on the P4 can decode in a pattern called 4-1-1 (4 instructions, two micro-ops?). If so, it might not utilize the full power of the 7 execution units?

2. Instruction cache - As I understand it, everything decoded by the P4 decoder has to be stored in the limited (12K?) instruction cache before it goes on through the rest of the pipeline. Wouldn't that be another problem when dealing with a large number of instructions?

Hmm... should this post be in HT? I don't know... no one seems to answer it in General Hardware.

I hope I'll get a few answers, insights, or the like.
Thank you.
 


<< I have read the article about Intel's HyperThreading. Very interesting indeed.
However, it seems to me that there may be some limiting factors in utilizing HyperThreading on the current P4:

1. Decoder - I've heard that the x86 decoder on the P4 can decode in a pattern called 4-1-1 (4 instructions, two micro-ops?). If so, it might not utilize the full power of the 7 execution units?
>>



In fact, the Pentium(R) 4 processor has only a single scalar decoder.
4-1-1 is the decode pattern of the P6 family (Pentium Pro, Pentium II, Pentium III). On the Pentium 4, the decoder does work only when the trace cache misses.
P.S. If both threads have lots of dependent instructions and some loads miss the L2 cache, performance will be bad... The memory wall is one of the most important considerations now.
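
For anyone unsure what the P6 4-1-1 pattern means in practice, here is a rough toy model in Python. It is only a sketch under simplifying assumptions, not a description of the real hardware: the first decode slot is assumed to take any instruction of up to 4 micro-ops, the other two slots take single-micro-op instructions only, decoding is strictly in order, and microcoded instructions are ignored. The function name and the input format (a list of micro-op counts per instruction) are made up for illustration.

```python
def decode_cycles_411(uop_counts):
    """Count decode cycles for an in-order stream under a toy 4-1-1 template.

    uop_counts: list with the number of micro-ops each x86 instruction
    expands to (hypothetical input format, chosen for this sketch).
    """
    cycles = 0
    i = 0
    n = len(uop_counts)
    while i < n:
        if uop_counts[i] > 4:
            # Real P6 parts hand these to a microcode sequencer; not modeled here.
            raise ValueError("toy model ignores instructions of more than 4 micro-ops")
        i += 1  # slot 0 accepts any instruction of up to 4 micro-ops
        for _ in range(2):
            # slots 1 and 2 accept single-micro-op instructions only
            if i < n and uop_counts[i] == 1:
                i += 1
            else:
                break  # a multi-micro-op instruction waits for slot 0 next cycle
        cycles += 1
    return cycles
```

In this model a stream of simple instructions decodes three per cycle, while a run of two-micro-op instructions drops to one per cycle, which is the kind of front-end limit the original question is asking about.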



<<
2. Instruction cache - As I understand it, everything decoded by the P4 decoder has to be stored in the limited (12K?) instruction cache before it goes on through the rest of the pipeline. Wouldn't that be another problem when dealing with a large number of instructions?
>>



Yes.
The Pentium(R) 4 processor employs a very interesting structure called the "trace cache". This cache stores macroinstructions in their decoded form, that is, as micro-ops.
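
To make the "decoder only runs on a miss" idea concrete, here is a minimal Python sketch of a trace-cache-like structure. Everything in it is an assumption for illustration: the class name, the `decode` callback, keying traces by start address, and the LRU eviction policy are all hypothetical and not taken from Intel documentation. The default capacity uses the 12K figure from the question, counted in micro-ops.

```python
from collections import OrderedDict


class ToyTraceCache:
    """Toy cache of decoded micro-op traces, keyed by start address."""

    def __init__(self, capacity_uops=12 * 1024):
        self.capacity_uops = capacity_uops
        self.used_uops = 0
        self.traces = OrderedDict()  # start address -> tuple of micro-ops
        self.decoder_calls = 0       # how often the slow decoder had to run

    def fetch(self, addr, decode):
        """Return micro-ops for addr, calling decode(addr) only on a miss."""
        if addr in self.traces:
            self.traces.move_to_end(addr)  # mark as most recently used
            return self.traces[addr]

        # Miss: the decoder has to do the work.
        self.decoder_calls += 1
        uops = tuple(decode(addr))
        if len(uops) > self.capacity_uops:
            return uops  # too large to cache in this toy model

        # Evict least recently used traces until the new one fits (assumed policy).
        while self.used_uops + len(uops) > self.capacity_uops:
            _, evicted = self.traces.popitem(last=False)
            self.used_uops -= len(evicted)

        self.traces[addr] = uops
        self.used_uops += len(uops)
        return uops
```

With a working set that fits in the cache, `decoder_calls` stops growing after the first pass, so the single decoder is off the critical path. Once the working set exceeds the capacity, traces get evicted and re-decoded, which is the situation the question worries about.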



<<
Hmm... should this post be in HT? I don't know... no one seems to answer it in General Hardware.
>>



Yes.
🙂
 