Originally posted by: Lithan
Pulled off aces hardware. Daniel Rwizi author.
"> Last, I want to briefly address the topic in my heading--multithreaded
<snip>
so are you agreeing with him or disagreeing with him? i ask because, while i don't agree with every conclusion he's made, his post basically agrees with what i've been saying (while you, of course, have been disagreeing with me the entire time without any substance):
"the HT circuitry increases the P4's per-clock working efficiency in the advent of multithreaded software, the performance within the multithreaded application improves as the internal per-clock thread multitasking performance of the cpu improves."
"The difference is that the A64's fundamental architecture is designed to run everything at maximum per-clock efficiency all the time and doesn't conditionally and functionally speed up per-clock multithreaded multitasking or slow down per clock while running single threads exclusively, as happens in the P4 HT-enabled cpu."
i'm surprised the first post containing any substance reaches the same basic conclusion i've reached.
one further point to consider: at no time did i ever state HT was better than or equal to a dual core/dual cpu setup. i also agree with him that,
"Dual cores from AMD and Intel are just far better SMT strategies than something like P4 HT ever was.."
although that's not what this discussion is all about. he makes another fundamentally sound statement when he writes,
"In short, two physical cpu cores are always much better than a single physical core & a logical core."
but that doesn't preclude the point of this thread, which is that a single physical core & a logical core is better than a single physical core alone, at least in the context of the p4, as it has wasted pipeline resources which can be used to increase its computational power when running more than one task. again i quote his post,
"the HT circuitry increases the P4's per-clock working efficiency in the advent of multithreaded software, the performance within the multithreaded application improves as the internal per-clock thread multitasking performance of the cpu improves."
another important point which he doesn't mention is that running a second thread doesn't necessarily have a large impact on the first task either (tho this is where i agree with him on inconsistencies; it depends on the app and how it uses the cpu cycles).
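to make the "wasted pipeline resources" argument concrete, here's a toy model in python. it is only a sketch under made-up assumptions, not a simulation of the real p4: the core has a fixed number of issue slots per cycle, each thread has some number of instructions ready each cycle, anything that doesn't issue is simply dropped, and first pick of the slots alternates between the two threads every cycle. the function names and the numbers are invented for illustration.

```python
# Toy model of sharing issue slots between two threads on one core.
# Everything here (slot width, per-cycle "ready" counts, alternating
# priority) is an illustrative assumption, not P4 behaviour.

def issued_alone(width, ready):
    """Instructions issued when one thread has the core to itself."""
    return sum(min(width, r) for r in ready)

def issued_shared(width, ready_a, ready_b):
    """Instructions issued per thread when two threads share the slots.

    First pick of the slots alternates each cycle: thread A on even
    cycles, thread B on odd cycles. The other thread fills what is left.
    """
    total_a = total_b = 0
    for cycle, (a, b) in enumerate(zip(ready_a, ready_b)):
        if cycle % 2 == 0:
            take_a = min(width, a)
            take_b = min(width - take_a, b)
        else:
            take_b = min(width, b)
            take_a = min(width - take_b, a)
        total_a += take_a
        total_b += take_b
    return total_a, total_b

WIDTH = 3                      # hypothetical issue slots per cycle
READY_A = [1, 0, 2, 0, 1, 3]   # hypothetical ready instructions, thread A
READY_B = [2, 1, 0, 3, 1, 1]   # hypothetical ready instructions, thread B

if __name__ == "__main__":
    print("A alone:", issued_alone(WIDTH, READY_A))
    print("A and B shared:", issued_shared(WIDTH, READY_A, READY_B))
```

with these made-up numbers, thread A issues 7 instructions when it has the core to itself and 6 when sharing, while the two threads together issue 14 in the same six cycles. so the first task loses a little and total work per clock goes up; how much of either you see depends entirely on the app.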
of course, multithreaded software is much more prevalent than it was a year or two ago. at any rate, dual cores will render HT passé (well, at least when considering NEW components, and depending on the price of dual cores of course), and smp is a far better solution than a single core pretending to be 2, but in the meantime, it's quite obvious the p4's architecture does offer some benefits (which will affect some more than others).
also, would be nice if you provided a link so we could easily see the comments others have made regarding his conclusions.
thanks for the post tho; interesting read.
ahh.. i did find the start of that thread.. gonna start reading it now. for those interested, you can find it HERE
a couple points since following that thread. first, the author was not Daniel Rwizi, rather it was WaltC (or perhaps his real name is Daniel? at any rate, that was the name penned on the original post).
the other point was a post in response to a fundamental inaccuracy in WaltC's post:
"P4 SMT does run two threads at once. It may fetch from alternate threads each cycle, but it issues, dispatches, executes, and retires instructions from multiple threads each cycle."
while the last part of his statement about retirement seems erroneous (it was stated in a followup that "P4 retirement alternates between logical processors"), his basic argument is sound.
good stuff tho...