Besides benchmarking, is there a program that can utilize all of a Core i7?


IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Yes, that works differently from the Pentium 4 and Core i7 that I was describing.
I believe it's a case of time-based multiplexing. In a nutshell: on even cycles it feeds instructions from thread 0, and on odd cycles it feeds instructions from thread 1 (Larrabee also does this, but 4-way).

Atom has the same multi-threading technology as the Pentium 4 and the Core ix. It works better on the Atom because Hyper-Threading can act like out-of-order execution in some cases and reduce stalls.
 

Acanthus

Lifer
Aug 28, 2001
19,915
2
76
ostif.org
I run at 85%-90% on all 4 cores in Nero Vision.

I'm bottlenecked by the disk subsystem.

If I had a larger RAID array or a high-end SSD, I could peg it at 100%.
 

Scali

Banned
Dec 3, 2004
2,495
0
0
Atom has the same multi-threading technology as the Pentium 4 and the Core ix. It works better on the Atom because Hyper-Threading can act like out-of-order execution in some cases and reduce stalls.

Atom can't have the same technology as P4 and Core ix because it doesn't have out-of-order execution and therefore doesn't have a buffer to store micro-ops, and that buffer is where HT is applied on P4/Core. I already explained how it works anyway, so I'm not going to repeat myself.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,261
16,119
136
I use all of my cores, 100% of the time, with F@H.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Atom can't have the same technology as P4 and Core ix because it doesn't have out-of-order execution and therefore doesn't have a buffer to store micro-ops, and that buffer is where HT is applied on P4/Core.

OK, you might be right about the cycle-switch-based multi-threading, but what you said doesn't make 100% sense to me. You are talking about the reorder buffer, right? Well, SMT on Core i7 and Netburst does not replicate reorder buffer resources. Care to expand on that?
 

Scali

Banned
Dec 3, 2004
2,495
0
0
OK, you might be right about the cycle-switch-based multi-threading, but what you said doesn't make 100% sense to me. You are talking about the reorder buffer, right? Well, SMT on Core i7 and Netburst does not replicate reorder buffer resources. Care to expand on that?

You don't need two reorder buffers for two threads.
You have a single execution pipeline, so you only need one buffer for feeding and retiring micro-ops.
As long as the registers are replicated for both threads, the results of each instruction will automatically be stored in the context of the proper thread. (And not even literally replicated: because we have register renaming, you technically don't need more physical registers, you just need the register renaming to be thread-aware.)
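
To illustrate the thread-aware renaming idea, here is a minimal toy sketch in Python. It is not how any Intel core actually implements renaming; the class name `RenameTable`, its methods, and the register counts are all made up for illustration. It only shows that keying the rename map on (thread, architectural register) lets two threads share one pool of physical registers without clobbering each other.

```python
class RenameTable:
    """Toy thread-aware register renamer (illustrative only, not a real design)."""

    def __init__(self, num_physical):
        # One shared pool of physical registers for all threads.
        self.free = list(range(num_physical))
        # Map (thread_id, architectural register) -> physical register.
        self.mapping = {}

    def rename_dest(self, thread_id, arch_reg):
        """Allocate a fresh physical register for a write by this thread."""
        if not self.free:
            raise RuntimeError("no free physical registers (front end would stall)")
        phys = self.free.pop(0)
        self.mapping[(thread_id, arch_reg)] = phys
        return phys

    def lookup_src(self, thread_id, arch_reg):
        """Find the physical register holding this thread's view of arch_reg."""
        return self.mapping[(thread_id, arch_reg)]


rt = RenameTable(num_physical=8)
p0 = rt.rename_dest(0, "eax")   # thread 0 writes eax
p1 = rt.rename_dest(1, "eax")   # thread 1 writes eax
# Same architectural name, different physical registers, so results
# land in the context of the proper thread.
print(p0, p1, rt.lookup_src(0, "eax"), rt.lookup_src(1, "eax"))
```

A real renamer would also return registers to the free list when instructions retire; that is left out here to keep the sketch short.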
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
I know that.

You said that due to the lack of OoOE, the Atom doesn't need a buffer to store uops, and that this is related to SMT.

Since the reorder buffer is not replicated, which buffer are you talking about?
 

Scali

Banned
Dec 3, 2004
2,495
0
0
I know that.

You said that due to the lack of OoOE, the Atom doesn't need a buffer to store uops, and that this is related to SMT.

Since the reorder buffer is not replicated, which buffer are you talking about?

I think you're reading more into what I said than what I meant.
I said that the SMT implementation of P4/Core ix is based on the uop buffer (THE, as in one; I never claimed that any of that hardware had to be replicated). Two threads can feed instructions into the buffer at the same time, and the rest is taken care of as usual.
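
As a rough illustration of that description, here is a toy Python model of two threads feeding one shared uop buffer. Everything in it is an assumption made for the sketch: the function name `shared_buffer_issue`, the `capacity` and `width` parameters, and the `ready_at` field, which stands in for "this uop's operands are available from this cycle on". It ignores dependencies and retirement entirely.

```python
from collections import deque


def shared_buffer_issue(streams, capacity=8, width=2):
    """Toy model: several threads feed ONE shared uop buffer.

    streams: per-thread lists of (name, ready_at) uops in program order.
    Returns a list of (cycle, [(thread_id, name), ...]) issue records.
    """
    pending = [deque(s) for s in streams]
    buffer = []   # single shared buffer of (thread_id, name, ready_at)
    trace = []
    cycle = 0
    while buffer or any(pending):
        # Front end: each thread may insert one uop per cycle while there is room.
        for tid, queue in enumerate(pending):
            if queue and len(buffer) < capacity:
                name, ready_at = queue.popleft()
                buffer.append((tid, name, ready_at))
        # Scheduler: issue up to `width` ready uops, oldest first, from any thread.
        issued = [u for u in buffer if u[2] <= cycle][:width]
        for u in issued:
            buffer.remove(u)
        trace.append((cycle, [(tid, name) for tid, name, _ in issued]))
        cycle += 1
    return trace


# Thread 0 is stalled until cycle 3; thread 1 has work that is ready right away.
for record in shared_buffer_issue([[("a0", 3)], [("b0", 0), ("b1", 0)]]):
    print(record)
```

In this run thread 1's uops issue while thread 0's uop is still waiting in the same buffer, which is the point of the scheme: the scheduler does not care which thread a ready uop came from.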

Since Atom doesn't work with a uop buffer, but feeds instructions directly from the decoder into the pipeline, you need to implement SMT differently (it doesn't NEED to work with a uop buffer, that's just how the P4/Core ix implementation works).
They did this by having the threads decode and feed instructions to the pipeline in a time-multiplexed way. This way you don't need to buffer instructions, and you can continue to use a simple in-order pipeline.
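
For comparison, the time-multiplexed scheme can be sketched in a few lines of Python. This is a deliberately simplified model with strict cycle-by-cycle alternation; the function name `time_multiplexed_issue` is made up, and a real in-order core would presumably handle stalls and idle threads in ways this sketch does not.

```python
from collections import deque


def time_multiplexed_issue(streams):
    """Toy model: threads take turns feeding an in-order pipeline.

    streams: per-thread lists of instruction names in program order.
    Cycle N belongs to thread N % number_of_threads (even/odd for two threads).
    Returns a list of (cycle, thread_id, instruction or None).
    """
    queues = [deque(s) for s in streams]
    trace = []
    cycle = 0
    while any(queues):
        tid = cycle % len(queues)
        # No buffering: the slot is simply empty if this thread has nothing to feed.
        instr = queues[tid].popleft() if queues[tid] else None
        trace.append((cycle, tid, instr))
        cycle += 1
    return trace


for record in time_multiplexed_issue([["a0", "a1"], ["b0"]]):
    print(record)
```

Passing four streams instead of two gives the 4-way variant of the same alternation.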