This is from the Mitosis PDF.
Identifying Live-ins
Identifying the live-ins of a speculative thread requires a top-down
traversal of its control-flow graph starting at the CQIP to
identify register and memory values read before being written by
the speculative thread. Each path is explored until a certain length.
This length represents the time that previous threads take to
compute and commit these values. This is because once the
previous thread commits, the speculative thread need no longer
rely on predicted values, but can read committed values. This time
is estimated as the time it takes to sequentially execute all the
code between the SP and CQIP minus the thread spawn overhead.
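The traversal described above can be sketched in Python. This is only an illustration under stated assumptions, not code from the Mitosis paper: the CFG layout (a dict from block name to its instructions and successors), the `Instr` record, the unit instruction costs, and both function names are made up for the example. SP is the spawning point and CQIP is the control quasi-independent point.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    """Hypothetical instruction record (not from the paper)."""
    reads: frozenset   # registers / memory locations read
    writes: frozenset  # registers / memory locations written
    cost: int = 1      # assumed execution time, must be positive


def exploration_length(sp_to_cqip_time, spawn_overhead):
    # Estimated time until the previous thread commits: sequential time
    # from SP to CQIP minus the thread spawn overhead.
    return max(0, sp_to_cqip_time - spawn_overhead)


def find_live_ins(cfg, cqip, max_length):
    """Collect values read before being written on any path from the CQIP.

    cfg maps a block name to (list of Instr, list of successor names).
    Each path is followed until max_length time units have been used up.
    Assumes every block holds at least one instruction with cost > 0,
    so that walks around loops in the CFG terminate.
    """
    live_ins = set()

    def walk(block, written, remaining):
        instrs, successors = cfg[block]
        written = set(written)  # copy, so sibling paths stay independent
        for ins in instrs:
            if remaining <= 0:
                return  # path budget exhausted: later reads see committed values
            live_ins.update(ins.reads - written)  # read before written
            written |= ins.writes
            remaining -= ins.cost
        for succ in successors:
            walk(succ, written, remaining)

    walk(cqip, set(), max_length)
    return live_ins
```

Called as `find_live_ins(cfg, "A", exploration_length(5, 2))`, it explores every path from block `"A"` for three time units. A larger budget explores longer paths and can only add live-ins, never remove them.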
The two things I want you to retain here are greater than or less than, and code length.
------------------------------------------------------------------------------------
Flexible and more compact bit fields are provided in the VEX prefix to retain the
full functionality provided by REX prefix. REX.W, REX.X, REX.B functionalities are
provided in the three-byte VEX prefix only because only a subset of SIMD instructions
need them.
I hope this helps the people with reading comprehension problems. If not, oh well!
This defines the VEX prefix. This is a p-slice. There is still one more example that's in simpler terms, but that's for later.
_________________________________________________________________________________________________________________________________________________
This is from the Mitosis PDF.
value prediction. The
synchronization approach imposes a high overhead when
dependences are frequent, as in the workload presented here.
Value prediction has more potential – if the values that are
computed by one thread and consumed by another can be
predicted, the consumer thread can be executed in parallel with
the producer thread since these values are only needed for
validation at a later stage. It is typically assumed that these value
predictions are computed in hardware. The Mitosis system
presents a novel approach, which adds code (derived from the
original program) to predict in software the live-ins (values
consumed, but not produced by, the thread) for each speculative
thread. Because mechanisms for recovery of incorrect threads are
already in place, the code to produce the values need not always
be correct, and can be highly optimized. We refer to this code as
pre-computation slices (p-slices). The main advantages of p-slices
are: (1) they are potentially more accurate in the prediction of
live-ins than a hardware-based predictor, since it is derived from
the original code, (2) they can encapsulate multiple control flows
that contribute to the prediction of live-ins, and (3) they can
accelerate the detection of incorrectly spawned threads.
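A toy Python sketch of the p-slice idea in the passage above. Everything in it is an assumption made for illustration (the `body`, `p_slice`, `consumer` and `run_speculatively` names, the state layout, the arithmetic), and the threads are simulated one after the other rather than run in parallel. It only shows the shape of the scheme: a reduced copy of the original code predicts the live-in, the consumer runs on the prediction, and the result is validated once the producer has finished.

```python
def body(state):
    """Original code between the spawning point and the CQIP (the producer)."""
    x, acc = state["x"], state["acc"]
    acc += sum(i * x for i in range(1000))  # expensive work the consumer never reads
    x = x + 2 if x < 100 else 0             # x is the live-in of the next thread
    return {"x": x, "acc": acc}


def p_slice(state):
    """Reduced copy of body() that keeps only what feeds the live-in x.

    The rarely taken x >= 100 path is pruned, so the slice is cheap but
    can be wrong. That is acceptable because misspeculation is recovered.
    """
    return {"x": state["x"] + 2}


def consumer(x):
    """Stand-in for the speculative thread's work (starts at the CQIP)."""
    return x * 10


def run_speculatively(state):
    predicted = p_slice(state)
    speculative_result = consumer(predicted["x"])  # would overlap with body()
    actual = body(state)                           # producer finishes later
    if actual["x"] == predicted["x"]:
        return speculative_result, True            # prediction validated: commit
    return consumer(actual["x"]), False            # squash and re-execute
```

With `{"x": 4, "acc": 0}` the prediction matches and the speculative result is committed. With `x` at 100 the pruned branch is taken in the original code, so the prediction fails validation and the consumer is re-executed with the committed value.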
Unfortunately, one of my posts that really ties it together was removed. I did, however, copy that page to CD, as I did all these pages. When my wife gets home I will ask her what folder she put it in, as it's not in the index yet on the research computer.
________________________________________________________________________________________________________________________________________
Other than the missing post, this is the last post in this reply. Note that it shows the REX prefix code inside the VEX prefix. That is a p-slice when used with the compiler.
3-byte VEX prefix
The four bits R,X,B,W contained in the REX prefix used in the x86-64 instruction set extension.
Two bits named pp to replace operand size prefixes and operand type prefixes (66, F2, F3).
A bit named L specifying 256 bit vector length.
Four bits named vvvv specifying a second source register operand.
Five bits named m-mmmm. Two of the m bits are used for replacing existing escape codes and for specifying the length of the instruction. The remaining three m bits are reserved for future use, such as specifying vector lengths > 256 bits, specifying different instruction lengths, or extending the opcode space.
The 2-byte VEX prefix contains a subset of these components and can be used in cases where not all components are needed.
The encoding is as follows:
            First byte        Second byte       Third byte
            7 6 5 4 3 2 1 0   7 6 5 4 3 2 1 0   7 6 5 4 3 2 1 0
3-byte VEX  1 1 0 0 0 1 0 0   R̅ X̅ B̅ m m m m m   W v̅ v̅ v̅ v̅ L p p
2-byte VEX  1 1 0 0 0 1 0 1   R̅ v̅ v̅ v̅ v̅ L p p
The R̅, X̅ and B̅ bits are equivalent to the REX prefix's R, X and B bits, providing a fourth register number bit for each of the three registers referenced by a standard x86 instruction: the register operand, and the index and base registers for the memory operand. The v̅ bits specify an additional source register, or are set to all-ones if not used. All of these bits are complemented in the instruction stream, so they are encoded as 1 bits in 32-bit mode.
The VEX opcode bytes are the same as those used by the LDS and LES instructions. These instructions are not supported in 64-bit mode, while in 32-bit mode, the following "mod R/M" byte cannot be of the form "11xxxxxx" (which would specify a register operand). The bit inversion ensures that the second byte of a VEX prefix is always of this form in 32-bit mode.
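The rule in the paragraph above can be written as a small check. This is a sketch in Python, not taken from any real disassembler; the function name `is_vex_prefix` and the `long_mode` flag are assumptions made for the example.

```python
def is_vex_prefix(first: int, second: int, long_mode: bool) -> bool:
    """Tell a VEX prefix apart from LES (C4) / LDS (C5).

    first and second are the first two bytes at the decode position.
    long_mode is True when decoding 64-bit code.
    """
    if first not in (0xC4, 0xC5):
        return False
    if long_mode:
        # LDS and LES do not exist in 64-bit mode, so C4/C5 is always VEX.
        return True
    # In 32-bit mode, LDS/LES need a memory operand, so their mod R/M byte
    # can never start with binary 11. A VEX second byte always does.
    return (second & 0xC0) == 0xC0
```

For example, `is_vex_prefix(0xC5, 0xF4, long_mode=False)` is true because `0xF4` has its top two bits set, while `is_vex_prefix(0xC5, 0x00, long_mode=False)` is false and the bytes would be decoded as LDS.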
The W bit is equivalent to the REX prefix's W bit, and specifies a 64-bit operand. For non-integer instructions, it is a general opcode extension bit.
The 5 m bits replace leading opcode bytes. The values 1, 2 and 3 are equivalent to opcodes 0F, 0F 38 and 0F 3A; all other values are currently reserved. (The 2-byte VEX prefix always corresponds to a 0F prefix.)
The L bit indicates the vector length. It is 0 for 128-bit SSE (xmm) registers, and 1 for 256-bit AVX (ymm) registers.
The p bits encode additional prefix bytes. The 4 possible values are none, 66, F3, and F2. These encode the operand type for SSE instructions: packed single, packed double, scalar single and scalar double, respectively.
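The field layout described above can be pulled apart with a few shifts and masks. The following Python sketch is only an illustration; the function name `decode_vex`, the returned dict keys, and the two lookup tables are assumptions for the example. It also assumes the bytes are already known to be a VEX prefix, which in 32-bit mode needs the LDS/LES check first.

```python
OPCODE_MAP = {1: "0F", 2: "0F 38", 3: "0F 3A"}       # m-mmmm values in use
SIMD_PREFIX = {0: None, 1: "66", 2: "F3", 3: "F2"}   # pp values


def decode_vex(data: bytes) -> dict:
    """Decode a 2- or 3-byte VEX prefix at the start of data."""
    if len(data) < 2:
        raise ValueError("need at least two bytes")
    if data[0] == 0xC4:                      # 3-byte form
        if len(data) < 3:
            raise ValueError("truncated 3-byte VEX prefix")
        b1, last = data[1], data[2]
        r = ((b1 >> 7) & 1) ^ 1              # R, X, B are stored inverted
        x = ((b1 >> 6) & 1) ^ 1
        b = ((b1 >> 5) & 1) ^ 1
        mmmmm = b1 & 0x1F
        w = (last >> 7) & 1
        length = 3
    elif data[0] == 0xC5:                    # 2-byte form
        last = data[1]
        r = ((last >> 7) & 1) ^ 1
        x = b = w = 0                        # not encodable in this form
        mmmmm = 1                            # always the 0F map
        length = 2
    else:
        raise ValueError("not a VEX prefix")
    return {
        "length": length,
        "R": r, "X": x, "B": b, "W": w,
        "map": OPCODE_MAP.get(mmmmm, "reserved"),
        "vvvv": (~(last >> 3)) & 0xF,        # stored inverted; 1111 means unused
        "vector_bits": 256 if (last >> 2) & 1 else 128,
        "simd_prefix": SIMD_PREFIX[last & 3],
    }
```

As a usage example, `decode_vex(bytes.fromhex("C5F458C2"))` (the start of `vaddps ymm0, ymm1, ymm2`) reports a 2-byte prefix with `vvvv` equal to 1, a 256-bit vector length and no SIMD prefix.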
Instructions that need more than three operands have an extra suffix byte specifying one or two additional register operands. Instructions coded with the VEX prefix can have up to five operands. At most one of the operands can be a memory operand; and at most one of the operands can be an immediate constant of 4 or 8 bits. The remaining operands are registers.
The AVX instruction set is the first instruction set extension to use the VEX coding scheme. The AVX instructions have up to four operands. The AVX instruction set allows the VEX prefix to be applied only to instructions using the SIMD XMM registers. However, the VEX coding scheme has space for applying the VEX prefix to other instructions as well in future instruction sets.
Legacy instructions with a VEX prefix added are equivalent to the same instructions without VEX prefix with the following differences:
The VEX-encoded instruction can have one more operand, making it non-destructive.
A 128-bit XMM instruction without VEX prefix leaves the upper half of the full 256-bit YMM register unchanged, while the VEX-encoded version sets the upper half to zero.
Instructions that use the whole 256-bit YMM register should not be mixed with non-VEX instructions that leave the upper half of the register unchanged, for reasons of efficiency.