Nemesis 1
Lifer
- Dec 30, 2006
These are the basics of Intel's VEX prefix:
The VEX prefix and VEX coding scheme is a proposed future extension to the x86 instruction set architecture for microprocessors from Intel, AMD and others.
Features
The proposed VEX coding scheme extends the existing x86 instruction set architecture to allow the definition of new instructions and the extension or modification of previously existing instruction codes. This serves the following purposes:
The opcode map is extended to make space for future instructions.
It allows instruction codes to have up to five operands, where the original scheme allows only two operands (in rare cases three operands).
It allows the size of SIMD vector registers to be extended from the 128-bit XMM registers to 256-bit registers named YMM. There is room for further extensions of the register size in the future.
It allows existing two-operand instructions to be modified into non-destructive three-operand forms where the destination register is different from both source registers. For example c:=a+b instead of a:=a+b (where register a is changed by the instruction).
Technical description
The proposed VEX coding scheme uses a code prefix consisting of 2 or 3 bytes, which is added to existing or new instruction codes[1].
The VEX prefix replaces the most commonly used instruction prefix bytes and escape codes. In many cases, the number of prefix bytes and escape bytes that are replaced is the same as the number of bytes in the VEX prefix, so that the total length of the VEX-encoded instruction is the same as the length of the legacy instruction code. In other cases, the VEX-encoded version is longer or shorter than the legacy code.
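To make the length comparison concrete, here is a small Python sketch. The byte sequences are hand-assembled by me and are not taken from the quoted article, so treat them as illustrative assumptions; the name `length_delta` is made up for this example. It shows one case each where the VEX form is longer than, equal to, and shorter than the legacy form.

```python
# Hand-assembled encodings (my assumption, not from the quoted article).
# Each entry: (operation, legacy SSE bytes, VEX-encoded bytes).
EXAMPLES = [
    # No legacy prefix to absorb: the 2-byte VEX prefix replaces only the 0F escape.
    ("addps xmm0, xmm1", bytes.fromhex("0F58C1"), bytes.fromhex("C5F858C1")),
    # 66 prefix + 0F escape are both absorbed by the 2-byte VEX prefix.
    ("addpd xmm0, xmm1", bytes.fromhex("660F58C1"), bytes.fromhex("C5F958C1")),
    # 66 prefix + REX prefix + 0F escape are all absorbed by the 2-byte VEX prefix.
    ("addpd xmm8, xmm1", bytes.fromhex("66440F58C1"), bytes.fromhex("C53958C1")),
]


def length_delta(legacy: bytes, vex: bytes) -> int:
    """Return len(vex) - len(legacy); a positive value means the VEX form is longer."""
    return len(vex) - len(legacy)


for name, legacy, vex in EXAMPLES:
    print(f"{name:18} legacy={len(legacy)} bytes  VEX={len(vex)} bytes  "
          f"delta={length_delta(legacy, vex):+d}")
```

The VEX versions are the three-operand forms with the destination repeated as the first source (for example vaddps xmm0, xmm0, xmm1).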
The 3-byte VEX prefix contains the following components:
The four bits R,X,B,W contained in the REX prefix used in the x86-64 instruction set extension.
Two bits named pp to replace operand size prefixes and operand type prefixes (66, F2, F3).
A bit named L specifying 256-bit vector length.
Four bits named vvvv specifying a second source register operand.
Five bits named m-mmmm. Two of the m bits are used for replacing existing escape codes and for specifying the length of the instruction. The remaining three m bits are reserved for future use, such as specifying vector lengths > 256 bits, specifying different instruction lengths, or extending the opcode space.
The 2-byte VEX prefix contains a subset of these components and can be used in cases where not all components are needed.
The encoding is as follows:
            | First byte      | Second byte     | Third byte
Bit         | 7 6 5 4 3 2 1 0 | 7 6 5 4 3 2 1 0 | 7 6 5 4 3 2 1 0
3-byte VEX  | 1 1 0 0 0 1 0 0 | R̅ X̅ B̅ m m m m m | W v̅ v̅ v̅ v̅ L p p
2-byte VEX  | 1 1 0 0 0 1 0 1 | R̅ v̅ v̅ v̅ v̅ L p p |
The R̅, X̅ and B̅ bits are equivalent to the REX prefix's R, X and B bits, providing a fourth register number bit for each of the three registers referenced by a standard x86 instruction: the register operand, and the index and base registers for the memory operand. The v̅ bits specify an additional source register, or are set to all-ones if not used. All of these bits are complemented in the instruction stream, so they are encoded as 1 bits in 32-bit mode.
The VEX opcode bytes (C4 and C5) are the same as those used by the LDS and LES instructions. These instructions are not supported in 64-bit mode, while in 32-bit mode, the following "mod R/M" byte cannot be of the form "11xxxxxx" (which would specify a register operand). The bit inversion ensures that the second byte of a VEX prefix is always of this form in 32-bit mode, so the two can be told apart.
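The 32-bit disambiguation described above comes down to a test on the top two bits of the second byte. A minimal sketch, assuming the caller already knows the CPU is in 32-bit mode; the function name is hypothetical:

```python
def is_vex_in_32bit_mode(byte0: int, byte1: int) -> bool:
    """Return True if (byte0, byte1) begins a VEX prefix rather than LES/LDS.

    Only meaningful in 32-bit mode. In 64-bit mode LES/LDS do not exist,
    so C4/C5 always introduce a VEX prefix there.
    """
    if byte0 not in (0xC4, 0xC5):
        return False
    # LES/LDS need a memory operand, so their ModRM byte never has mod == 11.
    # A VEX prefix in 32-bit mode always has its top two bits set.
    return (byte1 & 0xC0) == 0xC0


print(is_vex_in_32bit_mode(0xC5, 0xF8))  # second byte 11111000: VEX
print(is_vex_in_32bit_mode(0xC4, 0x06))  # second byte 00000110: LES with a memory operand
```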
The W bit is equivalent to the REX prefix's W bit, and specifies a 64-bit operand. For non-integer instructions, it is a general opcode extension bit.
The 5 m bits replace leading opcode bytes. The values 1, 2 and 3 are equivalent to opcodes 0F, 0F 38 and 0F 3A; all other values are currently reserved. (The 2-byte VEX prefix always corresponds to a 0F prefix.)
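The bit layout from the table can be unpacked mechanically. The following Python sketch is one possible decoder, not a reference implementation: `decode_vex` and the dictionary keys are names I made up, and it assumes the bytes are already known to be a VEX prefix (it does not do the 32-bit LDS/LES check). The inverted bits are flipped back so the returned values are the true register-extension bits.

```python
ESCAPES = {1: "0F", 2: "0F 38", 3: "0F 3A"}  # m-mmmm values listed in the text


def decode_vex(code: bytes) -> dict:
    """Split a 2- or 3-byte VEX prefix into its fields (true, non-inverted values)."""
    if code[0] == 0xC5:
        # 2-byte form: bit 7 of the second byte is the inverted R bit;
        # X, B and W are implied 0 and the escape is implied 0F.
        last, size = code[1], 2
        r, x, b, w, mmmmm = (last >> 7) ^ 1, 0, 0, 0, 1
    elif code[0] == 0xC4:
        # 3-byte form: inverted R/X/B and m-mmmm in byte 1, W in bit 7 of byte 2.
        mid, last, size = code[1], code[2], 3
        r = ((mid >> 7) & 1) ^ 1
        x = ((mid >> 6) & 1) ^ 1
        b = ((mid >> 5) & 1) ^ 1
        mmmmm = mid & 0x1F
        w = (last >> 7) & 1
    else:
        raise ValueError("not a VEX prefix")
    return {
        "length": size,
        "R": r, "X": x, "B": b, "W": w,
        "mmmmm": mmmmm,
        "escape": ESCAPES.get(mmmmm, "reserved"),
        "vvvv": (~(last >> 3)) & 0xF,  # stored inverted, so flip it back
        "L": (last >> 2) & 1,
        "pp": last & 0x3,
    }


print(decode_vex(bytes.fromhex("C4E279")))
print(decode_vex(bytes.fromhex("C539")))
```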
The L bit indicates the vector length. It is 0 for 128-bit SSE (xmm) registers, and 1 for 256-bit AVX (ymm) registers.
The p bits encode additional prefix bytes. The 4 possible values are none, 66, F3, and F2. These encode the operand type for SSE instructions: packed single, packed double, scalar single and scalar double, respectively.
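Going the other way, the fields described above can be packed into a prefix. This is a sketch under my own assumptions: `encode_vex` and its parameter names are hypothetical, register-extension bits are passed in true (non-inverted) form, and the short 2-byte form is chosen whenever the fields it cannot express (X, B, W, and any escape other than 0F) are not needed.

```python
ESCAPE_TO_MMMMM = {"0F": 1, "0F38": 2, "0F3A": 3}
PREFIX_TO_PP = {None: 0, 0x66: 1, 0xF3: 2, 0xF2: 3}  # none, 66, F3, F2


def encode_vex(r=0, x=0, b=0, w=0, escape="0F", vvvv=0, l=0, prefix=None) -> bytes:
    """Build a VEX prefix; r, x, b and vvvv are given in true (non-inverted) form."""
    mmmmm = ESCAPE_TO_MMMMM[escape]
    pp = PREFIX_TO_PP[prefix]
    # vvvv is stored inverted; vvvv=0 therefore encodes as all ones (1111).
    low = (((~vvvv) & 0xF) << 3) | (l << 2) | pp
    if x == 0 and b == 0 and w == 0 and mmmmm == 1:
        # 2-byte form: inverted R followed by vvvv, L, pp.
        return bytes([0xC5, ((r ^ 1) << 7) | low])
    # 3-byte form: inverted R/X/B plus m-mmmm, then W followed by vvvv, L, pp.
    byte1 = ((r ^ 1) << 7) | ((x ^ 1) << 6) | ((b ^ 1) << 5) | mmmmm
    return bytes([0xC4, byte1, (w << 7) | low])


print(encode_vex(prefix=0x66).hex())                  # packed double, 0F map
print(encode_vex(escape="0F38", prefix=0x66).hex())   # needs the 3-byte form
```

Note that a stored vvvv of 1111 is ambiguous in this sketch: it means either "register 0" or "unused", depending on the instruction.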
Instructions that need more than three operands have an extra suffix byte specifying one or two additional register operands. Instructions coded with the VEX prefix can have up to five operands. At most one of the operands can be a memory operand; and at most one of the operands can be an immediate constant of 4 or 8 bits. The remaining operands are registers.
The AVX instruction set is the first instruction set extension to use the VEX coding scheme. The AVX instructions have up to four operands. The AVX instruction set allows the VEX prefix to be applied only to instructions using the SIMD XMM registers. However, the VEX coding scheme has space for applying the VEX prefix to other instructions as well in future instruction sets.
Legacy instructions with a VEX prefix added are equivalent to the same instructions without VEX prefix with the following differences:
The VEX-encoded instruction can have one more operand, making it non-destructive.
A 128-bit XMM instruction without VEX prefix leaves the upper half of the full 256-bit YMM register unchanged, while the VEX-encoded version sets the upper half to zero.
Instructions that use the whole 256-bit YMM register should not be mixed with non-VEX instructions that leave the upper half of the register unchanged, for reasons of efficiency.
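The two differences listed above can be modeled with a toy register file. This is only a behavioral sketch: each YMM register is a Python integer, "add" is a plain integer add on the low 128 bits rather than a real packed add, and all names are invented for the illustration.

```python
MASK128 = (1 << 128) - 1


def legacy_sse_write(ymm: int, result128: int) -> int:
    """Legacy (non-VEX) 128-bit write: the upper half of the YMM register is kept."""
    return (ymm & ~MASK128) | (result128 & MASK128)


def vex128_write(ymm: int, result128: int) -> int:
    """VEX-encoded 128-bit write: the upper half of the YMM register is zeroed."""
    return result128 & MASK128


regs = {"ymm0": (0xAAAA << 128) | 1, "ymm1": 2, "ymm2": (0xBBBB << 128) | 7}
total = ((regs["ymm0"] & MASK128) + (regs["ymm1"] & MASK128)) & MASK128

# Legacy two-operand form: a := a + b, source a is overwritten, upper half survives.
legacy_result = legacy_sse_write(regs["ymm0"], total)

# VEX three-operand form: c := a + b, sources untouched, upper half of c cleared.
vex_result = vex128_write(regs["ymm2"], total)

print(hex(legacy_result))
print(hex(vex_result))
```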
From Wikipedia, the free encyclopedia
The XOP instruction set, announced by AMD on May 1, 2009, is an extension to the 128-bit SSE core instructions in the x86 and AMD64 instruction set for the Bulldozer processor core, due to begin production in 2011.[1]
XOP is a revision of the SSE5 instruction set proposal announced on August 30, 2007. This revision makes the binary coding of the proposed new instructions more compatible with Intel's AVX instruction extensions, while the functionality of the instructions is unchanged.[2]
The XOP instructions include:
Integer vector multiply-accumulate instructions
Integer vector horizontal addition
Integer vector compare
Integer vector shift and rotate instructions
Vector byte permutation
Vector conditional move instructions
Floating point fraction extraction
The XOP instruction set is supplemented by the FMA4 (floating-point vector multiply-accumulate) and CVT16 (half-precision floating-point conversion) instruction sets, which were also included in SSE5.
Compatibility issues
AMD has changed the encoding from the original SSE5 specification in order to improve compatibility with Intel's AVX instruction set and the new VEX coding scheme.
All SSE5 instructions that were equivalent or similar to instructions in the AVX and FMA4 instruction sets announced by Intel have been changed to use the coding proposed by Intel. Integer instructions without equivalents in AVX were classified as the XOP extension.[3] The XOP instructions use the opcode byte 8F (hexadecimal) but otherwise have a coding scheme almost identical to that of AVX with the 3-byte VEX prefix.
Commentators[4] have seen this as evidence that Intel has not allowed AMD to use any part of the large VEX coding space. AMD has been forced to use different codes in order to avoid using any code combination that Intel might possibly be using in their development pipeline for something else. The XOP coding scheme is as close to the VEX scheme as technically possible without risking that the AMD codes overlap with any future Intel codes. It must be noted that this inference is speculative, since no public information is available about negotiations between the two companies on this issue.
The use of the 8F byte requires that the m-bits (see VEX coding scheme) have a value bigger than or equal to 8 in order to avoid overlap with existing instructions. The C4 byte used in the VEX scheme has no such restriction. This may prevent the use of the m-bits for other purposes in the future in the XOP scheme, but not in the VEX scheme. Another possible problem is that the pp bits have the value 00 in the XOP scheme, while they have the value 01 in the VEX scheme for instructions that have no legacy equivalent. This may complicate the use of the pp bits for other purposes in the future.
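The m-bits restriction on the 8F byte can be written as a one-line check. A hedged sketch: the function name and return strings are mine, and the remark about POP is my own understanding rather than something the quoted text states. As I understand it, the existing 8F instruction is POP with the ModRM reg field equal to 0, and a value of 8 or more in the low five bits forces that field to be nonzero.

```python
def classify_8f(byte0: int, byte1: int) -> str:
    """Classify a byte pair starting with 8F as an XOP prefix or a legacy instruction."""
    if byte0 != 0x8F:
        return "other"
    # The low five bits of the second byte are the m-bits in the XOP scheme.
    # Values below 8 are left to the pre-existing 8F instruction encoding.
    return "XOP prefix" if (byte1 & 0x1F) >= 8 else "legacy 8F instruction"


print(classify_8f(0x8F, 0xE8))  # m-bits = 01000 (8)
print(classify_8f(0x8F, 0xC0))  # m-bits = 00000 (0)
```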
A similar compatibility issue is the difference between the FMA3 and FMA4 instruction sets. Intel initially proposed FMA4 in AVX/FMA specification version 3 to supersede the 3-operand FMA proposed by AMD in SSE5. After AMD adopted FMA4, however, Intel canceled FMA4 support and reverted to FMA3 in the AVX/FMA specification version
This is a part a server guy should get. Since JW is a server guy, I ask him how this is done with AMD's XOP prefix:
The VEX-encoded instruction can have one more operand, making it non-destructive.
A 128-bit XMM instruction without VEX prefix leaves the upper half of the full 256-bit YMM register unchanged, while the VEX-encoded version sets the upper half to zero.
Instructions that use the whole 256-bit YMM register should not be mixed with non-VEX instructions that leave the upper half of the register unchanged, for reasons of efficiency