
LLVM Assembly Language

From http://llvm.org/docs/LangRef.html

Instruction

Description

Terminator Instructions
ret

Syntax:

ret <type> <value> ; Return a value from a non-void function

ret void ; Return from void function

Overview:

The ‘ret’ instruction is used to return control flow (and optionally a value) from a function back to the caller.

There are two forms of the ‘ret’ instruction: one that returns a value and then causes control flow, and one that just causes control flow to occur.

Arguments:

The ‘ret’ instruction optionally accepts a single argument, the return value. The type of the return value must be a ‘first class’ type.

A function is not well formed if it has a non-void return type and contains a ‘ret’ instruction with no return value or a return value with a type that does not match its type, or if it has a void return type and contains a ‘ret’ instruction with a return value.

Semantics:

When the ‘ret’ instruction is executed, control flow returns back to the calling function’s context. If the caller is a “call” instruction, execution continues at the instruction after the call. If the caller was an “invoke” instruction, execution continues at the beginning of the “normal” destination block. If the instruction returns a value, that value shall set the call or invoke instruction’s return value.

Example:

ret i32 5 ; Return an integer value of 5

ret void ; Return from a void function

ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2

br

Syntax:

br i1 <cond>, label <iftrue>, label <iffalse>

br label <dest> ; Unconditional branch

Overview:

The ‘br’ instruction is used to cause control flow to transfer to a different basic block in the current function. There are two forms of this instruction, corresponding to a conditional branch and an unconditional branch.

Arguments:

The conditional branch form of the ‘br’ instruction takes a single ‘i1’ value and two ‘label’ values. The unconditional form of the ‘br’ instruction takes a single ‘label’ value as a target.

Semantics:

Upon execution of a conditional ‘br’ instruction, the ‘i1’ argument is evaluated. If the value is true, control flows to the ‘iftrue’ label argument. If “cond” is false, control flows to the ‘iffalse’ label argument.

Example:

Test:

%cond = icmp eq i32 %a, %b

br i1 %cond, label %IfEqual, label %IfUnequal

IfEqual:

ret i32 1

IfUnequal:

ret i32 0

switch

Syntax:

switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> … ]

Overview:

The ‘switch’ instruction is used to transfer control flow to one of several different places. It is a generalization of the ‘br’ instruction, allowing a branch to occur to one of many possible destinations.

Arguments:

The ‘switch’ instruction uses three parameters: an integer comparison value ‘value’, a default ‘label’ destination, and an array of pairs of comparison value constants and ‘label’s. The table is not allowed to contain duplicate constant entries.

Semantics:

The switch instruction specifies a table of values and destinations. When the ‘switch’ instruction is executed, this table is searched for the given value. If the value is found, control flow is transferred to the corresponding destination; otherwise, control flow is transferred to the default destination.

Implementation:

Depending on properties of the target machine and the particular switch instruction, this instruction may be code generated in different ways. For example, it could be generated as a series of chained conditional branches or with a lookup table.

Example:

; Emulate a conditional br instruction

%Val = zext i1 %value to i32

switch i32 %Val, label %truedest [ i32 0, label %falsedest ]

; Emulate an unconditional br instruction

switch i32 0, label %dest [ ]

; Implement a jump table:

switch i32 %val, label %otherwise [ i32 0, label %onzero

i32 1, label %onone

i32 2, label %ontwo ]
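
The table search described under Semantics can be modelled in a few lines of Python. This is only an illustrative sketch, not how LLVM implements ‘switch’; the function name switch_dest and the label strings are made up for the example.

```python
def switch_dest(value, default, table):
    """Return the label a 'switch' would branch to.

    table is a list of (constant, label) pairs. Duplicate constants are
    rejected, mirroring the rule that the table may not contain them.
    """
    constants = [constant for constant, _ in table]
    if len(constants) != len(set(constants)):
        raise ValueError("duplicate constant in switch table")
    for constant, label in table:
        if constant == value:
            return label
    # No entry matched: fall through to the default destination.
    return default


jump_table = [(0, "onzero"), (1, "onone"), (2, "ontwo")]
print(switch_dest(1, "otherwise", jump_table))  # onone
print(switch_dest(7, "otherwise", jump_table))  # otherwise
```

An empty table always yields the default label, which is the "emulate an unconditional br" case from the example.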

indirectbr

Syntax:

indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, … ]

Overview:

The ‘indirectbr’ instruction implements an indirect branch to a label within the current function, whose address is specified by “address”. Address must be derived from a blockaddress constant.

Arguments:

The ‘address’ argument is the address of the label to jump to. The rest of the arguments indicate the full set of possible destinations that the address may point to. Blocks are allowed to occur multiple times in the destination list, though this isn’t particularly useful.

This destination list is required so that dataflow analysis has an accurate understanding of the CFG.

Semantics:

Control transfers to the block specified in the address argument. All possible destination blocks must be listed in the label list, otherwise this instruction has undefined behavior. This implies that jumps to labels defined in other functions have undefined behavior as well.

Implementation:

This is typically implemented with a jump through a register.

Example:

indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]

invoke

Syntax:

<result> = invoke [cconv] [ret attrs] <ptr to function ty> <function ptr val>(<function args>) [fn attrs]

to label <normal label> unwind label <exception label>

Overview:

The ‘invoke’ instruction causes control to transfer to a specified function, with the possibility of control flow transfer to either the ‘normal’ label or the ‘exception’ label. If the callee function returns with the “ret” instruction, control flow will return to the “normal” label. If the callee (or any indirect callees) returns via the “resume” instruction or other exception handling mechanism, control is interrupted and continued at the dynamically nearest “exception” label.

The ‘exception’ label is a landing pad for the exception. As such, the ‘exception’ label is required to have the “landingpad” instruction, which contains the information about the behavior of the program after unwinding happens, as its first non-PHI instruction. The restrictions on the “landingpad” instruction tightly couple it to the “invoke” instruction, so that the important information contained within the “landingpad” instruction can’t be lost through normal code motion.

Arguments:

This instruction requires several arguments:

The optional “cconv” marker indicates which calling convention the call should use. If none is specified, the call defaults to using C calling conventions.

The optional Parameter Attributes list for return values. Only ‘zeroext’, ‘signext’, and ‘inreg’ attributes are valid here.

‘ptr to function ty’: shall be the signature of the pointer to function value being invoked. In most cases, this is a direct function invocation, but indirect invokes are just as possible, branching off an arbitrary pointer to function value.

‘function ptr val’: An LLVM value containing a pointer to a function to be invoked.

‘function args’: argument list whose types match the function signature argument types and parameter attributes. All arguments must be of first class type. If the function signature indicates the function accepts a variable number of arguments, the extra arguments can be specified.

‘normal label’: the label reached when the called function executes a ‘ret’ instruction.

‘exception label’: the label reached when a callee returns via the resume instruction or other exception handling mechanism.

The optional function attributes list. Only ‘noreturn’, ‘nounwind’, ‘readonly’ and ‘readnone’ attributes are valid here.

Semantics:

This instruction is designed to operate as a standard ‘call’ instruction in most regards. The primary difference is that it establishes an association with a label, which is used by the runtime library to unwind the stack.

This instruction is used in languages with destructors to ensure that proper cleanup is performed in the case of either a longjmp or a thrown exception. Additionally, this is important for implementation of ‘catch’ clauses in high-level languages that support them.

For the purposes of the SSA form, the definition of the value returned by the ‘invoke’ instruction is deemed to occur on the edge from the current block to the “normal” label. If the callee unwinds then no return value is available.

Example:

%retval = invoke i32 @Test(i32 15) to label %Continue

unwind label %TestCleanup ; {i32}:retval set

%retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue

unwind label %TestCleanup ; {i32}:retval set

resume

Syntax:

resume <type> <value>

Overview:

The ‘resume’ instruction is a terminator instruction that has no successors.

Arguments:

The ‘resume’ instruction requires one argument, which must have the same type as the result of any ‘landingpad’ instruction in the same function.

Semantics:

The ‘resume’ instruction resumes propagation of an existing (in-flight) exception whose unwinding was interrupted with a landingpad instruction.

Example:

resume { i8*, i32 } %exn

unreachable

Syntax:

unreachable

Overview:

The ‘unreachable’ instruction has no defined semantics. This instruction is used to inform the optimizer that a particular portion of the code is not reachable. This can be used to indicate that the code after a no-return function cannot be reached, and other facts.

Semantics:

The ‘unreachable’ instruction has no defined semantics.

Binary Operations
add

Syntax:

<result> = add <ty> <op1>, <op2> ; yields {ty}:result

<result> = add nuw <ty> <op1>, <op2> ; yields {ty}:result

<result> = add nsw <ty> <op1>, <op2> ; yields {ty}:result

<result> = add nuw nsw <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘add’ instruction returns the sum of its two operands.

Arguments:

The two arguments to the ‘add’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

The value produced is the integer sum of the two operands.

If the sum has unsigned overflow, the result returned is the mathematical result modulo 2^n, where n is the bit width of the result.

Because LLVM integers use a two’s complement representation, this instruction is appropriate for both signed and unsigned integers.

nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”, respectively. If the nuw and/or nsw keywords are present, the result value of the add is a poison value if unsigned and/or signed overflow, respectively, occurs.

Example:

<result> = add i32 4, %var ; yields {i32}:result = 4 + %var
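
The wrap-around and the nuw/nsw rules can be illustrated with a small Python model. It is a sketch under stated assumptions: operands are passed as unsigned n-bit patterns, the function name add_iN is hypothetical, and the string "poison" merely stands in for a poison value.

```python
def add_iN(a, b, n, nuw=False, nsw=False):
    """Model 'add' on n-bit integers given as unsigned bit patterns."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    total = a + b
    result = total & mask  # mathematical result modulo 2^n

    def to_signed(x):
        # Reinterpret an n-bit pattern as a two's complement value.
        return x - (1 << n) if x >> (n - 1) else x

    unsigned_overflow = total > mask
    signed_sum = to_signed(a) + to_signed(b)
    signed_overflow = not (-(1 << (n - 1)) <= signed_sum < (1 << (n - 1)))

    if (nuw and unsigned_overflow) or (nsw and signed_overflow):
        return "poison"
    return result


print(add_iN(250, 10, 8))            # 4
print(add_iN(250, 10, 8, nuw=True))  # poison
print(add_iN(127, 1, 8, nsw=True))   # poison
print(add_iN(255, 1, 8, nsw=True))   # 0
```

The last call shows why the two flags are independent: 255 + 1 wraps as an unsigned i8 sum, but read as signed it is -1 + 1 = 0, which does not overflow.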

fadd

Syntax:

<result> = fadd <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘fadd’ instruction returns the sum of its two operands.

Arguments:

The two arguments to the ‘fadd’ instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics:

The value produced is the floating point sum of the two operands.

Example:

<result> = fadd float 4.0, %var ; yields {float}:result = 4.0 + %var

sub

Syntax:

<result> = sub <ty> <op1>, <op2> ; yields {ty}:result

<result> = sub nuw <ty> <op1>, <op2> ; yields {ty}:result

<result> = sub nsw <ty> <op1>, <op2> ; yields {ty}:result

<result> = sub nuw nsw <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘sub’ instruction returns the difference of its two operands.

Note that the ‘sub’ instruction is used to represent the ‘neg’ instruction present in most other intermediate representations.

Arguments:

The two arguments to the ‘sub’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

The value produced is the integer difference of the two operands.

If the difference has unsigned overflow, the result returned is the mathematical result modulo 2^n, where n is the bit width of the result.

Because LLVM integers use a two’s complement representation, this instruction is appropriate for both signed and unsigned integers.

nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”, respectively. If the nuw and/or nsw keywords are present, the result value of the sub is a poison value if unsigned and/or signed overflow, respectively, occurs.

Example:

<result> = sub i32 4, %var ; yields {i32}:result = 4 - %var

<result> = sub i32 0, %val ; yields {i32}:result = -%val

fsub

Syntax:

<result> = fsub <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘fsub’ instruction returns the difference of its two operands.

Note that the ‘fsub’ instruction is used to represent the ‘fneg’ instruction present in most other intermediate representations.

Arguments:

The two arguments to the ‘fsub’ instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics:

The value produced is the floating point difference of the two operands.

Example:

<result> = fsub float 4.0, %var ; yields {float}:result = 4.0 - %var

<result> = fsub float -0.0, %val ; yields {float}:result = -%val

mul

Syntax:

<result> = mul <ty> <op1>, <op2> ; yields {ty}:result

<result> = mul nuw <ty> <op1>, <op2> ; yields {ty}:result

<result> = mul nsw <ty> <op1>, <op2> ; yields {ty}:result

<result> = mul nuw nsw <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘mul’ instruction returns the product of its two operands.

Arguments:

The two arguments to the ‘mul’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

The value produced is the integer product of the two operands.

If the result of the multiplication has unsigned overflow, the result returned is the mathematical result modulo 2^n, where n is the bit width of the result.

Because LLVM integers use a two’s complement representation, and the result is the same width as the operands, this instruction returns the correct result for both signed and unsigned integers. If a full product (e.g. i32xi32->i64) is needed, the operands should be sign-extended or zero-extended as appropriate to the width of the full product.

nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”, respectively. If the nuw and/or nsw keywords are present, the result value of the mul is a poison value if unsigned and/or signed overflow, respectively, occurs.

Example:

<result> = mul i32 4, %var ; yields {i32}:result = 4 * %var

fmul

Syntax:

<result> = fmul <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘fmul’ instruction returns the product of its two operands.

Arguments:

The two arguments to the ‘fmul’ instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics:

The value produced is the floating point product of the two operands.

Example:

<result> = fmul float 4.0, %var ; yields {float}:result = 4.0 * %var

udiv

Syntax:

<result> = udiv <ty> <op1>, <op2> ; yields {ty}:result

<result> = udiv exact <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘udiv’ instruction returns the quotient of its two operands.

Arguments:

The two arguments to the ‘udiv’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

The value produced is the unsigned integer quotient of the two operands.

Note that unsigned integer division and signed integer division are distinct operations; for signed integer division, use ‘sdiv’.

Division by zero leads to undefined behavior.

If the exact keyword is present, the result value of the udiv is a poison value if %op1 is not a multiple of %op2 (as such, “((a udiv exact b) mul b) == a”).

Example:

<result> = udiv i32 4, %var ; yields {i32}:result = 4 / %var

sdiv

Syntax:

<result> = sdiv <ty> <op1>, <op2> ; yields {ty}:result

<result> = sdiv exact <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘sdiv’ instruction returns the quotient of its two operands.

Arguments:

The two arguments to the ‘sdiv’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

The value produced is the signed integer quotient of the two operands rounded towards zero.

Note that signed integer division and unsigned integer division are distinct operations; for unsigned integer division, use ‘udiv’.

Division by zero leads to undefined behavior. Overflow also leads to undefined behavior; this is a rare case, but can occur, for example, by doing a 32-bit division of -2147483648 by -1.

If the exact keyword is present, the result value of the sdiv is a poison value if the result would be rounded.

Example:

<result> = sdiv i32 4, %var ; yields {i32}:result = 4 / %var
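
Rounding towards zero differs from the floor division that some languages use for negative operands. The Python sketch below models the rules stated above. It is an illustration only: the function name sdiv is reused for readability, undefined behavior is represented by raising an exception, and "poison" is a placeholder string.

```python
def sdiv(a, b, n=32, exact=False):
    """Model 'sdiv' on n-bit signed integers (a and b assumed in range)."""
    int_min = -(1 << (n - 1))
    if b == 0:
        raise ZeroDivisionError("sdiv by zero is undefined behavior")
    if a == int_min and b == -1:
        raise OverflowError("sdiv overflow is undefined behavior")
    if exact and abs(a) % abs(b) != 0:
        return "poison"  # the result would have to be rounded
    magnitude = abs(a) // abs(b)  # quotient magnitude, rounded down
    return -magnitude if (a < 0) != (b < 0) else magnitude


print(sdiv(-7, 2))             # -3
print(-7 // 2)                 # -4 (Python floors instead)
print(sdiv(6, 3, exact=True))  # 2
print(sdiv(7, 2, exact=True))  # poison
```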

fdiv

Syntax:

<result> = fdiv <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘fdiv’ instruction returns the quotient of its two operands.

Arguments:

The two arguments to the ‘fdiv’ instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics:

The value produced is the floating point quotient of the two operands.

Example:

<result> = fdiv float 4.0, %var ; yields {float}:result = 4.0 / %var

urem

Syntax:

<result> = urem <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘urem’ instruction returns the remainder from the unsigned division of its two arguments.

Arguments:

The two arguments to the ‘urem’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

This instruction returns the unsigned integer remainder of a division. This instruction always performs an unsigned division to get the remainder.

Note that unsigned integer remainder and signed integer remainder are distinct operations; for signed integer remainder, use ‘srem’.

Taking the remainder of a division by zero leads to undefined behavior.

Example:

<result> = urem i32 4, %var ; yields {i32}:result = 4 % %var

srem

Syntax:

<result> = srem <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘srem’ instruction returns the remainder from the signed division of its two operands. This instruction can also take vector versions of the values in which case the elements must be integers.

Arguments:

The two arguments to the ‘srem’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

This instruction returns the remainder of a division (where the result is either zero or has the same sign as the dividend, op1), not the modulo operator (where the result is either zero or has the same sign as the divisor, op2) of a value. For more information about the difference, see The Math Forum. For a table of how this is implemented in various languages, please see Wikipedia: modulo operation.

Note that signed integer remainder and unsigned integer remainder are distinct operations; for unsigned integer remainder, use ‘urem’.

Taking the remainder of a division by zero leads to undefined behavior. Overflow also leads to undefined behavior; this is a rare case, but can occur, for example, by taking the remainder of a 32-bit division of -2147483648 by -1. (The remainder doesn’t actually overflow, but this rule lets srem be implemented using instructions that return both the result of the division and the remainder.)

Example:

<result> = srem i32 4, %var ; yields {i32}:result = 4 % %var
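
The difference between a remainder (sign follows the dividend) and a modulo (sign follows the divisor) is easy to see next to Python's % operator, which is a modulo. The following is a minimal sketch; the function name srem is reused for readability and undefined behavior is represented by raising an exception.

```python
def srem(a, b, n=32):
    """Model 'srem' on n-bit signed integers (a and b assumed in range)."""
    int_min = -(1 << (n - 1))
    if b == 0:
        raise ZeroDivisionError("srem by zero is undefined behavior")
    if a == int_min and b == -1:
        raise OverflowError("srem overflow is undefined behavior")
    magnitude = abs(a) % abs(b)
    # The result is zero or has the same sign as the dividend, a.
    return -magnitude if a < 0 else magnitude


print(srem(-7, 2), -7 % 2)  # -1 1
print(srem(7, -2), 7 % -2)  # 1 -1
```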

frem

Syntax:

<result> = frem <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘frem’ instruction returns the remainder from the division of its two operands.

Arguments:

The two arguments to the ‘frem’ instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics:

This instruction returns the remainder of a division. The remainder has the same sign as the dividend.

Example:

<result> = frem float 4.0, %var ; yields {float}:result = 4.0 % %var

Bitwise Binary Operations
shl

Syntax:

<result> = shl <ty> <op1>, <op2> ; yields {ty}:result

<result> = shl nuw <ty> <op1>, <op2> ; yields {ty}:result

<result> = shl nsw <ty> <op1>, <op2> ; yields {ty}:result

<result> = shl nuw nsw <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘shl’ instruction returns the first operand shifted to the left a specified number of bits.

Arguments:

Both arguments to the ‘shl’ instruction must be the same integer or vector of integer type. ‘op2’ is treated as an unsigned value.

Semantics:

The value produced is op1 * 2^op2 mod 2^n, where n is the width of the result. If op2 is (statically or dynamically) negative or equal to or larger than the number of bits in op1, the result is undefined. If the arguments are vectors, each vector element of op1 is shifted by the corresponding shift amount in op2.

If the nuw keyword is present, then the shift produces a poison value if it shifts out any non-zero bits. If the nsw keyword is present, then the shift produces a poison value if it shifts out any bits that disagree with the resultant sign bit. As such, NUW/NSW have the same semantics as they would if the shift were expressed as a mul instruction with the same nsw/nuw bits in (mul %op1, (shl 1, %op2)).

Example:

<result> = shl i32 4, %var ; yields {i32}: 4 << %var

<result> = shl i32 4, 2 ; yields {i32}: 16

<result> = shl i32 1, 10 ; yields {i32}: 1024

<result> = shl i32 1, 32 ; undefined

<result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4>
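
As a rough illustration of the formula op1 * 2^op2 mod 2^n and of the nuw/nsw conditions, here is a Python model for scalar operands. It is a sketch, not LLVM's implementation: operands are unsigned n-bit patterns, an out-of-range shift amount is reported with an exception, and "poison" is a placeholder string.

```python
def shl(op1, op2, n, nuw=False, nsw=False):
    """Model scalar 'shl' on n-bit integers given as unsigned bit patterns."""
    if not 0 <= op2 < n:
        raise ValueError("shift amount out of range: result is undefined")
    mask = (1 << n) - 1
    op1 &= mask
    full = op1 << op2      # exact product op1 * 2^op2
    result = full & mask   # reduced modulo 2^n

    shifted_out = full >> n   # bits that fell off the top
    sign = result >> (n - 1)  # resultant sign bit
    # For nsw, every shifted-out bit must agree with the resultant sign bit.
    agreeing_bits = ((1 << op2) - 1) if sign else 0

    if nuw and shifted_out != 0:
        return "poison"
    if nsw and shifted_out != agreeing_bits:
        return "poison"
    return result


print(shl(4, 2, 32))               # 16
print(shl(1, 10, 32))              # 1024
print(shl(0x80, 1, 8, nuw=True))   # poison
print(shl(64, 1, 8, nsw=True))     # poison
print(shl(0xFF, 1, 8, nsw=True))   # 254
```

The last line is -1 shifted left by one in i8: the result pattern 0xFE is -2, so no signed wrap occurs.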

lshr

Syntax:

<result> = lshr <ty> <op1>, <op2> ; yields {ty}:result

<result> = lshr exact <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘lshr’ instruction (logical shift right) returns the first operand shifted to the right a specified number of bits with zero fill.

Arguments:

Both arguments to the ‘lshr’ instruction must be the same integer or vector of integer type. ‘op2’ is treated as an unsigned value.

Semantics:

This instruction always performs a logical shift right operation. The most significant bits of the result will be filled with zero bits after the shift. If op2 is (statically or dynamically) equal to or larger than the number of bits in op1, the result is undefined. If the arguments are vectors, each vector element of op1 is shifted by the corresponding shift amount in op2.

If the exact keyword is present, the result value of the lshr is a poison value if any of the bits shifted out are non-zero.

Example:

<result> = lshr i32 4, 1 ; yields {i32}:result = 2

<result> = lshr i32 4, 2 ; yields {i32}:result = 1

<result> = lshr i8 4, 3 ; yields {i8}:result = 0

<result> = lshr i8 -2, 1 ; yields {i8}:result = 0x7F

<result> = lshr i32 1, 32 ; undefined

<result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

ashr

Syntax:

<result> = ashr <ty> <op1>, <op2> ; yields {ty}:result

<result> = ashr exact <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘ashr’ instruction (arithmetic shift right) returns the first operand shifted to the right a specified number of bits with sign extension.

Arguments:

Both arguments to the ‘ashr’ instruction must be the same integer or vector of integer type. ‘op2’ is treated as an unsigned value.

Semantics:

This instruction always performs an arithmetic shift right operation. The most significant bits of the result will be filled with the sign bit of op1. If op2 is (statically or dynamically) equal to or larger than the number of bits in op1, the result is undefined. If the arguments are vectors, each vector element of op1 is shifted by the corresponding shift amount in op2.

If the exact keyword is present, the result value of the ashr is a poison value if any of the bits shifted out are non-zero.

Example:

<result> = ashr i32 4, 1 ; yields {i32}:result = 2

<result> = ashr i32 4, 2 ; yields {i32}:result = 1

<result> = ashr i8 4, 3 ; yields {i8}:result = 0

<result> = ashr i8 -2, 1 ; yields {i8}:result = -1

<result> = ashr i32 1, 32 ; undefined

<result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0>
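
The two right shifts differ only in what fills the vacated high bits. A small Python model makes the contrast concrete. It is a sketch for scalar operands: results are returned as unsigned n-bit patterns, and an out-of-range shift amount raises an exception in place of an undefined result.

```python
def lshr(value, amount, n):
    """Logical shift right of an n-bit pattern: vacated bits become 0."""
    if not 0 <= amount < n:
        raise ValueError("shift amount out of range: result is undefined")
    return (value & ((1 << n) - 1)) >> amount


def ashr(value, amount, n):
    """Arithmetic shift right: vacated bits copy the sign bit."""
    if not 0 <= amount < n:
        raise ValueError("shift amount out of range: result is undefined")
    mask = (1 << n) - 1
    value &= mask
    if value >> (n - 1):   # sign bit set: reinterpret as a negative int
        value -= 1 << n
    # Python's >> on a negative int is already an arithmetic shift.
    return (value >> amount) & mask


print(hex(lshr(-2, 1, 8)))   # 0x7f
print(hex(ashr(-2, 1, 8)))   # 0xff (that is, -1 as an i8)
print(hex(lshr(-2, 1, 32)))  # 0x7fffffff
```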

and

Syntax:

<result> = and <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘and’ instruction returns the bitwise logical and of its two operands.

Arguments:

The two arguments to the ‘and’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

The truth table used for the ‘and’ instruction is:

In0 In1 Out

0 0 0

0 1 0

1 0 0

1 1 1

Example:

<result> = and i32 4, %var ; yields {i32}:result = 4 & %var

<result> = and i32 15, 40 ; yields {i32}:result = 8

<result> = and i32 4, 8 ; yields {i32}:result = 0

or

Syntax:

<result> = or <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘or’ instruction returns the bitwise logical inclusive or of its two operands.

Arguments:

The two arguments to the ‘or’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

The truth table used for the ‘or’ instruction is:

In0 In1 Out

0 0 0

0 1 1

1 0 1

1 1 1

Example:

<result> = or i32 4, %var ; yields {i32}:result = 4 | %var

<result> = or i32 15, 40 ; yields {i32}:result = 47

<result> = or i32 4, 8 ; yields {i32}:result = 12

xor

Syntax:

<result> = xor <ty> <op1>, <op2> ; yields {ty}:result

Overview:

The ‘xor’ instruction returns the bitwise logical exclusive or of its two operands. The xor is used to implement the “one’s complement” operation, which is the “~” operator in C.

Arguments:

The two arguments to the ‘xor’ instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics:

The truth table used for the ‘xor’ instruction is:

In0 In1 Out

0 0 0

0 1 1

1 0 1

1 1 0

Example:

<result> = xor i32 4, %var ; yields {i32}:result = 4 ^ %var

<result> = xor i32 15, 40 ; yields {i32}:result = 39

<result> = xor i32 4, 8 ; yields {i32}:result = 12

<result> = xor i32 %V, -1 ; yields {i32}:result = ~%V

Vector Operations
extractelement
Syntax:
  <result> = extractelement <n x <ty>> <val>, i32 <idx>    ; yields <ty>
Overview:

The ‘extractelement’ instruction extracts a single scalar element from a vector at a specified index.

Arguments:

The first operand of an ‘extractelement’ instruction is a value of vector type. The second operand is an index indicating the position from which to extract the element. The index may be a variable.

Semantics:

The result is a scalar of the same type as the element type of val. Its value is the value at position idx of val. If idx exceeds the length of val, the results are undefined.

Example:
  <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
insertelement
Syntax:
  <result> = insertelement <n x <ty>> <val>, <ty> <elt>, i32 <idx>    ; yields <n x <ty>>
Overview:

The ‘insertelement’ instruction inserts a scalar element into a vector at a specified index.

Arguments:

The first operand of an ‘insertelement’ instruction is a value of vector type. The second operand is a scalar value whose type must equal the element type of the first operand. The third operand is an index indicating the position at which to insert the value. The index may be a variable.

Semantics:

The result is a vector of the same type as val. Its element values are those of val except at position idx, where it gets the value elt. If idx exceeds the length of val, the results are undefined.

Example:
  <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
shufflevector
Syntax:
  <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
Overview:

The ‘shufflevector’ instruction constructs a permutation of elements from two input vectors, returning a vector with the same element type as the input and length that is the same as the shuffle mask.

Arguments:

The first two operands of a ‘shufflevector’ instruction are vectors with types that match each other. The third argument is a shuffle mask whose element type is always ‘i32’. The result of the instruction is a vector whose length is the same as the shuffle mask and whose element type is the same as the element type of the first two operands.

The shuffle mask operand is required to be a constant vector with either constant integer or undef values.

Semantics:

The elements of the two input vectors are numbered from left to right across both of the vectors. The shuffle mask operand specifies, for each element of the result vector, which element of the two input vectors the result element gets. The element selector may be undef (meaning “don’t care”) and the second operand may be undef if performing a shuffle from only one vector.

Example:
  <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                          <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
  <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                          <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
  <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                          <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
  <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                          <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
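
The numbering of elements "across both of the vectors" can be sketched in Python with plain lists. This is an illustrative model only: None is used here to stand for undef, and the function name shufflevector is reused for readability.

```python
def shufflevector(v1, v2, mask):
    """Model 'shufflevector' on Python lists.

    Elements are numbered left to right across v1 and then v2.
    None stands for undef, both as the second operand and as a mask entry.
    """
    if v2 is None:
        v2 = [None] * len(v1)  # undef second operand
    if len(v1) != len(v2):
        raise ValueError("input vectors must have matching types")
    combined = list(v1) + list(v2)
    result = []
    for index in mask:
        if index is None:
            result.append(None)  # "don't care" element
        elif 0 <= index < len(combined):
            result.append(combined[index])
        else:
            raise IndexError("mask index out of range")
    return result


v1 = [10, 11, 12, 13]
v2 = [20, 21, 22, 23]
print(shufflevector(v1, v2, [0, 4, 1, 5]))    # [10, 20, 11, 21]
print(shufflevector(v1, None, [0, 1, 2, 3]))  # [10, 11, 12, 13]
```

The result length always follows the mask, so the same helper covers both the narrowing and the widening shuffles in the example.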
Aggregate Operations
extractvalue
Syntax:
  <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*
Overview:

The ‘extractvalue’ instruction extracts the value of a member field from an aggregate value.

Arguments:

The first operand of an ‘extractvalue’ instruction is a value of struct or array type. The operands are constant indices to specify which value to extract in a similar manner as indices in a ‘getelementptr’ instruction.

The major differences to getelementptr indexing are:

  • Since the value being indexed is not a pointer, the first index is omitted and assumed to be zero.
  • At least one index must be specified.
  • Not only struct indices but also array indices must be in bounds.
Semantics:

The result is the value at the position in the aggregate specified by the index operands.

Example:
  <result> = extractvalue {i32, float} %agg, 0    ; yields i32
insertvalue
Syntax:
  <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>
Overview:

The ‘insertvalue’ instruction inserts a value into a member field in an aggregate value.

Arguments:

The first operand of an ‘insertvalue’ instruction is a value of struct or array type. The second operand is a first-class value to insert. The following operands are constant indices indicating the position at which to insert the value in a similar manner as indices in an ‘extractvalue’ instruction. The value to insert must have the same type as the value identified by the indices.

Semantics:

The result is an aggregate of the same type as val. Its value is that of val except that the value at the position specified by the indices is that of elt.

Example:
  %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
  %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
  %agg3 = insertvalue {i32, {float}} %agg1, float %val, 1, 0    ; yields {i32 1, float %val}
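
Because aggregates are values rather than memory, ‘insertvalue’ yields a new aggregate and leaves its operand untouched. The Python sketch below models this with nested tuples. The function names mirror the instructions for readability only, and every index is bounds-checked, as the text requires.

```python
def extractvalue(agg, *indices):
    """Follow the index path into a nested tuple and return the member."""
    if not indices:
        raise ValueError("at least one index must be specified")
    for index in indices:
        if not 0 <= index < len(agg):
            raise IndexError("index out of bounds")
        agg = agg[index]
    return agg


def insertvalue(agg, elt, *indices):
    """Return a copy of agg with the member at the index path set to elt."""
    if not indices:
        raise ValueError("at least one index must be specified")
    head, rest = indices[0], indices[1:]
    if not 0 <= head < len(agg):
        raise IndexError("index out of bounds")
    member = elt if not rest else insertvalue(agg[head], elt, *rest)
    return agg[:head] + (member,) + agg[head + 1:]


nested = (1, (2.5,))
updated = insertvalue(nested, 9.0, 1, 0)
print(updated)                      # (1, (9.0,))
print(nested)                       # (1, (2.5,))
print(extractvalue(updated, 1, 0))  # 9.0
```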
Memory Access and Addressing Operations
alloca
Syntax:
  <result> = alloca <type>[, <ty> <NumElements>][, align <alignment>]     ; yields {type*}:result
Overview:

The ‘alloca‘ instruction allocates memory on the stack frame of the currently executing function, to be automatically released when this function returns to its caller. The object is always allocated in the generic address space (address space zero).

Arguments:

The ‘alloca‘ instruction allocates sizeof(<type>)*NumElements bytes of memory on the runtime stack, returning a pointer of the appropriate type to the program. If “NumElements” is specified, it is the number of elements allocated, otherwise “NumElements” is defaulted to be one. If a constant alignment is specified, the value result of the allocation is guaranteed to be aligned to at least that boundary. If not specified, or if zero, the target can choose to align the allocation on any convenient boundary compatible with the type.

‘type‘ may be any sized type.

Semantics:

Memory is allocated; a pointer is returned. The operation is undefined if there is insufficient stack space for the allocation. The ‘alloca‘ instruction is commonly used to represent automatic variables that must have an address available. When the function returns (either with the ret or resume instructions), the ‘alloca‘d memory is automatically reclaimed. Allocating zero bytes is legal, but the result is undefined. The order in which memory is allocated (i.e., which way the stack grows) is not specified.

Example:
  %ptr = alloca i32                             ; yields {i32*}:ptr
  %ptr = alloca i32, i32 4                      ; yields {i32*}:ptr
  %ptr = alloca i32, i32 4, align 1024          ; yields {i32*}:ptr
  %ptr = alloca i32, align 1024                 ; yields {i32*}:ptr
load
Syntax:
  <result> = load [volatile] <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>]
  <result> = load atomic [volatile] <ty>* <pointer> [singlethread] <ordering>, align <alignment>
  !<index> = !{ i32 1 }
Overview:

The ‘load‘ instruction is used to read from memory.

Arguments:

The argument to the ‘load‘ instruction specifies the memory address from which to load. The pointer must point to a first class type. If the load is marked as volatile, then the optimizer is not allowed to modify the number or order of execution of this load with other volatile operations.

If the load is marked as atomic, it takes an extra ordering and optional singlethread argument. The release and acq_rel orderings are not valid on load instructions. Atomic loads produce defined results when they may see multiple atomic stores. The type of the pointee must be an integer type whose bit width is a power of two greater than or equal to eight and less than or equal to a target-specific size limit. align must be explicitly specified on atomic loads, and the load has undefined behavior if the alignment is not set to a value which is at least the size in bytes of the pointee. !nontemporal does not have any defined semantics for atomic loads.

The optional constant align argument specifies the alignment of the operation (that is, the alignment of the memory address). A value of 0 or an omitted align argument means that the operation has the preferential alignment for the target. It is the responsibility of the code emitter to ensure that the alignment information is correct. Overestimating the alignment results in undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always safe.

The optional !nontemporal metadata must reference a single metadata name <index> corresponding to a metadata node with one i32 entry of value 1. The existence of the !nontemporal metadata on the instruction tells the optimizer and code generator that this load is not expected to be reused in the cache. The code generator may select special instructions to save cache bandwidth, such as the MOVNT instruction on x86.

The optional !invariant.load metadata must reference a single metadata name <index> corresponding to a metadata node with no entries. The existence of the !invariant.load metadata on the instruction tells the optimizer and code generator that this load address points to memory which does not change value during program execution. The optimizer may then move this load around, for example, by hoisting it out of loops using loop invariant code motion.

Semantics:

The location of memory pointed to is loaded. If the value being loaded is of scalar type then the number of bytes read does not exceed the minimum number of bytes needed to hold all bits of the type. For example, loading an i24 reads at most three bytes. When loading a value of a type like i20 with a size that is not an integral number of bytes, the result is undefined if the value was not originally written using a store of the same type.

Examples:
  %ptr = alloca i32                               ; yields {i32*}:ptr
  store i32 3, i32* %ptr                          ; yields {void}
  %val = load i32* %ptr                           ; yields {i32}:val = i32 3
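The byte-count rule in the semantics above is a ceiling division of the bit width by eight. A small Python sketch (the helper name `bytes_accessed` is made up for illustration):

```python
def bytes_accessed(bit_width):
    """Minimum number of whole bytes needed to hold `bit_width` bits."""
    if bit_width <= 0:
        raise ValueError("bit width must be positive")
    return (bit_width + 7) // 8
```

Under this rule an i24 touches at most three bytes, and so does an i20, even though four of those bits do not belong to the type.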
store
Syntax:
  store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>]        ; yields {void}
  store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment>  ; yields {void}
Overview:

The ‘store‘ instruction is used to write to memory.

Arguments:

There are two arguments to the ‘store‘ instruction: a value to store and an address at which to store it. The type of the ‘<pointer>‘ operand must be a pointer to the first class type of the ‘<value>‘ operand. If the store is marked as volatile, then the optimizer is not allowed to modify the number or order of execution of this store with other volatile operations.

If the store is marked as atomic, it takes an extra ordering and optional singlethread argument. The acquire and acq_rel orderings aren’t valid on store instructions. Atomic loads produce defined results when they may see multiple atomic stores. The type of the pointee must be an integer type whose bit width is a power of two greater than or equal to eight and less than or equal to a target-specific size limit. align must be explicitly specified on atomic stores, and the store has undefined behavior if the alignment is not set to a value which is at least the size in bytes of the pointee. !nontemporal does not have any defined semantics for atomic stores.

The optional constant “align” argument specifies the alignment of the operation (that is, the alignment of the memory address). A value of 0 or an omitted “align” argument means that the operation has the preferential alignment for the target. It is the responsibility of the code emitter to ensure that the alignment information is correct. Overestimating the alignment results in undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always safe.

The optional !nontemporal metadata must reference a single metadata name <index> corresponding to a metadata node with one i32 entry of value 1. The existence of the !nontemporal metadata on the instruction tells the optimizer and code generator that this store is not expected to be reused in the cache. The code generator may select special instructions to save cache bandwidth, such as the MOVNT instruction on x86.

Semantics:

The contents of memory are updated to contain ‘<value>‘ at the location specified by the ‘<pointer>‘ operand. If ‘<value>‘ is of scalar type then the number of bytes written does not exceed the minimum number of bytes needed to hold all bits of the type. For example, storing an i24 writes at most three bytes. When writing a value of a type like i20 with a size that is not an integral number of bytes, it is unspecified what happens to the extra bits that do not belong to the type, but they will typically be overwritten.

Example:
  %ptr = alloca i32                               ; yields {i32*}:ptr
  store i32 3, i32* %ptr                          ; yields {void}
  %val = load i32* %ptr                           ; yields {i32}:val = i32 3
fence
Syntax:
  fence [singlethread] <ordering>                   ; yields {void}
Overview:

The ‘fence‘ instruction is used to introduce happens-before edges between operations.

Arguments:

‘fence‘ instructions take an ordering argument which defines what synchronizes-with edges they add. They can only be given acquire, release, acq_rel, and seq_cst orderings.

Semantics:

A fence A which has (at least) release ordering semantics synchronizes with a fence B with (at least) acquire ordering semantics if and only if there exist atomic operations X and Y, both operating on some atomic object M, such that A is sequenced before X, X modifies M (either directly or through some side effect of a sequence headed by X), Y is sequenced before B, and Y observes M. This provides a happens-before dependency between A and B. Rather than an explicit fence, one (but not both) of the atomic operations X or Y might provide a release or acquire (resp.) ordering constraint and still synchronize-with the explicit fence and establish the happens-before edge.

A fence which has seq_cst ordering, in addition to having both acquire and release semantics specified above, participates in the global program order of other seq_cst operations and/or fences.

The optional “singlethread” argument specifies that the fence only synchronizes with other fences in the same thread. (This is useful for interacting with signal handlers.)

Example:
  fence acquire                          ; yields {void}
  fence singlethread seq_cst             ; yields {void}
cmpxchg
Syntax:
  cmpxchg [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <ordering>  ; yields {ty}
Overview:

The ‘cmpxchg‘ instruction is used to atomically modify memory. It loads a value in memory and compares it to a given value. If they are equal, it stores a new value into the memory.

Arguments:

There are three arguments to the ‘cmpxchg‘ instruction: an address to operate on, a value to compare to the value currently at that address, and a new value to place at that address if the compared values are equal. The type of ‘<cmp>‘ must be an integer type whose bit width is a power of two greater than or equal to eight and less than or equal to a target-specific size limit. ‘<cmp>‘ and ‘<new>‘ must have the same type, and the type of ‘<pointer>‘ must be a pointer to that type. If the cmpxchg is marked as volatile, then the optimizer is not allowed to modify the number or order of execution of this cmpxchg with other volatile operations.

The ordering argument specifies how this cmpxchg synchronizes with other atomic operations.

The optional “singlethread” argument declares that the cmpxchg is only atomic with respect to code (usually signal handlers) running in the same thread as the cmpxchg. Otherwise the cmpxchg is atomic with respect to all other code in the system.

The pointer passed into cmpxchg must have alignment greater than or equal to the size in memory of the operand.

Semantics:

The contents of memory at the location specified by the ‘<pointer>‘ operand are read and compared to ‘<cmp>‘; if the read value is equal, ‘<new>‘ is written. The original value at the location is returned.

A successful cmpxchg is a read-modify-write instruction for the purpose of identifying release sequences. A failed cmpxchg is equivalent to an atomic load with an ordering parameter determined by dropping any release part of the cmpxchg‘s ordering.

Example:
entry:
  %orig = load atomic i32* %ptr unordered, align 4          ; yields {i32}
  br label %loop

loop:
  %cmp = phi i32 [ %orig, %entry ], [%old, %loop]
  %squared = mul i32 %cmp, %cmp
  %old = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel  ; yields {i32}
  %success = icmp eq i32 %cmp, %old
  br i1 %success, label %done, label %loop

done:
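The retry loop in the example can be modeled in ordinary Python. This is only a single-threaded sketch of the control flow: the `Cell` class and both function names are hypothetical, and nothing here is actually atomic.

```python
class Cell:
    """A single memory location; stands in for the i32* %ptr operand."""

    def __init__(self, value):
        self.value = value


def cmpxchg(cell, cmp, new):
    """Store `new` only if the cell holds `cmp`; always return the old value."""
    old = cell.value
    if old == cmp:
        cell.value = new
    return old


def atomic_square(cell, bits=32):
    """Mirror of the entry/loop/done blocks: retry until the swap succeeds."""
    mask = (1 << bits) - 1
    cmp = cell.value                      # %orig: initial load
    while True:
        squared = (cmp * cmp) & mask      # mul wraps modulo 2**bits
        old = cmpxchg(cell, cmp, squared)
        if old == cmp:                    # %success
            return old
        cmp = old                         # the phi takes %old on the back edge
```

In a real concurrent setting the comparison fails whenever another thread changed the location between the load and the cmpxchg, and the loop retries with the freshly observed value.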
atomicrmw
Syntax:
  atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering>                   ; yields {ty}
Overview:

The ‘atomicrmw‘ instruction is used to atomically modify memory.

Arguments:

There are three arguments to the ‘atomicrmw‘ instruction: an operation to apply, an address whose value to modify, and an argument to the operation. The operation must be one of the following keywords:

  • xchg
  • add
  • sub
  • and
  • nand
  • or
  • xor
  • max
  • min
  • umax
  • umin

The type of ‘<value>‘ must be an integer type whose bit width is a power of two greater than or equal to eight and less than or equal to a target-specific size limit. The type of the ‘<pointer>‘ operand must be a pointer to that type. If the atomicrmw is marked as volatile, then the optimizer is not allowed to modify the number or order of execution of this atomicrmw with other volatile operations.

Semantics:

The contents of memory at the location specified by the ‘<pointer>‘ operand are atomically read, modified, and written back. The original value at the location is returned. The modification is specified by the operation argument:

  • xchg: *ptr = val
  • add: *ptr = *ptr + val
  • sub: *ptr = *ptr - val
  • and: *ptr = *ptr & val
  • nand: *ptr = ~(*ptr & val)
  • or: *ptr = *ptr | val
  • xor: *ptr = *ptr ^ val
  • max: *ptr = *ptr > val ? *ptr : val (using a signed comparison)
  • min: *ptr = *ptr < val ? *ptr : val (using a signed comparison)
  • umax: *ptr = *ptr > val ? *ptr : val (using an unsigned comparison)
  • umin: *ptr = *ptr < val ? *ptr : val (using an unsigned comparison)
Example:
  %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields {i32}
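The operation table above can be written out as a small Python model. It is a sketch under stated assumptions: memory is a plain dict, values are kept as unsigned bit patterns of a given width, and atomicity is not modeled. The names `atomicrmw` and `as_signed` are illustrative only.

```python
def as_signed(x, bits):
    """Reinterpret the low `bits` bits of x as a two's complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def atomicrmw(mem, addr, op, val, bits=32):
    """Apply `op` to mem[addr] and return the value that was there before."""
    mask = (1 << bits) - 1
    old = mem[addr] & mask
    val &= mask
    signed_gt = as_signed(old, bits) > as_signed(val, bits)
    signed_lt = as_signed(old, bits) < as_signed(val, bits)
    new = {
        "xchg": val,
        "add": old + val,
        "sub": old - val,
        "and": old & val,
        "nand": ~(old & val),
        "or": old | val,
        "xor": old ^ val,
        "max": old if signed_gt else val,    # signed comparison
        "min": old if signed_lt else val,    # signed comparison
        "umax": old if old > val else val,   # unsigned comparison
        "umin": old if old < val else val,   # unsigned comparison
    }[op]
    mem[addr] = new & mask  # wrap the result to the operand width
    return old
```

The difference between max and umax shows up when the stored bit pattern has its top bit set: an i8 holding 0xFF is -1 for max but 255 for umax.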
getelementptr
Syntax:
  <result> = getelementptr <pty>* <ptrval>{, <ty> <idx>}*
  <result> = getelementptr inbounds <pty>* <ptrval>{, <ty> <idx>}*
  <result> = getelementptr <ptr vector> ptrval, <vector index type> idx 
Overview:

The ‘getelementptr‘ instruction is used to get the address of a subelement of an aggregate data structure. It performs address calculation only and does not access memory.

Arguments:

The first argument is always a pointer or a vector of pointers, and forms the basis of the calculation. The remaining arguments are indices that indicate which of the elements of the aggregate object are indexed. The interpretation of each index is dependent on the type being indexed into. The first index always indexes the pointer value given as the first argument, the second index indexes a value of the type pointed to (not necessarily the value directly pointed to, since the first index can be non-zero), etc. The first type indexed into must be a pointer value, subsequent types can be arrays, vectors, and structs. Note that subsequent types being indexed into can never be pointers, since that would require loading the pointer before continuing calculation.

The type of each index argument depends on the type it is indexing into. When indexing into an (optionally packed) structure, only i32 integer constants are allowed. When indexing into an array, pointer or vector, integers of any width are allowed, and they are not required to be constant. These integers are treated as signed values where relevant.

For example, let’s consider a C code fragment and how it gets compiled to LLVM:

struct RT {
  char A;
  int B[10][20];
  char C;
};
struct ST {
  int X;
  double Y;
  struct RT Z;
};

int *foo(struct ST *s) {
  return &s[1].Z.B[5][13];
}

The LLVM code generated by Clang is:

%struct.RT = type { i8, [10 x [20 x i32]], i8 }
%struct.ST = type { i32, double, %struct.RT }

define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
entry:
  %arrayidx = getelementptr inbounds %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
  ret i32* %arrayidx
}
Semantics:

In the example above, the first index is indexing into the ‘%struct.ST*‘ type, which is a pointer, yielding a ‘%struct.ST‘ = ‘{ i32, double, %struct.RT }‘ type, a structure. The second index indexes into the third element of the structure, yielding a ‘%struct.RT‘ = ‘{ i8 , [10 x [20 x i32]], i8 }‘ type, another structure. The third index indexes into the second element of the structure, yielding a ‘[10 x [20 x i32]]‘ type, an array. The two dimensions of the array are subscripted into, yielding an ‘i32‘ type. The ‘getelementptr‘ instruction returns a pointer to this element, thus computing a value of ‘i32*‘ type.

Note that it is perfectly legal to index partially through a structure, returning a pointer to an inner element. Because of this, the LLVM code for the given testcase is equivalent to:

define i32* @foo(%struct.ST* %s) {
  %t1 = getelementptr %struct.ST* %s, i32 1                 ; yields %struct.ST*:%t1
  %t2 = getelementptr %struct.ST* %t1, i32 0, i32 2         ; yields %struct.RT*:%t2
  %t3 = getelementptr %struct.RT* %t2, i32 0, i32 1         ; yields [10 x [20 x i32]]*:%t3
  %t4 = getelementptr [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
  %t5 = getelementptr [20 x i32]* %t4, i32 0, i32 13        ; yields i32*:%t5
  ret i32* %t5
}
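To see what address arithmetic the five indices stand for, the same structs can be declared with Python's standard `ctypes` module and the byte offset summed term by term. This is a sketch: the layout comes from the host C ABI as seen by `ctypes`, which may differ from the target LLVM is compiling for, and `foo_offset` is a name chosen here for illustration.

```python
import ctypes


class RT(ctypes.Structure):
    _fields_ = [("A", ctypes.c_char),
                ("B", (ctypes.c_int * 20) * 10),   # int B[10][20]
                ("C", ctypes.c_char)]


class ST(ctypes.Structure):
    _fields_ = [("X", ctypes.c_int),
                ("Y", ctypes.c_double),
                ("Z", RT)]


INT_SIZE = ctypes.sizeof(ctypes.c_int)


def foo_offset():
    """Byte offset of &s[1].Z.B[5][13] from s, one term per GEP index."""
    return (1 * ctypes.sizeof(ST)     # i64 1: step over one whole ST
            + ST.Z.offset             # i32 2: field Z inside ST
            + RT.B.offset             # i32 1: field B inside RT
            + 5 * 20 * INT_SIZE       # i64 5: row 5 of [10 x [20 x i32]]
            + 13 * INT_SIZE)          # i64 13: element 13 of [20 x i32]


# Cross-check against the address ctypes itself computes for s[1].Z.B[5][13].
s = (ST * 2)()
actual = ctypes.addressof(s[1].Z.B[5]) + 13 * INT_SIZE - ctypes.addressof(s)
```

Note that no memory is read while forming the sum, which matches the statement that getelementptr performs address calculation only.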

If the inbounds keyword is present, the result value of the getelementptr is a poison value if the base pointer is not an in bounds address of an allocated object, or if any of the addresses that would be formed by successive addition of the offsets implied by the indices to the base address with infinitely precise signed arithmetic are not an in bounds address of that allocated object. The in bounds addresses for an allocated object are all the addresses that point into the object, plus the address one byte past the end. In cases where the base is a vector of pointers the inbounds keyword applies to each of the computations element-wise.

If the inbounds keyword is not present, the offsets are added to the base address with silently-wrapping two’s complement arithmetic. If the offsets have a different width from the pointer, they are sign-extended or truncated to the width of the pointer. The result value of the getelementptr may be outside the object pointed to by the base pointer. The result value may not necessarily be used to access memory though, even if it happens to point into allocated storage. See the Pointer Aliasing Rules section for more information.

The getelementptr instruction is often confusing. For some more insight into how it works, see the getelementptr FAQ.

Example:
    ; yields [12 x i8]*:aptr
    %aptr = getelementptr {i32, [12 x i8]}* %saptr, i64 0, i32 1
    ; yields i8*:vptr
    %vptr = getelementptr {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
    ; yields i8*:eptr
    %eptr = getelementptr [12 x i8]* %aptr, i64 0, i32 1
    ; yields i32*:iptr
    %iptr = getelementptr [10 x i32]* @arr, i16 0, i16 0

In cases where the pointer argument is a vector of pointers, only a single index may be used, and the number of vector elements has to be the same. For example:

 %A = getelementptr <4 x i8*> %ptrs, <4 x i64> %offsets
Conversion Operations
trunc .. to
Syntax:
  <result> = trunc <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘trunc‘ instruction truncates its operand to the type ty2.

Arguments:

The ‘trunc‘ instruction takes a value to trunc, and a type to trunc it to. Both types must be of integer types, or vectors of the same number of integers. The bit size of the value must be larger than the bit size of the destination type, ty2. Equal sized types are not allowed.

Semantics:

The ‘trunc‘ instruction truncates the high order bits in value and converts the remaining bits to ty2. Since the source size must be larger than the destination size, trunc cannot be a no-op cast. It will always truncate bits.

Example:
  %X = trunc i32 257 to i8                        ; yields i8:1
  %Y = trunc i32 123 to i1                        ; yields i1:true
  %Z = trunc i32 122 to i1                        ; yields i1:false
  %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
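On unsigned bit patterns, truncation is a mask of the low bits. A minimal Python sketch reproducing the examples above (the helper name `trunc` and its `to_bits` parameter are illustrative):

```python
def trunc(value, to_bits):
    """Keep only the low `to_bits` bits of `value`."""
    return value & ((1 << to_bits) - 1)


x = trunc(257, 8)                      # 0x101 -> 0x01
y = trunc(123, 1)                      # odd, so the low bit is 1 (true)
z = trunc(122, 1)                      # even, so the low bit is 0 (false)
w = [trunc(v, 8) for v in (8, 7)]      # vectors are truncated element by element
```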
zext .. to
Syntax:
  <result> = zext <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘zext‘ instruction zero extends its operand to type ty2.

Arguments:

The ‘zext‘ instruction takes a value to cast, and a type to cast it to. Both types must be of integer types, or vectors of the same number of integers. The bit size of the value must be smaller than the bit size of the destination type, ty2.

Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

Example:
  %X = zext i32 257 to i64              ; yields i64:257
  %Y = zext i1 true to i32              ; yields i32:1
  %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
sext .. to
Syntax:
  <result> = sext <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘sext‘ sign extends value to the type ty2.

Arguments:

The ‘sext‘ instruction takes a value to cast, and a type to cast it to. Both types must be of integer types, or vectors of the same number of integers. The bit size of the value must be smaller than the bit size of the destination type, ty2.

Semantics:

The ‘sext‘ instruction performs a sign extension by copying the sign bit (highest order bit) of the value until it reaches the bit size of the type ty2.

When sign extending from i1, the extension always results in -1 or 0.

Example:
  %X = sext i8  -1 to i16              ; yields i16:65535
  %Y = sext i1 true to i32             ; yields i32:-1
  %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
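The contrast between sign extension and the zero extension of the preceding section is easiest to see on bit patterns. In the Python sketch below values are held as unsigned integers, and `as_signed` shows how the same bits read as a two's complement number. All three helper names are illustrative.

```python
def zext(value, from_bits, to_bits):
    """Zero extension: the new high bits are all zero."""
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits, to_bits):
    """Sign extension: copy the top source bit into every new high bit."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def as_signed(value, bits):
    """Read a `bits`-wide pattern as a two's complement number."""
    return value - (1 << bits) if value >> (bits - 1) else value
```

For example, `sext(0xFF, 8, 16)` gives the pattern 0xFFFF, which is 65535 when printed as unsigned and -1 when read as signed. Extending the single bit of an i1 `true` gives 1 with zext and an all-ones pattern (-1) with sext.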
fptrunc .. to
Syntax:
  <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘fptrunc‘ instruction truncates value to type ty2.

Arguments:

The ‘fptrunc‘ instruction takes a floating point value to cast and a floating point type to cast it to. The size of value must be larger than the size of ty2. This implies that fptrunc cannot be used to make a no-op cast.

Semantics:

The ‘fptrunc‘ instruction truncates a value from a larger floating point type to a smaller floating point type. If the value cannot fit within the destination type, ty2, then the results are undefined.

Example:
  %X = fptrunc double 123.0 to float         ; yields float:123.0
  %Y = fptrunc double 1.0E+300 to float      ; yields undefined
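A double-to-float truncation can be imitated in Python by packing a value into four bytes with the standard `struct` module and unpacking it again. This is a sketch only: the rounding is whatever `struct` applies, and `None` is used here as a stand-in for the undefined result.

```python
import struct


def fptrunc_double_to_float(x):
    """Round a Python float (a double) to single precision and back."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        return None  # value does not fit in a float: result is undefined


exact = fptrunc_double_to_float(123.0)      # representable, unchanged
lossy = fptrunc_double_to_float(0.1)        # loses precision
too_big = fptrunc_double_to_float(1.0e300)  # out of range for float
```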
fpext .. to
Syntax:
  <result> = fpext <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘fpext‘ extends a floating point value to a larger floating point value.

Arguments:

The ‘fpext‘ instruction takes a floating point value to cast, and a floating point type to cast it to. The source type must be smaller than the destination type.

Semantics:

The ‘fpext‘ instruction extends the value from a smaller floating point type to a larger floating point type. The fpext cannot be used to make a no-op cast because it always changes bits. Use bitcast to make a no-op cast for a floating point cast.

Example:
  %X = fpext float 3.125 to double         ; yields double:3.125000e+00
  %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000
fptoui .. to
Syntax:
  <result> = fptoui <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘fptoui‘ converts a floating point value to its unsigned integer equivalent of type ty2.

Arguments:

The ‘fptoui‘ instruction takes a value to cast, which must be a scalar or vector floating point value, and a type to cast it to, ty2, which must be an integer type. If ty is a vector floating point type, ty2 must be a vector integer type with the same number of elements as ty.

Semantics:

The ‘fptoui‘ instruction converts its floating point operand into the nearest (rounding towards zero) unsigned integer value. If the value cannot fit in ty2, the results are undefined.

Example:
  %X = fptoui double 123.0 to i32      ; yields i32:123
  %Y = fptoui float 1.0E+300 to i1     ; yields undefined:1
  %Z = fptoui float 1.04E+17 to i8     ; yields undefined:1
fptosi .. to
Syntax:
  <result> = fptosi <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘fptosi‘ instruction converts a floating point value to type ty2.

Arguments:

The ‘fptosi‘ instruction takes a value to cast, which must be a scalar or vector floating point value, and a type to cast it to, ty2, which must be an integer type. If ty is a vector floating point type, ty2 must be a vector integer type with the same number of elements as ty.

Semantics:

The ‘fptosi‘ instruction converts its floating point operand into the nearest (rounding towards zero) signed integer value. If the value cannot fit in ty2, the results are undefined.

Example:
  %X = fptosi double -123.0 to i32      ; yields i32:-123
  %Y = fptosi float 1.0E-247 to i1      ; yields undefined:1
  %Z = fptosi float 1.04E+17 to i8      ; yields undefined:1
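Both fptoui and fptosi round toward zero and leave the result undefined when the value does not fit. A Python sketch of that rule, with `None` standing in for the undefined result (the function names simply reuse the instruction names for readability):

```python
import math


def fptosi(x, bits):
    """Round toward zero, then check the signed range of a `bits`-wide integer."""
    if not math.isfinite(x):
        return None
    n = math.trunc(x)
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return n if lo <= n <= hi else None


def fptoui(x, bits):
    """Round toward zero, then check the unsigned range of a `bits`-wide integer."""
    if not math.isfinite(x):
        return None
    n = math.trunc(x)
    return n if 0 <= n <= (1 << bits) - 1 else None
```

Rounding toward zero means `fptosi(-2.9, 8)` is -2, not -3.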
uitofp .. to
Syntax:
  <result> = uitofp <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘uitofp‘ instruction regards value as an unsigned integer and converts that value to the ty2 type.

Arguments:

The ‘uitofp‘ instruction takes a value to cast, which must be a scalar or vector integer value, and a type to cast it to, ty2, which must be a floating point type. If ty is a vector integer type, ty2 must be a vector floating point type with the same number of elements as ty.

Semantics:

The ‘uitofp‘ instruction interprets its operand as an unsigned integer quantity and converts it to the corresponding floating point value. If the value cannot fit in the floating point value, the results are undefined.

Example:
  %X = uitofp i32 257 to float         ; yields float:257.0
  %Y = uitofp i8 -1 to double          ; yields double:255.0
sitofp .. to
Syntax:
  <result> = sitofp <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘sitofp‘ instruction regards value as a signed integer and converts that value to the ty2 type.

Arguments:

The ‘sitofp‘ instruction takes a value to cast, which must be a scalar or vector integer value, and a type to cast it to, ty2, which must be a floating point type. If ty is a vector integer type, ty2 must be a vector floating point type with the same number of elements as ty.

Semantics:

The ‘sitofp‘ instruction interprets its operand as a signed integer quantity and converts it to the corresponding floating point value. If the value cannot fit in the floating point value, the results are undefined.

Example:
  %X = sitofp i32 257 to float         ; yields float:257.0
  %Y = sitofp i8 -1 to double          ; yields double:-1.0
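The only difference between uitofp and sitofp is how the source bits are read before conversion. A Python sketch, assuming the destination is a double (Python's `float`) and using illustrative helper names:

```python
def uitofp(value, bits):
    """Read the low `bits` bits as unsigned, then convert to floating point."""
    return float(value & ((1 << bits) - 1))


def sitofp(value, bits):
    """Read the low `bits` bits as two's complement, then convert."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return float(value)
```

The i8 pattern for -1 is 0xFF, so it converts to 255.0 with uitofp and to -1.0 with sitofp, matching the two examples.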
ptrtoint .. to
Syntax:
  <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘ptrtoint‘ instruction converts the pointer or a vector of pointers value to the integer (or vector of integers) type ty2.

Arguments:

The ‘ptrtoint‘ instruction takes a value to cast, which must be a value of type pointer or a vector of pointers, and a type to cast it to, ty2, which must be an integer or a vector of integers type.

Semantics:

The ‘ptrtoint‘ instruction converts value to integer type ty2 by interpreting the pointer value as an integer and either truncating or zero extending that value to the size of the integer type. If value is smaller than ty2 then a zero extension is done. If value is larger than ty2 then a truncation is done. If they are the same size, then nothing is done (no-op cast) other than a type change.

Example:
  %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
  %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
  %Z = ptrtoint <4 x i32*> %P to <4 x i64>            ; yields vector zero extension for a vector of addresses on 32-bit architecture
inttoptr .. to
Syntax:
  <result> = inttoptr <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘inttoptr‘ instruction converts an integer value to a pointer type, ty2.

Arguments:

The ‘inttoptr‘ instruction takes an integer value to cast, and a type to cast it to, which must be a pointer type.

Semantics:

The ‘inttoptr‘ instruction converts value to type ty2 by applying either a zero extension or a truncation depending on the size of the integer value. If value is larger than the size of a pointer then a truncation is done. If value is smaller than the size of a pointer then a zero extension is done. If they are the same size, nothing is done (no-op cast).

Example:
  %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
  %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
  %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
  %Z = inttoptr <4 x i32> %G to <4 x i8*>; yields truncation of vector G to four pointers
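Both ptrtoint and inttoptr reduce to the same width adjustment once a pointer is viewed as an unsigned address. A Python sketch with the pointer width passed in explicitly (`ptr_bits` and the function names are assumptions of this model, since the real width comes from the target):

```python
def resize_unsigned(value, from_bits, to_bits):
    """Truncate when narrowing; widening leaves the new high bits zero."""
    value &= (1 << from_bits) - 1
    return value & ((1 << to_bits) - 1)


def ptrtoint(address, ptr_bits, int_bits):
    return resize_unsigned(address, ptr_bits, int_bits)


def inttoptr(value, int_bits, ptr_bits):
    return resize_unsigned(value, int_bits, ptr_bits)
```

With `ptr_bits=32`, converting to i8 keeps only the low byte of the address, while converting to i64 leaves the address unchanged with zeros above it.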
bitcast .. to
Syntax:
  <result> = bitcast <ty> <value> to <ty2>             ; yields ty2
Overview:

The ‘bitcast‘ instruction converts value to type ty2 without changing any bits.

Arguments:

The ‘bitcast‘ instruction takes a value to cast, which must be a non-aggregate first class value, and a type to cast it to, which must also be a non-aggregate first class type. The bit sizes of value and the destination type, ty2, must be identical. If the source type is a pointer, the destination type must also be a pointer. This instruction supports bitwise conversion of vectors to integers and to vectors of other types (as long as they have the same size).

Semantics:

The ‘bitcast‘ instruction converts value to type ty2. It is always a no-op cast because no bits change with this conversion. The conversion is done as if the value had been stored to memory and read back as type ty2. Pointer (or vector of pointers) types may only be converted to other pointer (or vector of pointers) types with this instruction. To convert pointers to other types, use the inttoptr or ptrtoint instructions first.

Example:
  %X = bitcast i8 255 to i8              ; yields i8 :-1
  %Y = bitcast i32* %x to i32*           ; yields i32*:%x
  %Z = bitcast <2 x i32> %V to i64       ; yields i64: %V
  %Z = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>
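The "stored to memory and read back" description can be tried directly with Python's standard `struct` module, which writes a value as bytes and reinterprets them as another type. The sketch assumes a little-endian byte order, and the function names are illustrative.

```python
import struct


def bitcast_float_to_i32(x):
    """Reinterpret the four bytes of a float as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bitcast_i32_to_float(bits):
    """Reinterpret an unsigned 32-bit integer as a float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bitcast_2xi32_to_i64(vec):
    """Reinterpret two 32-bit lanes as one 64-bit integer (little-endian)."""
    return struct.unpack("<Q", struct.pack("<2I", *vec))[0]


def bitcast_u8_to_s8(x):
    """Same eight bits, read as signed instead of unsigned."""
    return struct.unpack("<b", struct.pack("<B", x))[0]
```

No bit changes in any of these conversions. Only the interpretation does, which is why the i8 pattern 255 reads back as -1.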
Other Operations
icmp
Syntax:
  <result> = icmp <cond> <ty> <op1>, <op2>   ; yields {i1} or {<N x i1>}:result
Overview:

The ‘icmp‘ instruction returns a boolean value or a vector of boolean values based on comparison of its two integer, integer vector, pointer, or pointer vector operands.

Arguments:

The ‘icmp‘ instruction takes three operands. The first operand is the condition code indicating the kind of comparison to perform. It is not a value, just a keyword. The possible condition codes are:

  1. eq: equal
  2. ne: not equal
  3. ugt: unsigned greater than
  4. uge: unsigned greater or equal
  5. ult: unsigned less than
  6. ule: unsigned less or equal
  7. sgt: signed greater than
  8. sge: signed greater or equal
  9. slt: signed less than
  10. sle: signed less or equal

The remaining two arguments must be integer or pointer or integer vector typed. They must also be identical types.

Semantics:

The ‘icmp‘ compares op1 and op2 according to the condition code given as cond. The comparison performed always yields either an i1 or vector of i1 result, as follows:

  1. eq: yields true if the operands are equal, false otherwise. No sign interpretation is necessary or performed.
  2. ne: yields true if the operands are unequal, false otherwise. No sign interpretation is necessary or performed.
  3. ugt: interprets the operands as unsigned values and yields true if op1 is greater than op2.
  4. uge: interprets the operands as unsigned values and yields true if op1 is greater than or equal to op2.
  5. ult: interprets the operands as unsigned values and yields true if op1 is less than op2.
  6. ule: interprets the operands as unsigned values and yields true if op1 is less than or equal to op2.
  7. sgt: interprets the operands as signed values and yields true if op1 is greater than op2.
  8. sge: interprets the operands as signed values and yields true if op1 is greater than or equal to op2.
  9. slt: interprets the operands as signed values and yields true if op1 is less than op2.
  10. sle: interprets the operands as signed values and yields true if op1 is less than or equal to op2.

If the operands are pointer typed, the pointer values are compared as if they were integers.

If the operands are integer vectors, then they are compared element by element. The result is an i1 vector with the same number of elements as the values being compared. Otherwise, the result is an i1.

Example:
  <result> = icmp eq i32 4, 5          ; yields: result=false
  <result> = icmp ne float* %X, %X     ; yields: result=false
  <result> = icmp ult i16  4, 5        ; yields: result=true
  <result> = icmp sgt i16  4, 5        ; yields: result=false
  <result> = icmp ule i16 -4, 5        ; yields: result=false
  <result> = icmp sge i16  4, 5        ; yields: result=false

Note that the code generator does not yet support vector types with the icmp instruction.
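
The difference between the signed and unsigned condition codes can be modeled outside LLVM. The following Python sketch is illustrative only: the function name icmp and its explicit width parameter are hypothetical, and it covers scalar integer operands, not pointers or vectors.

```python
def icmp(cond, width, op1, op2):
    """Model a scalar integer icmp on operands that are `width` bits wide."""
    mask = (1 << width) - 1
    # Reduce both operands to their raw bit patterns (the unsigned view).
    u1, u2 = op1 & mask, op2 & mask

    def signed(bits):
        # Reinterpret a bit pattern as a two's complement value.
        return bits - (1 << width) if bits >> (width - 1) else bits

    s1, s2 = signed(u1), signed(u2)
    results = {
        # eq and ne need no sign interpretation.
        "eq": u1 == u2, "ne": u1 != u2,
        # Unsigned condition codes compare the raw bit patterns.
        "ugt": u1 > u2, "uge": u1 >= u2, "ult": u1 < u2, "ule": u1 <= u2,
        # Signed condition codes compare the two's complement values.
        "sgt": s1 > s2, "sge": s1 >= s2, "slt": s1 < s2, "sle": s1 <= s2,
    }
    return results[cond]
```

Under this model, icmp("ule", 16, -4, 5) is false because the bit pattern of -4 reads as 65532 when unsigned, while icmp("sle", 16, -4, 5) is true.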

fcmp
Syntax:
  <result> = fcmp <cond> <ty> <op1>, <op2>     ; yields {i1} or {<N x i1>}:result
Overview:

The ‘fcmp’ instruction returns a boolean value or vector of boolean values based on comparison of its operands.

If the operands are floating point scalars, then the result type is a boolean (i1).

If the operands are floating point vectors, then the result type is a vector of boolean with the same number of elements as the operands being compared.

Arguments:

The ‘fcmp’ instruction takes three operands. The first operand is the condition code indicating the kind of comparison to perform. It is not a value, just a keyword. The possible condition codes are:

  1. false: no comparison, always returns false
  2. oeq: ordered and equal
  3. ogt: ordered and greater than
  4. oge: ordered and greater than or equal
  5. olt: ordered and less than
  6. ole: ordered and less than or equal
  7. one: ordered and not equal
  8. ord: ordered (no nans)
  9. ueq: unordered or equal
  10. ugt: unordered or greater than
  11. uge: unordered or greater than or equal
  12. ult: unordered or less than
  13. ule: unordered or less than or equal
  14. une: unordered or not equal
  15. uno: unordered (either nans)
  16. true: no comparison, always returns true

Ordered means that neither operand is a QNAN while unordered means that either operand may be a QNAN.

The op1 and op2 arguments must each be of floating point type or of floating point vector type. They must have identical types.

Semantics:

The ‘fcmp‘ instruction compares op1 and op2 according to the condition code given as cond. If the operands are vectors, then the vectors are compared element by element. Each comparison performed always yields an i1 result, as follows:

  1. false: always yields false, regardless of operands.
  2. oeq: yields true if both operands are not a QNAN and op1 is equal to op2.
  3. ogt: yields true if both operands are not a QNAN and op1 is greater than op2.
  4. oge: yields true if both operands are not a QNAN and op1 is greater than or equal to op2.
  5. olt: yields true if both operands are not a QNAN and op1 is less than op2.
  6. ole: yields true if both operands are not a QNAN and op1 is less than or equal to op2.
  7. one: yields true if both operands are not a QNAN and op1 is not equal to op2.
  8. ord: yields true if both operands are not a QNAN.
  9. ueq: yields true if either operand is a QNAN or op1 is equal to op2.
  10. ugt: yields true if either operand is a QNAN or op1 is greater than op2.
  11. uge: yields true if either operand is a QNAN or op1 is greater than or equal to op2.
  12. ult: yields true if either operand is a QNAN or op1 is less than op2.
  13. ule: yields true if either operand is a QNAN or op1 is less than or equal to op2.
  14. une: yields true if either operand is a QNAN or op1 is not equal to op2.
  15. uno: yields true if either operand is a QNAN.
  16. true: always yields true, regardless of operands.
Example:
  <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
  <result> = fcmp one float 4.0, 5.0    ; yields: result=true
  <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
  <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false

Note that the code generator does not yet support vector types with the fcmp instruction.
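
The ordered and unordered condition codes differ only in how a NaN operand is treated. The following Python sketch models the scalar case; the function name fcmp is hypothetical, and Python floats stand in for LLVM floating point values.

```python
import math


def fcmp(cond, op1, op2):
    """Model a scalar fcmp; returns the boolean (i1) result."""
    if cond == "false":
        return False
    if cond == "true":
        return True
    # The pair is unordered when either operand is a NaN.
    unordered = math.isnan(op1) or math.isnan(op2)
    if cond == "ord":
        return not unordered
    if cond == "uno":
        return unordered
    # The last two letters name the relation being tested.
    relation = {
        "eq": op1 == op2, "ne": op1 != op2,
        "gt": op1 > op2, "ge": op1 >= op2,
        "lt": op1 < op2, "le": op1 <= op2,
    }[cond[1:]]
    if cond[0] == "o":
        # Ordered: no NaN operand, and the relation holds.
        return (not unordered) and relation
    # Unordered: a NaN operand, or the relation holds.
    return unordered or relation
```

With a NaN operand every ordered comparison in this model is false and every unordered comparison is true, so ‘one’ and ‘une’ disagree exactly when a NaN is present.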

phi
Syntax:
  <result> = phi <ty> [ <val0>, <label0>], ...
Overview:

The ‘phi‘ instruction is used to implement the φ node in the SSA graph representing the function.

Arguments:

The type of the incoming values is specified with the first type field. After this, the ‘phi‘ instruction takes a list of pairs as arguments, with one pair for each predecessor basic block of the current block. Only values of first class type may be used as the value arguments to the PHI node. Only labels may be used as the label arguments.

There must be no non-phi instructions between the start of a basic block and the PHI instructions: i.e. PHI instructions must be first in a basic block.

For the purposes of the SSA form, the use of each incoming value is deemed to occur on the edge from the corresponding predecessor block to the current block (but after any definition of an ‘invoke‘ instruction’s return value on the same edge).

Semantics:

At runtime, the ‘phi‘ instruction logically takes on the value specified by the pair corresponding to the predecessor basic block that executed just prior to the current block.

Example:
Loop:       ; Infinite loop that counts from 0 on up...
  %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
  %nextindvar = add i32 %indvar, 1
  br label %Loop
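
The way a phi takes its value from the predecessor block can be modeled outside LLVM. The following Python sketch is illustrative only: the helper names phi and run_loop are hypothetical, and the iteration bound is an assumption added so that the model terminates, since the loop above is infinite.

```python
def phi(incoming, predecessor):
    """Return the value paired with the block that executed just before."""
    return incoming[predecessor]


def run_loop(iterations):
    """Simulate the Loop block above for a bounded number of iterations."""
    seen = []
    predecessor = "LoopHeader"  # the first entry arrives from %LoopHeader
    nextindvar = None           # not yet defined on the first entry
    for _ in range(iterations):
        # %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
        indvar = phi({"LoopHeader": 0, "Loop": nextindvar}, predecessor)
        # %nextindvar = add i32 %indvar, 1
        nextindvar = indvar + 1
        seen.append(indvar)
        # br label %Loop: the next entry arrives from %Loop itself
        predecessor = "Loop"
    return seen
```

On the first entry only the %LoopHeader pair is consulted, which is why %nextindvar may be referenced before the instruction that defines it.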
select
Syntax:
  <result> = select selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty

  selty is either i1 or {<N x i1>}
Overview:

The ‘select‘ instruction is used to choose one value based on a condition, without branching.

Arguments:

The ‘select’ instruction requires an ‘i1’ value or a vector of ‘i1’ values indicating the condition, and two values of the same first class type. If val1 and val2 are vectors and the condition is a scalar, then entire vectors are selected, not individual elements.

Semantics:

If the condition is an i1 and it evaluates to 1, the instruction returns the first value argument; otherwise, it returns the second value argument.

If the condition is a vector of i1, then the value arguments must be vectors of the same size, and the selection is done element by element.

Example:
  %X = select i1 true, i8 17, i8 42          ; yields i8:17
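
The scalar and vector forms of the condition can be contrasted in a small model. The following Python sketch is illustrative only: the function name select is hypothetical, and Python lists stand in for LLVM vectors.

```python
def select(cond, val1, val2):
    """Model select for a scalar i1 condition or a vector of i1 conditions."""
    if isinstance(cond, (list, tuple)):
        # Vector condition: the operands must be vectors of the same size,
        # and the choice is made element by element.
        if not (len(cond) == len(val1) == len(val2)):
            raise ValueError("condition and value vectors must match in size")
        return [a if c else b for c, a, b in zip(cond, val1, val2)]
    # Scalar condition: the whole first or second value is returned,
    # even when the values themselves are vectors.
    return val1 if cond else val2
```

For example, a scalar false condition applied to two vectors returns the second vector whole, while a vector condition mixes elements from both.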
call
Syntax:
  <result> = [tail] call [cconv] [ret attrs] <ty> [<fnty>*] <fnptrval>(<function args>) [fn attrs]
Overview:

The ‘call‘ instruction represents a simple function call.

Arguments:

This instruction requires several arguments:

  1. The optional “tail” marker indicates that the callee function does not access any allocas or varargs in the caller. Note that calls may be marked “tail” even if they do not occur before a ret instruction. If the “tail” marker is present, the function call is eligible for tail call optimization, but might not in fact be optimized into a jump. The code generator may optimize calls marked “tail” with either 1) automatic sibling call optimization when the caller and callee have matching signatures, or 2) forced tail call optimization when the following extra requirements are met:
    • Caller and callee both have the calling convention fastcc.
    • The call is in tail position (ret immediately follows call and ret uses value of call or is void).
    • Option -tailcallopt is enabled, or llvm::GuaranteedTailCallOpt is true.
    • Platform specific constraints are met.
  2. The optional “cconv” marker indicates which calling convention the call should use. If none is specified, the call defaults to using C calling conventions. The calling convention of the call must match the calling convention of the target function, or else the behavior is undefined.
  3. The optional Parameter Attributes list for return values. Only ‘zeroext’, ‘signext’, and ‘inreg’ attributes are valid here.
  4. ‘ty’: the type of the call instruction itself, which is also the type of the return value. Functions that return no value are marked void.
  5. ‘fnty’: shall be the signature of the pointer to function value being invoked. The argument types must match the types implied by this signature. This type can be omitted if the function is not varargs and if the function type does not return a pointer to a function.
  6. ‘fnptrval’: An LLVM value containing a pointer to a function to be invoked. In most cases, this is a direct function invocation, but indirect calls are just as possible, calling an arbitrary pointer to function value.
  7. ‘function args’: argument list whose types match the function signature argument types and parameter attributes. All arguments must be of first class type. If the function signature indicates the function accepts a variable number of arguments, the extra arguments can be specified.
  8. The optional function attributes list. Only ‘noreturn’, ‘nounwind’, ‘readonly’ and ‘readnone’ attributes are valid here.
Semantics:

The ‘call‘ instruction is used to cause control flow to transfer to a specified function, with its incoming arguments bound to the specified values. Upon a ‘ret‘ instruction in the called function, control flow continues with the instruction after the function call, and the return value of the function is bound to the result argument.

Example:
  %retval = call i32 @test(i32 %argc)
  call i32 (i8*, ...)* @printf(i8* %msg, i32 12, i8 42)        ; yields i32
  %X = tail call i32 @foo()                                    ; yields i32
  %Y = tail call fastcc i32 @foo()  ; yields i32
  call void %foo(i8 97 signext)

  %struct.A = type { i32, i8 }
  %r = call %struct.A @foo()                        ; yields { i32, i8 }
  %gr = extractvalue %struct.A %r, 0                ; yields i32
  %gr1 = extractvalue %struct.A %r, 1               ; yields i8
  call void @foo() noreturn                         ; indicates that @foo never returns normally
  %ZZ = call zeroext i32 @bar()                     ; Return value is zero extended

LLVM treats calls to some functions with names and arguments that match the standard C99 library as being the C99 library functions, and may perform optimizations or generate code for them under that assumption. This is something we’d like to change in the future to provide better support for freestanding environments and non-C-based languages.

va_arg
Syntax:
  <resultval> = va_arg <va_list*> <arglist>, <argty>
Overview:

The ‘va_arg‘ instruction is used to access arguments passed through the “variable argument” area of a function call. It is used to implement the va_arg macro in C.

Arguments:

This instruction takes a va_list* value and the type of the argument. It returns a value of the specified argument type and increments the va_list to point to the next argument. The actual type of va_list is target specific.

Semantics:

The ‘va_arg‘ instruction loads an argument of the specified type from the specified va_list and causes the va_list to point to the next argument. For more information, see the variable argument handling Intrinsic Functions.

It is legal for this instruction to be called in a function which does not take a variable number of arguments, for example, the vfprintf function.

va_arg is an LLVM instruction instead of an intrinsic function because it takes a type as an argument.
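
The read-and-advance behavior described above can be modeled as a cursor over the extra arguments. The following Python sketch is illustrative only: the VaList class and va_arg function are hypothetical, a real va_list has a target specific layout, and the type check is an assumption of the model rather than something the instruction performs.

```python
class VaList:
    """A stand-in for a va_list: the extra arguments plus a cursor."""

    def __init__(self, args):
        self.args = list(args)
        self.index = 0


def va_arg(ap, argty):
    """Return the next argument as `argty` and advance the cursor."""
    value = ap.args[ap.index]
    # Model-only check: the requested type must match the stored value.
    if not isinstance(value, argty):
        raise TypeError("requested type does not match the next argument")
    ap.index += 1
    return value
```

Each call returns one argument and leaves the list pointing at the next, so successive calls walk the variable argument area in order.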

landingpad
Syntax:
  <resultval> = landingpad <resultty> personality <type> <pers_fn> <clause>+
  <resultval> = landingpad <resultty> personality <type> <pers_fn> cleanup <clause>*

  <clause> := catch <type> <value>
  <clause> := filter <array constant type> <array constant>
Overview:

The ‘landingpad’ instruction is used by LLVM’s exception handling system to specify that a basic block is a landing pad: one where the exception lands, and which corresponds to the code found in the catch portion of a try/catch sequence. It defines values supplied by the personality function (pers_fn) upon re-entry to the function. The resultval has the type resultty.

Arguments:

This instruction takes a pers_fn value. This is the personality function associated with the unwinding mechanism. The optional cleanup flag indicates that the landing pad block is a cleanup.

A clause begins with the clause type — catch or filter — and contains the global variable representing the “type” that may be caught or filtered respectively. Unlike the catch clause, the filter clause takes an array constant as its argument. Use “[0 x i8**] undef” for a filter which cannot throw. The ‘landingpad‘ instruction must contain at least one clause or the cleanup flag.

Semantics:

The ‘landingpad‘ instruction defines the values which are set by the personality function (pers_fn) upon re-entry to the function, and therefore the “result type” of the landingpad instruction. As with calling conventions, how the personality function results are represented in LLVM IR is target specific.

The clauses are applied in order from top to bottom. If two landingpad instructions are merged together through inlining, the clauses from the calling function are appended to the list of clauses. When the call stack is being unwound due to an exception being thrown, the exception is compared against each clause in turn. If it doesn’t match any of the clauses, and the cleanup flag is not set, then unwinding continues further up the call stack.

The landingpad instruction has several restrictions:

  • A landing pad block is a basic block which is the unwind destination of an ‘invoke‘ instruction.
  • A landing pad block must have a ‘landingpad‘ instruction as its first non-PHI instruction.
  • There can be only one ‘landingpad‘ instruction within the landing pad block.
  • A basic block that is not a landing pad block may not include a ‘landingpad‘ instruction.
  • All ‘landingpad‘ instructions in a function must have the same personality function.
Example:
  ;; A landing pad which can catch an integer.
  %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
           catch i8** @_ZTIi
  ;; A landing pad that is a cleanup.
  %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
           cleanup
  ;; A landing pad which can catch an integer and can only throw a double.
  %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
           catch i8** @_ZTIi
           filter [1 x i8**] [@_ZTId]
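
The in-order clause matching described under Semantics can be sketched as a small model. The following Python code is illustrative only: the function name landingpad_entered and the tuple encoding of clauses are hypothetical, type infos are represented by their symbol names, and the rule that a filter clause matches when the thrown type is not in its list is an assumption drawn from LLVM's exception handling model, not from the text above. The real decision is made by the personality function.

```python
def landingpad_entered(clauses, cleanup, thrown):
    """Decide whether unwinding stops at this landing pad.

    clauses: list of ("catch", typeinfo) or ("filter", [typeinfo, ...])
    cleanup: whether the cleanup flag is present
    thrown:  type info of the exception being thrown
    """
    # Clauses are applied in order from top to bottom.
    for kind, operand in clauses:
        if kind == "catch" and thrown == operand:
            return True
        # Assumed rule: a filter lists the types that may be thrown,
        # so it matches when the thrown type is not among them.
        if kind == "filter" and thrown not in operand:
            return True
    # No clause matched: the pad is entered only if it is a cleanup;
    # otherwise unwinding continues further up the call stack.
    return cleanup
```

Applied to the last example above, an integer is caught by the catch clause, a double passes the filter and keeps unwinding, and any other type is stopped by the filter.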