Qualcomm Hexagon: Programmer's Reference Manual
80-N2040-53 Rev. AA
October 3, 2022
All Qualcomm products mentioned herein are products of Qualcomm Technologies, Inc. and/or its subsidiaries.
Qualcomm and Hexagon are trademarks or registered trademarks of Qualcomm Incorporated. Other product and brand names may
be trademarks or registered trademarks of their respective owners.
This technical data may be subject to U.S. and international export, re-export, or transfer (“export”) laws. Diversion contrary to U.S.
and international law is strictly prohibited.
© 2022 Qualcomm Technologies, Inc. and/or its subsidiaries. All rights reserved.
Contents
Figures
Tables
1 Introduction
1.1 Hexagon V73 processor architecture
1.1.1 Memory
1.1.1.1 Cache memory
1.1.1.2 Virtual memory
1.1.2 Registers
1.1.3 Instruction sequencer
1.1.4 Execution units
1.1.5 Load/store units
1.2 Instruction set
1.2.1 Addressing modes
1.2.2 Program flow
1.2.3 Instruction pipeline
1.3 Technical assistance
2 Registers
2.1 General registers
2.2 Control registers
2.2.1 Program counter
2.2.2 Loop registers
2.2.3 User status register
2.2.4 Modifier registers
2.2.5 Predicate registers
2.2.6 Circular start registers
2.2.7 User general pointer register
2.2.8 Global pointer
2.2.9 Cycle count registers
2.2.10 Frame limit register
2.2.11 Frame key register
2.2.12 Packet count registers
2.2.13 Qtimer registers
Qualcomm Hexagon V73 Programmer’s Reference Manual
3 Instructions
3.1 Hexagon processor instruction syntax
3.1.1 Numeric operands
3.1.2 Terminology
3.1.3 Register operands
3.2 Instruction classes
3.3 Instruction packets
3.3.1 Packet execution semantics
3.3.2 Sequencing semantics
3.3.3 Resource constraints
3.3.4 Grouping constraints
3.3.5 Dependency constraints
3.3.6 Ordering constraints
3.3.7 Alignment constraints
3.4 Instruction intrinsics
3.5 Compound instructions
3.6 Duplex instructions
5 Memory
5.1 Memory model
5.1.1 Address space
5.1.2 Byte order
5.1.3 Alignment
5.2 Memory loads
5.3 Memory stores
5.4 Dual stores
5.5 Slot 1 store with slot 0 load
5.6 New-value stores
5.7 Mem-ops
5.8 Addressing modes
5.8.1 Absolute
5.8.2 Absolute-set
5.8.3 Absolute with register offset
5.8.4 Global pointer relative
5.8.5 Indirect
5.8.6 Indirect with offset
5.8.7 Indirect with register offset
5.8.8 Indirect with auto-increment immediate
5.8.9 Indirect with auto-increment register
5.8.10 Circular with auto-increment immediate
5.8.11 Circular with auto-increment register
5.8.12 Bit-reversed with auto-increment register
5.9 Conditional load/stores
5.10 Cache memory
5.10.1 Uncached memory
5.10.2 Tightly coupled memory
5.10.3 Cache maintenance operations
8.3.2 Calls
8.3.3 Returns
8.3.4 Extended branches
8.3.5 Branches to and from packets
8.4 Speculative jumps
8.5 Compare jumps
8.5.1 New-value compare jumps
8.6 Register transfer jumps
8.7 Dual jumps
8.8 Hint indirect jump target
8.9 Pauses
8.10 Exceptions
Vector reduce complex multiply by scalar with round and pack
Vector reduce complex rotate
11.10.4 XTYPE FP
Floating point addition
Classify floating point value
Compare floating point value
Convert floating point value to other format
Convert integer to floating point value
Convert floating point value to integer
Floating point extreme value assistance
Floating point fused multiply-add
Floating point fused multiply-add with scaling
Floating point reciprocal square root approximation
Floating point fused multiply-add for library routines
Create floating-point constant
Floating point maximum
Floating point minimum
Floating point multiply
Floating point reciprocal approximation
Floating point subtraction
11.10.5 XTYPE MPY
Vector multiply word by signed half (32 × 16)
Vector multiply word by unsigned half (32 × 16)
Multiply signed halfwords
Multiply unsigned halfwords
Polynomial multiply words
Vector reduce multiply word by signed half (32 × 16)
Multiply and use upper result
Multiply and use full result
Vector dual multiply
Vector dual multiply with round and pack
Vector reduce multiply bytes
Vector dual multiply signed by unsigned bytes
Vector multiply even halfwords
Vector multiply halfwords
Vector multiply halfwords with round and pack
Vector multiply halfwords signed by unsigned
Vector reduce multiply halfwords
Vector multiply bytes
Vector polynomial multiply halfwords
11.10.6 XTYPE PERM
Saturate
Swizzle bytes
Figures
Figure 1-1 Hexagon V73 processor architecture
Figure 1-2 Vector instruction example
Figure 1-3 Instruction classes and combinations
Figure 1-4 Register field symbols
Figure 2-1 General registers
Figure 2-2 Control registers
Figure 3-1 Packet grouping combinations
Figure 4-1 Vector byte operation
Figure 4-2 Vector halfword operation
Figure 4-3 Vector word operation
Figure 4-4 64-bit shift and add/sub/logical
Figure 4-5 Vector halfword shift right
Figure 5-1 Hexagon processor byte order
Figure 5-2 L2fetch instruction
Figure 6-1 Vector byte compare
Figure 6-2 Vector halfword compare
Figure 6-3 Vector mux instruction
Figure 7-1 Stack structure
Figure 10-1 Instruction packet encoding
Tables
Table 1-1 Register symbols
Table 1-2 Register bit field symbols
Table 1-3 Instruction operands
Table 1-4 Data symbols
Table 2-1 General register aliases
Table 2-2 General register pairs
Table 2-3 Aliased control registers
Table 2-4 Control register pairs
Table 2-5 Loop registers
Table 2-6 User status register
Table 2-7 Modifier registers (indirect auto-increment addressing)
Table 2-8 Modifier registers (circular addressing)
Table 2-9 Modifier registers (bit-reversed addressing)
Table 2-10 Predicate registers
Table 2-11 Circular start registers
Table 2-12 User general pointer register
Table 2-13 Global pointer register
Table 2-14 Cycle count registers
Table 2-15 Frame limit register
Table 2-16 Frame key register
Table 2-17 Packet count registers
Table 2-18 Qtimer registers
Table 3-1 Instruction symbols
Table 3-2 Instruction classes
Table 4-1 Single-precision multiply options
Table 4-2 Double precision multiply options
Table 4-3 Control register transfer instructions
Table 5-1 Memory alignment restrictions
Table 5-2 Load instructions
Table 5-3 Store instructions
Table 5-4 Mem-ops
Table 5-5 Addressing modes
Table 5-6 Offset ranges (global pointer relative)
Table 5-7 Offset ranges (indirect with offset)
Table 5-8 Increment ranges (indirect with auto-inc immediate)
Table 5-9 Increment ranges (circular with auto-inc immediate)
Table 5-10 Increment ranges (circular with auto-inc register)
Table 5-11 Addressing modes (conditional load/store)
Table 5-12 Conditional offset ranges (indirect with offset)
Table 5-13 Cache instructions (user-level)
Table 5-14 Memory ordering instructions
1 Introduction
The Qualcomm Hexagon™ processor is a general-purpose digital signal processor designed for
high performance and low power across a wide variety of multimedia and modem applications.
V73 is a member of the sixth generation of the Hexagon processor architecture.
1.1.1 Memory
The Hexagon processor features a unified byte-addressable memory. This memory has a single
32-bit virtual address space, which holds both instructions and data. It operates in little-endian
mode.
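The little-endian byte order can be illustrated with a minimal Python sketch. This is a model for illustration only, not Hexagon code; the memory size, the address, and the helper names `store_word` and `load_byte` are hypothetical.

```python
import struct

# Toy byte-addressable memory; the size and addresses are arbitrary.
memory = bytearray(16)

def store_word(addr: int, value: int) -> None:
    """Store a 32-bit word at a byte address in little-endian order."""
    memory[addr:addr + 4] = struct.pack("<I", value & 0xFFFFFFFF)

def load_byte(addr: int) -> int:
    """Load one unsigned byte from a byte address."""
    return memory[addr]

store_word(4, 0x11223344)
# The least significant byte sits at the lowest address.
low_to_high = [load_byte(4 + i) for i in range(4)]
print([hex(b) for b in low_to_high])  # ['0x44', '0x33', '0x22', '0x11']
```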
The load/store architecture supports a complete set of addressing modes for both compiler code
generation and DSP application programming.
1.1.2 Registers
The Hexagon processor has two sets of registers: general registers and control registers.
The general registers include thirty-two 32-bit registers (named R0 through R31), which are
accessed either as single registers or as aligned 64-bit register pairs. The general registers contain
all data, including pointer, scalar, vector, and accumulator data.
The control registers include special-purpose registers such as program counter, status register,
loop registers, and so on.
Figure 1-1 Hexagon V73 processor architecture
[Block diagram showing the sequencer (packets of 1-4 instructions, 4x32 bit), the four execution slots, the general registers (R0-R31), and the control registers.]
S0: Load/store unit (LD, ST, ALU32, MEMOP, NV, and SYSTEM instructions)
S1: Load/store unit (LD, ST, and ALU32 instructions)
S2: X unit (XTYPE, ALU32, J, and JR instructions)
S3: X unit (XTYPE, ALU32, J, and CR instructions)
Control registers: hardware loop registers, modifier registers, status register, program counter, predicate registers, user general pointer, global pointer, circular start registers
Auto-increment with register addressing uses one of the two dedicated address-modify registers
M0 and M1 (which are part of the control registers).
NOTE: Atomic memory operations (load locked/store conditional) are supported to implement
multi-thread synchronization.
The loop instructions support nestable loops, with few restrictions on their use.
Software branches use a predicated branch mechanism. Explicit compare instructions generate a
predicate bit, which is then tested by conditional branch instructions. For example:
P1 = cmp.eq(R2, R3)
if (P1) jump end
Jumps and subroutine calls are conditional or unconditional, and support both PC-relative and
register indirect addressing modes. For example:
jump end
jumpr R1
call function
callr R2
The subroutine call instructions store the return address in register R31. Subroutine returns are
performed using a jump indirect instruction through this register. For example:
jumpr R31 // Subroutine return
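The call and return behavior described above can be modeled with a short Python sketch. It is a toy model under stated assumptions: only the program counter and R31 are tracked, the return address is taken to be the address of the packet that follows the call, and the class name `Core` and all addresses are hypothetical.

```python
class Core:
    """Toy model that tracks only the program counter and R31."""

    def __init__(self, pc: int) -> None:
        self.pc = pc
        self.r31 = 0

    def call(self, target: int, next_packet: int) -> None:
        # The call stores the return address in R31, then transfers control.
        # Assumption: the return address is the packet after the call.
        self.r31 = next_packet
        self.pc = target

    def jumpr_r31(self) -> None:
        # Subroutine return: an indirect jump through R31.
        self.pc = self.r31

core = Core(pc=0x1000)
core.call(target=0x2000, next_packet=0x1004)
pc_in_callee = core.pc
core.jumpr_r31()
pc_after_return = core.pc
```

A nested call would overwrite R31 in this model, which is why a callee that itself makes calls has to preserve the return address first.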
2 Registers
General registers are used for general-purpose computation, including address generation, and
scalar and vector arithmetic.
Control registers support special-purpose processor features such as hardware loops and
predicates.
Figure 2-1 General registers
[Diagram showing the 32-bit registers R0 through R31 grouped into 64-bit register pairs: R1:0, R3:2, ..., R29:28, R31:30.]
Aliased registers
Three of the general registers – R29 through R31 – support subroutines (Section 8.3.2) and the
Software Stack. The subroutine and stack instructions implicitly modify the registers. They have
symbol aliases that indicate when these registers are accessed as subroutine and stack registers.
For example:
SP = add(SP, #-8) // SP is alias of R29
allocframe // Modifies SP (R29) and FP (R30)
call init // Modifies LR (R31)
Register pairs
The general registers can be specified as register pairs that represent a single 64-bit register. For
example:
R1:0 = memd(R3) // Load doubleword
R7:6 = valignb(R9:8, R7:6, #2) // Vector align
NOTE: The first register in a register pair must always be odd-numbered, and the second must be the
next lower register.
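The pairing rule in the note can be sketched in Python as follows. This is an illustrative model rather than Hexagon code: the helper names are hypothetical, and the sketch assumes that the odd-numbered register holds the upper 32 bits of the 64-bit value.

```python
MASK32 = 0xFFFFFFFF

regs = [0] * 32  # Model of R0..R31, each 32 bits wide.

def check_pair(hi: int, lo: int) -> None:
    """Reject pairs that are not odd:even with lo == hi - 1."""
    if hi % 2 != 1 or lo != hi - 1 or not 0 <= lo < 31:
        raise ValueError(f"R{hi}:{lo} is not a valid register pair")

def write_pair(hi: int, lo: int, value: int) -> None:
    """Write a 64-bit value to the pair Rhi:lo."""
    check_pair(hi, lo)
    regs[lo] = value & MASK32          # lower word (assumed layout)
    regs[hi] = (value >> 32) & MASK32  # upper word (assumed layout)

def read_pair(hi: int, lo: int) -> int:
    """Read the pair Rhi:lo as a single 64-bit value."""
    check_pair(hi, lo)
    return (regs[hi] << 32) | regs[lo]

write_pair(1, 0, 0x1122334455667788)
pair_value = read_pair(1, 0)
```

With this model, `R1:0` and `R31:30` are accepted, while a request such as `R2:1` raises `ValueError` because the first register is even.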
NOTE: When a control register is used in a register transfer, the other operand must be a general
register.
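Figure 2-2 Control registers
[Diagram of the control registers. Entries shown include the modifier registers (M0, M1), the packet count registers (PKTCOUNTLO, PKTCOUNTHI), the global pointer (GP), and the circular start registers (CS0, CS1).]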
Aliased registers
The control registers have numeric aliases (C0 through C31).
NOTE: The control register numbers (0 through 31) specify the control registers in Instruction Encodings.
Register pairs
The control registers can be specified as register pairs that represent a single 64-bit register.
Control registers specified as pairs must use their numeric aliases. For example:
C1:0 = R5:4 // C1:0 specifies the LC0/SA0 register pair
NOTE: The first register in a control register pair must always be odd-numbered, and the second must be
the next lower register.
NOTE: A user control register transfer to USR cannot be grouped in an instruction packet with a
floating point instruction.
When a transfer to USR changes the enable trap bits [29:25], an isync instruction (Section 5.11)
must execute before the new exception programming can take effect.
Indirect auto-increment
In indirect with auto-increment register addressing, the modifier registers store a signed 32-bit
value that specifies the increment (or decrement) value.
Circular
In circular addressing (Section 5.8.10), the modifier registers store the circular buffer length and
the related I (increment) field values.
Bit-reversed
In bit-reversed addressing (Section 5.8.12) the modifier registers store a signed 32-bit value that
specifies the increment (or decrement) value.
The four predicate registers can be specified as a register quadruple (P3:0) that represents a
single 32-bit register.
NOTE: Unlike the other control registers, the predicate registers are only 8 bits wide because vector
compares return a maximum of eight status results.
NOTE: The RTOS must grant permission to access these registers. Without this permission, reading these
registers from user code returns zero.
Packet counting can be configured to operate only in specific sets of processor modes (for
example, User mode only, or Guest and Monitor modes only). Bits [12:10] in the User status
register control the configuration for each mode.
Packets with exceptions are not counted as committed packets.
NOTE: Each hardware thread has its own set of packet count registers.
The RTOS must grant permission to access these registers. Without this permission, reading these
registers from user code returns zero.
When a value is written to a PKTCOUNT register, the 64-bit packet count value is incremented
before the value is stored in the register.
These registers are read-only – hardware automatically updates these registers to contain the
current Qtimer value.
NOTE: The RTOS must grant permission to access these registers. Without this permission, reading these
registers from user code returns zero.
3 Instructions
Instruction encoding is described in Chapter 10. For detailed descriptions of the Hexagon
processor instructions, see Chapter 11.
The item specified on the left-hand side (LHS) of the equation is assigned the value specified by
the right-hand side (RHS). For example:
R2 = add(R3,R1) // Add R3 and R1, assign result to R2
■ Courier font is used for instructions
■ Square brackets enclose optional items (for example, [:sat] means that saturation is
optional)
■ Braces indicate a choice of items (for example, {Rs,#s16} means that either Rs or a signed
16-bit immediate can be used)
The #uN, #sN, and #mN symbols specify immediate operands in instructions. The # symbol
appears in the actual instruction to indicate the immediate operand.
The #rN symbol specifies loop and branch destinations in instructions. The # symbol does not
appear in the actual instruction; instead, the entire #rN symbol (including its :S suffix) is
expressed as a loop or branch symbol whose numeric value is determined by the assembler and
linker. For example:
The :S suffix indicates that the S least-significant bits in a value are implied zero bits and
therefore not encoded in the instruction. The implied zero bits are called scale bits.
For example, #s4:2 denotes a signed immediate operand represented by four bits encoded in the
instruction, and two scale bits. The possible values for this operand are -32, -28, -24, -20, -16, -12,
-8, -4, 0, 4, 8, 12, 16, 20, 24, and 28.
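The range of a scaled signed immediate follows directly from the bit counts. The C sketch below is illustrative only (the helper names are hypothetical and not part of any Hexagon tool); it computes the limits of a #sN:S operand and checks whether a value can be encoded.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest value of a signed N-bit immediate with S implied-zero scale bits. */
int32_t scaled_imm_min(unsigned n_bits, unsigned scale_bits)
{
    assert(n_bits >= 1 && n_bits + scale_bits <= 31);
    return -(int32_t)(1u << (n_bits - 1)) * (int32_t)(1u << scale_bits);
}

/* Largest value of a signed N-bit immediate with S implied-zero scale bits. */
int32_t scaled_imm_max(unsigned n_bits, unsigned scale_bits)
{
    assert(n_bits >= 1 && n_bits + scale_bits <= 31);
    return ((int32_t)(1u << (n_bits - 1)) - 1) * (int32_t)(1u << scale_bits);
}

/* Nonzero when value is in range and its S least-significant bits are zero. */
int scaled_imm_fits(int32_t value, unsigned n_bits, unsigned scale_bits)
{
    int32_t step = (int32_t)(1u << scale_bits);
    return value >= scaled_imm_min(n_bits, scale_bits)
        && value <= scaled_imm_max(n_bits, scale_bits)
        && value % step == 0;
}
```

For #s4:2 the limits are -32 and 28 in steps of 4, which matches the list of values above.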
The ## symbol specifies a 32-bit immediate operand in an instruction (including a loop or branch
destination). The ## symbol appears in the actual instruction to indicate the operand.
Examples of operand symbols:
Rd = add(Rs,#s16) // #s16 -> signed 16-bit imm value
Rd = memw(Rs++#s4:2) // #s4:2 -> scaled signed 4-bit imm value
call #r22:2 // #r22:2 -> scaled 22-bit PC-rel addr value
Rd = ##u32 // ##u32 -> unsigned 32-bit imm value
NOTE: When an instruction contains more than one immediate operand, the operand symbols are
specified in upper and lower case (for example, #uN and #UN) to indicate where they appear in
the instruction encodings.
3.1.2 Terminology
Table 3-3 lists the symbols Hexagon processor instruction names use to specify the supported
data types.
The ds field indicates the register operand type and bit size (as defined in Table 3-4).
The optional elst (element size and type) field specifies parts of a register when the register is
used as a vector. It can specify the following values:
■ A signed or unsigned byte, halfword, or word within the register (as defined in Figure 3-1)
■ A bit field within the register (as defined in Table 3-5)
Examples of elst field:
EA = Rt.h[1] // .h[1] -> bit field 31:16 in Rt
Pd = (Rss.u64 > Rtt.u64) // .u64 -> unsigned 64-bit value
Rd = mpyu(Rs.L,Rt.H) // .L/.H -> low/high 16-bit fields
NOTE: The control and predicate registers use the same notation as the general registers, but are written
as Cx and Px (respectively) instead of Rx.
Rds.elst

ds field        Register operand
s, t, u         32-bit source register
d               32-bit register destination
x               32-bit register source/destination
ss, tt, uu      64-bit source register pair
dd              64-bit register destination
xx              64-bit register source/destination

Byte elements within a 64-bit register pair:
.b[7]  .b[6]  .b[5]  .b[4]  .b[3]  .b[2]  .b[1]  .b[0]     Signed bytes
.ub[7] .ub[6] .ub[5] .ub[4] .ub[3] .ub[2] .ub[1] .ub[0]    Unsigned bytes
Packets have various restrictions on the allowable instruction combinations. The primary
restriction is determined by the instruction class of the instructions in a packet. In particular,
packet formation is subject to the following constraints:
■ Resource constraints determine how many instructions of a specific type can appear in a
packet. The Hexagon processor has a fixed number of execution units: each instruction
executes on a particular type of unit, and each unit can process at most one instruction at a
time. Thus, for example, because the Hexagon processor contains only two load units, an
instruction packet with three load instructions is invalid.
■ Grouping constraints are a small set of rules that apply above and beyond the resource
constraints.
■ Dependency constraints ensure that no write-after-write hazards exist in a packet.
■ Ordering constraints dictate the ordering of instructions within a packet.
■ Alignment constraints dictate the placement of packets in memory.
NOTE: The Hexagon processor executes individual instructions (which are not explicitly grouped in
packets) as packets containing a single instruction.
In the first phase, registers R3 and R2 are read from the register file. Then, after execution, R2 is
written with the old value of R3 and R3 is written with the old value of R2. The result of this
packet is the swap of the values of R2 and R3.
NOTE: Dual stores, dual jumps, new-value stores, new-value compare jumps, and dot-new predicates
have non-parallel execution semantics.
NOTE: The endloopN instructions (Section 8.2.2) do not use any slots.
Each instruction belongs to a specific instruction class. For example, jumps belong to instruction
class J, while loads belong to instruction class LD. An instruction’s class determines which slots it
can execute in.
Figure 3-2 shows which instruction classes can be assigned to each of the four slots.
Slot 0                 Slot 1                 Slot 2                 Slot 3
LD instructions        LD instructions        XTYPE instructions     XTYPE instructions
ST instructions        ST instructions        ALU32 instructions     ALU32 instructions
ALU32 instructions     ALU32 instructions     J instructions         J instructions
MEMOP instructions     Some J instructions    JR instructions        CR instructions
NV instructions
SYSTEM instructions
Some J instructions
■ JR-class instructions can be placed in Slot 2. However, when encoded in a duplex, the jumpr R31
instruction can be placed in Slot 0 (Section 10.3).
■ Restrictions limit the instructions that can appear in a packet at the setup or end of a
hardware loop (Section 8.2.4).
■ A user control register transfer to the control register USR cannot be grouped with a floating
point instruction (Section 2.2.3).
■ The SYSTEM-class instructions include prefetch, cache operations, bus operations, load
locked, and store conditional instructions (Section 5.10). These instructions have the
following grouping rules:
❒ The brkpt, trap, pause, icinva, isync, and syncht instructions are solo instructions. They
must not be grouped with other instructions in a packet.
❒ The memw_locked, memd_locked, l2fetch, and trace instructions must execute on Slot 0.
They must be grouped only with ALU32 or (non-FP) XTYPE instructions.
❒ The dccleana, dcinva, dccleaninva, and dczeroa instructions must execute on Slot 0. Slot 1
must be empty or an ALU32 instruction.
For example, if a packet contains three instructions and slot 1 is not used, encode the instructions
in the packet as follows:
■ Slot 3 instruction at lowest address
■ Slot 2 instruction follows Slot 3 instruction
■ Slot 0 instruction at the last (highest) address
If a packet contains a single load or store instruction, that instruction must go in Slot 0, which is
the highest address. As an example, a packet containing both LD and ALU32 instructions must be
ordered so the LD is in Slot 0 and the ALU32 in another slot.
NOTE: Compound instructions (with the exception of X-and-jump, as shown above) have distinct
assembly syntax from the instructions they are composed of.
4 Data Processing
The Hexagon processor provides a rich set of operations for processing scalar and vector data.
Instructions can perform a wide variety of operations on fixed-point or floating-point data. The
fixed-point operations support scalar and vector data in a variety of sizes. The floating-point
operations support single-precision data.
This chapter presents an overview of the operations provided by the following Hexagon processor
instruction classes:
■ XTYPE – General-purpose data operations
■ ALU32 – Arithmetic/logical operations on 32-bit data
NOTE: Certain vector operations support automatic scaling, saturation, and rounding.
[Figure: four multiplies (*) on the elements of Rtt produce 32-bit products, which are added to form a 64-bit result in the register pair Rdd.]
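[Figure: vector byte operation. Op is applied to each of the eight pairs of corresponding bytes in Rss and Rtt, and the results are written to Rdd.]
Four 16-bit halfword values can be packed in a single 64-bit register pair.
[Figure: vector halfword operation. Op is applied to each of the four pairs of corresponding halfwords in Rss and Rtt, and the results are written to Rdd.]
Two 32-bit word values can be packed in a single 64-bit register pair.
[Figure: vector word operation. Op is applied to each of the two pairs of corresponding words in Rss and Rtt, and the results are written to Rdd.]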
When two fractional numbers are multiplied, the product must be scaled to restore the original
fractional data format. The Hexagon processor allows specification of the fractional scaling of the
product in the instruction for shifts of 0 and 1. Use a shift of 1 for Q1.15 numbers and a shift of
0 for integer multiplication.
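The effect of the scaling shift can be illustrated in C. This is a minimal sketch, not the processor's implementation: the function name is hypothetical, and the result is returned as a 64-bit value so that the one out-of-range case is visible instead of wrapping.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 16-bit operands and scale the product left by 0 or 1 bit.
   With shift = 1, two Q1.15 inputs give a Q1.31 result.
   With shift = 0, the result is the plain integer product. */
int64_t mpy16_scaled(int16_t a, int16_t b, unsigned shift)
{
    assert(shift <= 1);
    int64_t product = (int64_t)a * (int64_t)b;
    return product * ((int64_t)1 << shift);
}
```

For example, 0.5 × 0.5 in Q1.15 is 0x4000 × 0x4000; with a shift of 1 the result is 0x20000000, which is 0.25 in Q1.31. The single product that does not fit in 32 bits is -1.0 × -1.0 (0x8000 × 0x8000 with a shift of 1), which is where a saturating form is needed.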
4.2.2 Saturation
Certain instructions are available in saturating form. If a saturating arithmetic instruction has a
result which is smaller than the minimum value, the result is set to the minimum value. Similarly,
if the operation has a result which is greater than the maximum value, the result is set to the
maximum value.
Saturation is specified in an instruction by adding the :sat specifier. For example:
R2 = abs(R1):sat
The overflow (OVF) bit in the user status register (USR) is set whenever a saturating
operation saturates to the maximum or minimum value. It remains set until explicitly cleared by a
control register transfer to USR. For vector-type saturating operations, if any of the individual
elements of the vector saturate, OVF is set.
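The saturation and sticky-flag behavior can be modeled in C. The sketch below is a simplified model under stated assumptions: the usr_model_t type is a hypothetical stand-in for the OVF bit of USR, and the helper names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the OVF bit of USR. */
typedef struct {
    int ovf;
} usr_model_t;

/* Clamp a wide result to the signed 32-bit range; set OVF when clamping. */
int32_t sat32(int64_t value, usr_model_t *usr)
{
    assert(usr != 0);
    if (value > INT32_MAX) { usr->ovf = 1; return INT32_MAX; }
    if (value < INT32_MIN) { usr->ovf = 1; return INT32_MIN; }
    return (int32_t)value;   /* OVF is sticky: it is never cleared here */
}

/* Model of R2 = abs(R1):sat */
int32_t abs_sat(int32_t r, usr_model_t *usr)
{
    int64_t wide = r;
    return sat32(wide < 0 ? -wide : wide, usr);
}

/* Model of a saturating 32-bit add */
int32_t add_sat(int32_t a, int32_t b, usr_model_t *usr)
{
    return sat32((int64_t)a + (int64_t)b, usr);
}
```

In this model, `abs_sat(INT32_MIN, &usr)` returns INT32_MAX and sets `usr.ovf`, and the flag stays set across later non-saturating operations until the caller clears it.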
NOTE: Arithmetic rounding can accumulate numerical errors, especially when the number to round is
exactly 0.5. This happens most frequently when dividing by 2 or averaging.
For single precision, the scaling factor is two raised to the power specified by the contents of the
predicate register (which is treated as an 8-bit two's complement value). For double precision, the
predicate register value is doubled before use as a power of two.
NOTE: Do not use scale FMA instructions outside of divide and square-root library routines. No
guarantee is provided that future versions of the Hexagon processor will implement these
instructions using the same semantics. Future versions guarantee compatibility for scale FMA
only to the extent needed by divide and square-root library routines.
4.3.1 ALU
XTYPE ALU operations modify 8-, 16-, 32-, and 64-bit data. These operations include:
■ Add and subtract with and without saturation
■ Add and subtract with accumulate
■ Absolute value
■ Logical operations
■ Min, max, negate instructions
■ Register transfers of 64-bit data
■ Word to doubleword sign extension
■ Comparisons
4.3.3 Complex
XTYPE COMPLEX operations manipulate complex numbers. These operations include:
■ Complex add and subtract
■ Complex multiply with optional round and pack
■ Vector complex multiply
■ Vector complex conjugate
■ Vector complex rotate
■ Vector reduce complex multiply real or imaginary
NOTE: The special floating-point instructions are not intended for use directly in user code – use the
special floating-point instructions only in the floating point library.
Format conversion
The floating-point conversion instructions sfmake and dfmake convert an unsigned 10-bit
immediate value into the corresponding floating-point value.
The immediate value must be encoded so bits [5:0] contain the significand, and bits [9:6] the
exponent. The exponent value is added to the initial exponent value (bias - 6).
For example, to generate the single-precision floating point value 2.0, bits [5:0] must be set to 0,
and bits [9:6] set to 7. Performing the sfmake operation on this immediate value yields the
floating point value 0x40000000, which is 2.0.
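The encoding described above can be sketched in C for the single-precision case. This is a minimal model, not the instruction's definition: the function names are hypothetical, and the sketch assumes that the six significand bits occupy the most significant fraction bits (bits [22:17]) of the IEEE single-precision format and that the sign bit is zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build the single-precision bit pattern from a 10-bit immediate.
   Bits [9:6] are added to the initial exponent (bias - 6 = 121).
   Assumption: bits [5:0] land in fraction bits [22:17]. */
uint32_t sfmake_bits(uint32_t u10)
{
    assert(u10 < 1024u);
    uint32_t exponent = (127u - 6u) + ((u10 >> 6) & 0xFu);
    uint32_t significand = u10 & 0x3Fu;
    return (exponent << 23) | (significand << 17);
}

/* Reinterpret the bit pattern as a float (assumes IEEE 754 binary32). */
float sfmake_value(uint32_t u10)
{
    uint32_t bits = sfmake_bits(u10);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With bits [9:6] set to 7 and bits [5:0] set to 0, `sfmake_bits` returns 0x40000000, matching the 2.0 example above.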
NOTE: The conversion instructions are designed to handle common floating point values, including most
integers and many basic fractions (1/2, 3/4, and so on).
Rounding
The Hexagon User status register includes the FPRND field, which specifies the IEEE-defined
floating-point rounding mode.
Exceptions
The Hexagon user status register includes five status fields, which work as sticky flags for the five
IEEE-defined exception conditions: inexact, overflow, underflow, divide by zero, and invalid. A
sticky flag is set when the corresponding exception occurs, and remains set until explicitly
cleared.
The user status register also includes five mode fields, which specify whether to perform an
operating-system trap if one of the floating-point exceptions occurs. For every instruction packet
containing a floating-point operation, if a floating-point sticky flag and the corresponding
trap-enable bit are both set, a floating-point trap is generated. After the packet commits, the
Hexagon processor then automatically traps to the operating system.
NOTE: Non-floating-point instructions never generate a floating-point trap, regardless of the state of the
sticky flag and trap-enable bits.
4.3.5 Multiply
Multiply operations support fixed-point multiplication, including both single- and double-
precision multiplication, and polynomial multiplication.
Single precision
In single-precision arithmetic, a 16-bit value is multiplied by another 16-bit value. These operands
can come from the high portion or low portion of any register. Depending on the instruction, the
result of the 16 × 16 operation can optionally be accumulated, saturated, rounded, or shifted left
by 0 or 1 bit.
The instruction set supports operations on signed × signed, unsigned × unsigned, and signed ×
unsigned data.
Table 4-1 summarizes the options available for 16 × 16 single precision multiplications. The
symbols used in the table are as follows:
■ SS – Perform signed × signed multiply
■ UU – Perform unsigned × unsigned multiply
■ SU – Perform signed × unsigned multiply
■ A+ – Result added to accumulator
■ A- – Result subtracted from accumulator
■ 0 – Result not added to accumulator
16 × 16 64 SS A+, A- No No 0-1
16 × 16 64 SS 0 No Yes 0-1
Double precision
32 × 32 32 (upper) SU 0 No No 0
Polynomial
Polynomial XTYPE MPY instructions are available for both words and vector halfwords.
These instructions are useful for many algorithms, including scramble code generation,
cryptographic algorithms, convolutional codes, and Reed-Solomon codes.
4.3.6 Permute
XTYPE PERM operations perform various operations on vector data, including arithmetic, format
conversion, and rearrangement of vector elements. Many types of conversions are supported:
■ Swizzle bytes
■ Vector shuffle
■ Vector align
■ Vector saturate and pack
■ Vector splat bytes
■ Vector splice
■ Vector sign extend halfwords
■ Vector zero extend bytes
■ Vector zero extend halfwords
■ Scalar saturate to byte, halfword, word
■ Vector pack high and low halfwords
■ Vector round and pack
■ Vector splat halfwords
4.3.7 Predicate
XTYPE PRED operations modify predicate source data. The categories of instructions available
include:
■ Vector mask generation
■ Predicate transfers
■ Viterbi packing
4.3.8 Shift
Scalar XTYPE SHIFT operations perform a variety of 32- and 64-bit shifts followed by an optional
add/sub or logical operation. Figure 4-4 shows the general operation.
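[Figure 4-4: Rss and a shift amount (# / Rt) feed a 64-bit shifter. The shifted value passes through a 64-bit add/sub/logical stage, and the result is written to Rxx.]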
The vector halfword operations process packed 16-bit halfwords. They include the following
operations:
■ Vector add and subtract halfwords
■ Vector average halfwords
■ Vector compare halfwords
■ Vector min and max halfwords
■ Vector shift halfwords
■ Vector dual multiply
■ Vector dual multiply with round and pack
■ Vector multiply even halfwords with optional round and pack
■ Vector multiply halfwords
■ Vector reduce multiply halfwords
For example, Figure 4-5 shows the operation of the vector arithmetic shift right halfword (vasrh)
instruction. In this instruction, each 16-bit halfword is shifted right by the same amount, which is
specified in a register or with an immediate value. Because the shift is arithmetic, the bits shifted
in are copies of the sign bit.
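The per-halfword behavior can be modeled in C. This is an illustrative sketch only: the function name is hypothetical, and it covers only shift amounts from 0 to 15, which is an assumption made to keep the model simple.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right of each of the four 16-bit halfwords packed
   in a 64-bit value, all by the same amount (0..15 in this sketch). */
uint64_t vasrh_model(uint64_t rss, unsigned shift)
{
    assert(shift < 16);
    uint64_t result = 0;
    for (unsigned i = 0; i < 4; i++) {
        uint16_t lane = (uint16_t)(rss >> (16 * i));
        uint16_t shifted = (uint16_t)(lane >> shift);
        /* Fill the vacated upper bits with copies of the sign bit. */
        if ((lane & 0x8000u) != 0 && shift > 0)
            shifted |= (uint16_t)(0xFFFFu << (16 - shift));
        result |= (uint64_t)shifted << (16 * i);
    }
    return result;
}
```

Each halfword is handled independently, so a negative halfword stays negative after the shift and no bits cross from one halfword into its neighbor.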
For more information on vector operations see Section 11.1.1 and Section 11.10.1.
4.6 CR operations
The CR instruction class includes operations that access the Control registers.
NOTE: In register-pair transfers, control registers must be specified using their numeric alias names – see
Section 2.2 for details.
[Figure: the range interval divided into the rMPS and rLPS subintervals, with offset located within range.]
[Flowchart: decoding one bin]
1. bitpos = Count_leading_zeros(range)
2. rLPS = lutLPS[ctxIdx->state][(range>>(29-bitpos))&3] << (23-bitpos)
3. rMPS = range - rLPS
4. One of two paths is taken:
   LPS path:
      bin = !ctxIdx->valMPS
      range = rLPS
      offset = offset - rMPS
      If ctxIdx->state == 0 (Yes): ctxIdx->valMPS = !ctxIdx->valMPS
      ctxIdx->state = TransIndexLPS(ctxIdx->state)
   MPS path:
      bin = ctxIdx->valMPS
      range = rMPS
      ctxIdx->state = TransIndexMPS(ctxIdx->state)
5. Renormalization (range, offset)
6. Done
The Hexagon processor can use the decbin instruction to decode one regular bin in two cycles
(not counting the bin refilling process).
For more information on the decbin instruction see Section 11.10.6.
For example:
Rdd = decbin(Rss,Rtt)
OUTPUT: P0
P0 = (bin)
// Cycle #1
{
    R1:0 = decbin(R1:0,R5:4)     // Decoding one bin
    R6 = asl(R22,R5)             // Where R22 = 0x100
}
// Cycle #2
{
    memb(R3) = R0                // Save context to *ctxIdx
    R1:0 = vlsrw(R1:0,R5)        // Re-align range and offset
    P1 = cmp.gtu(R6,R1)          // Need refill? i.e., P1 = (range < 0x100)
    if (!P1.new) jumpr:t LR      // Return
}
RENORM_REFILL:
...
NOTE: This operation utilizes the maximum load bandwidth available in the Hexagon processor.
fast_ip_check:
{
    R1 = lsr(R1,#4)              // 16-byte chunks, rounded down, +1
    R9:8 = combine(#0,#0)
    R3:2 = combine(#0,#0)
}
{
    loop0(1f,R1)
    R7:6 = memd(R0+#8)
    R5:4 = memd(R0++#16)
}
.falign
1:
{
    R7:6 = memd(R0+#8)
    R5:4 = memd(R0++#16)
    R2 = vradduh(R5:4,R7:6)      // Accumulate 8 halfwords
    R8 = vradduh(R3:2,R9:8)      // Accumulate carries
}:endloop0
// Drain pipeline
{
    R2 = vradduh(R5:4,R7:6)
    R8 = vradduh(R3:2,R9:8)
    R5:4 = combine(#0,#0)
}
{
    R8 = vradduh(R3:2,R9:8)
    R1 = #0
}
// May have some carries to add back in
{
    R0 = vradduh(R5:4,R9:8)
}
// Possible for one more to pop out
{
    R0 = vradduh(R5:4,R1:0)
}
{
    R0 = not(R0)
    jumpr LR
}
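[Figure: vrcrotate. Rt selects a rotation factor (1, j, -1, or -j) for each of four multiplies (*); the products are summed (+) into the imaginary (I) and real (R) words of Rxx.]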
For more information on the vrcrotate instruction, see Vector reduce complex rotate.
NOTE: Using this instruction the Hexagon processor can process 5.3 chips per cycle, and a 12-finger
WCDMA user requires only 15 MHz.
Rxx += vpmpyh(Rs,Rt)
Rxx += pmpyw(Rs,Rt)
[Figure: polynomial multiply. vpmpyh performs two 16 × 16 carryless polynomial multiplies on the halfwords of Rs and Rt; pmpyw performs one 32 × 32 carryless polynomial multiply on Rs and Rt. In both forms the products are XORed into Rxx.]
For more information on the pmpy instructions, see Polynomial multiply words.
5 Memory
The Hexagon processor features a load/store architecture, where numeric and logical instructions
operate on registers. Explicit load instructions move operands from memory to registers, while
store instructions move operands from registers to memory. A small number of instructions
(known as mem-ops) perform numeric and logical operations directly on memory.
The address space is unified: all accesses target the same linear address space, which contains
both instructions and data.
Address    Contents
0          A
1          B
2          C
3          D
4          E
5          F
6          G
7          H

Register contents (bits 31:0, or 63:0 for a doubleword):
-  -  -  A                  Load byte
-  -  B  A                  Load halfword
D  C  B  A                  Load word
H  G  F  E  D  C  B  A      Load doubleword
5.1.3 Alignment
Even though the Hexagon processor memory is byte-addressable, instructions and data must be
aligned in memory on specific address boundaries:
■ Instructions and instruction packets must be 32-bit aligned
■ Data must be aligned to its native access size.
Any unaligned memory access causes a memory-alignment exception.
Use the Permute instructions in applications that must reference unaligned vector data. The loads
and stores still must be memory-aligned; however, the permute instructions enable easy
rearrangement of the data in registers.
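The alignment rule can be expressed as a simple check. The C sketch below is illustrative (the function name is hypothetical); it tests whether an address is aligned to the native size of an access.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when addr is aligned to the native access size in bytes
   (1 = byte, 2 = halfword, 4 = word or instruction, 8 = doubleword). */
int is_naturally_aligned(uint32_t addr, unsigned size_bytes)
{
    assert(size_bytes == 1 || size_bytes == 2 ||
           size_bytes == 4 || size_bytes == 8);
    return (addr & (size_bytes - 1u)) == 0;
}
```

For example, address 0x1004 is valid for a word access but not for a doubleword access, so a doubleword load from that address would raise the memory-alignment exception described above.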
NOTE: The memory load instructions belong to instruction class LD, and can execute only in slots 0 or 1.
NOTE: The memory store instructions belong to instruction class ST, and can execute only in slot 0 or –
when part of a dual store – slot 1.
Unlike most packetized operations, dual stores do not execute in parallel (Section 3.3.1). Instead,
the store instruction in Slot 1 effectively executes first, followed by the store instruction in Slot 0.
NOTE: The store instructions in a dual store must belong to instruction class ST, and can execute only in
Slots 0 and 1.
Unlike most packetized operations, these memory operations do not execute in parallel
(Section 3.3.1). Instead, the store instruction in Slot 1 effectively executes first, followed by the
load instruction in Slot 0. If the addresses of the two operations are overlapping, the load receives
the newly stored data.
NOTE: The new-value store instructions belong to instruction class NV, and can execute only in Slot 0.
5.7 Mem-ops
Mem-ops perform basic arithmetic, logical, and bit operations directly on memory operands,
without the need for a separate load or store. Mem-ops can be performed on byte, halfword, or
word sizes.
NOTE: The mem-op instructions belong to instruction class MEMOP, and can execute only in slot 0.
5.8.1 Absolute
The absolute addressing mode uses a 32-bit constant value as the effective memory address. For
example:
R2 = memw(##100000) // Load R2 with word from addr 100000
memw(##200000) = R4 // Store R4 to word at addr 200000
5.8.2 Absolute-set
The absolute-set addressing mode assigns a 32-bit constant value to the specified general
register, then uses the assigned value as the effective memory address. For example:
R2 = memw(R1=##400000) // Load R2 with word from addr 400000
// and load R1 with value 400000
memw(R3=##600000) = R4 // Store R4 to word at addr 600000
// and load R3 with value 600000
The 32-bit constant value is the base address, and the shifted result is the byte offset.
NOTE: This addressing mode is useful for loading an element from a global table, where the immediate
value is the name of the table, and the register holds the index of the element.
Specifying only an immediate value causes the assembler and linker to automatically subtract the
value of the special symbol _SDA_BASE_ from the immediate value, and use the result as the
effective offset from GP.
The global data pointer is programmed in the GDP field of register GP (Section 2.2.8). This field
contains an unsigned 26-bit value that specifies the most significant 26 bits of the 32-bit global
data pointer. The least significant 6 bits of the pointer are always defined as zero.
The memory area referenced by the global data pointer is known as the global data area. It can be
up to 512 KB in length, and – because of the way the global data pointer is defined – must be
aligned to a 64-byte boundary in virtual memory.
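The relationship between the 26-bit GDP field and the 32-bit global data pointer can be sketched in C. The helper names are hypothetical, and the sketch works on the field value alone without modeling the layout of the GP register.

```c
#include <assert.h>
#include <stdint.h>

/* Form the 32-bit global data pointer from the 26-bit GDP field:
   the field supplies the upper 26 bits and the low 6 bits are zero. */
uint32_t gp_from_gdp(uint32_t gdp_field)
{
    assert(gdp_field < (1u << 26));
    return gdp_field << 6;
}

/* Recover the GDP field for a global data area base address,
   which must be aligned to a 64-byte boundary. */
uint32_t gdp_from_address(uint32_t base)
{
    assert((base & 0x3Fu) == 0);
    return base >> 6;
}
```

Because the low six bits are implied zeros, every pointer produced this way is a multiple of 64, which is why the global data area must start on a 64-byte boundary.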
When expressed in assembly language, the offset values used in global pointer relative addressing
always specify byte offsets from the global data pointer. The offsets must be integral multiples of
the size of the instruction data type.
NOTE: When using global pointer relative addressing, the immediate operand should be a symbol in the
.sdata or .sbss section to ensure that the offset is valid.
5.8.5 Indirect
The indirect addressing mode uses a 32-bit value stored in a general register as the effective
memory address. For example:
R2 = memub(R1) // load R2 with unsigned byte from addr R1
When expressed in assembly language, the offset values always specify byte offsets from the
general register value. The offsets must be integral multiples of the size of the instruction data
type.
NOTE: The offset range is smaller for conditional instructions (Section 5.9).
When expressed in assembly language, the increment values always specify byte offsets from the
general register value. The offsets must be integral multiples of the size of the instruction data
type.
When auto-incrementing with a modifier register, the increment is a signed 32-bit value which is
added to the general register. This offers two advantages over auto-increment immediate:
■ A larger increment range
■ Variable increments (since the modifier register can be programmed at runtime)
The increment value always specifies a byte offset from the general register value.
NOTE: The signed 32-bit increment range is identical for all instruction data types (doubleword, word,
halfword, byte).
NOTE: If any of these rules are not followed, the execution result is undefined.
The following C function describes the behavior of the circular add function:
unsigned int
fcircadd(unsigned int pointer, int offset,
unsigned int M_reg, unsigned int CS_reg)
{
unsigned int length;
int new_pointer, start_addr, end_addr;
When auto-incrementing with a register, the increment is a signed 11-bit value that is added to
the general register. This offers two advantages over circular addressing with immediate
increments:
■ Larger increment ranges
■ Variable increments (since the increment register can be programmed at runtime)
The circular register increment value is programmed in the I field of the modifier register Mx
(Section 2.2.4) as part of setting up the circular data access. This register field holds the signed 11-
bit increment value.
Increment values are expressed in units of the buffer element data type, and are automatically
scaled at runtime to the proper data access size.
When programming a circular buffer (with either a register or immediate increment), all the rules
that apply to circular addressing must be followed – for details see Section 5.8.10.
NOTE: If any of these rules are not followed, the execution result is undefined.
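The wrap-around behavior of a circular pointer update can be sketched in C. This is a simplified model under stated assumptions: the function names are hypothetical, the buffer start and length are passed in directly instead of being decoded from the CS and M registers, and the byte offset is assumed to be smaller in magnitude than the buffer length.

```c
#include <assert.h>
#include <stdint.h>

/* Add a signed byte offset to a pointer inside the circular buffer
   [start, start + length) and wrap the result back into the buffer. */
uint32_t circ_add(uint32_t pointer, int32_t offset,
                  uint32_t start, uint32_t length)
{
    assert(length > 0);
    assert(pointer >= start && pointer - start < length);
    assert(offset > -(int64_t)length && offset < (int64_t)length);

    int64_t next = (int64_t)pointer + offset;
    int64_t end = (int64_t)start + length;
    if (next >= end)
        next -= length;              /* ran past the end: wrap to the front */
    else if (next < (int64_t)start)
        next += length;              /* ran before the start: wrap to the back */
    return (uint32_t)next;
}

/* Increments are given in elements; scale to bytes before adding. */
uint32_t circ_step(uint32_t pointer, int32_t increment, uint32_t element_bytes,
                   uint32_t start, uint32_t length)
{
    return circ_add(pointer, increment * (int32_t)element_bytes, start, length);
}
```

For a 40-byte buffer at 0x1000, stepping one doubleword element forward from 0x1020 wraps to 0x1000, and stepping 4 bytes back from 0x1000 wraps to 0x1024.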
The initial values for the address and increment must be set in bit-reversed form, with the
hardware bit-reversing the bit-reversed address value to form the effective address.
The buffer length for a bit-reversed buffer must be an integral power of 2, with a maximum length
of 64K bytes.
To support bit-reversed addressing, buffers must be properly aligned in memory. A bit-reversed
buffer is properly aligned when its starting byte address is aligned to a power of 2 greater than or
equal to the buffer size (in bytes). For example:
int bitrev_buf[256] __attribute__((aligned(1024)));
The bit-reversed buffer declared above is aligned to 1024 bytes because the buffer size is 1024
bytes (256 integer words × 4 bytes), and 1024 is an integral power of 2.
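The size and alignment rules above can be checked at runtime. The following is a small sketch, not an official utility; the helper name bitrev_buffer_ok is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when n is a nonzero power of two. */
static bool is_pow2(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical helper: checks the size and alignment rules quoted above. */
bool bitrev_buffer_ok(uintptr_t base, size_t size_bytes)
{
    if (!is_pow2(size_bytes) || size_bytes > 65536u) {
        return false;               /* size: power of 2, at most 64K bytes */
    }
    /* An address aligned to any power of 2 >= size is also aligned to size. */
    return (base & (size_bytes - 1)) == 0;
}
```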
The buffer location pointer for a bit-reversed buffer must be initialized so the least-significant 16
bits of the address value are bit-reversed.
The increment value must be initialized to the following value:
bitreverse(buffer_size_in_bytes / 2)
...where bitreverse is defined as bit-reversing the least-significant 16 bits while leaving the
remaining bits unchanged.
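As an illustration of the bitreverse definition above, the following sketch reverses only the least-significant 16 bits and derives the increment from the buffer size. The function names are hypothetical.

```c
#include <stdint.h>

/* Reverse the low 16 bits of x; the upper 16 bits pass through unchanged. */
uint32_t bitreverse_low16(uint32_t x)
{
    uint32_t low = x & 0xFFFFu;
    uint32_t rev = 0;
    for (int i = 0; i < 16; i++) {
        rev = (rev << 1) | (low & 1u);
        low >>= 1;
    }
    return (x & 0xFFFF0000u) | rev;
}

/* Increment value per the formula above (hypothetical helper name). */
uint32_t bitrev_increment(uint32_t buffer_size_in_bytes)
{
    return bitreverse_low16(buffer_size_in_bytes / 2);
}
```

For the 1024-byte buffer declared earlier, this yields bitreverse(512) = 0x0040.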
NOTE: To simplify the initialization of the bit-reversed pointer, bit-reversed buffers can be aligned to a
64K byte boundary. This initializes the bit-reversed pointer to the base address of the bit-reversed
buffer, with no bit-reversing required for the least-significant 16 bits of the pointer value (which
are set to 0 by the 64K alignment).
Because buffers allocated on the stack only have an alignment of 8 bytes or less, in most cases bit-
reversed buffers should not be declared on the stack.
After a bit-reversed memory access is complete, the general register is incremented by the
register increment value. The value in the general register is never affected by the bit-reversal
that is performed as part of the memory access.
NOTE: The Hexagon processor supports only register increments for bit-reversed addressing – it does not
support immediate increments.
Not all addressing modes are supported in conditional loads and stores. Table 5-11 shows which
modes are supported.
Table 5-11 Addressing modes (conditional load/store)
Addressing mode                   Conditional
Absolute                          Yes
Absolute-set                      No
Absolute with register offset     No
When a conditional load or store instruction uses Indirect with offset addressing mode, the offset
range is smaller than the range normally defined for indirect-with-offset addressing.
Table 5-12 Conditional and normal offset ranges (indirect with offset addressing)
Data type     Offset range (conditional)   Offset range (normal)   Offset must be multiple of
doubleword    0 ... 504                    -8192 ... 8184          8
word          0 ... 252                    -4096 ... 4092          4
halfword      0 ... 126                    -2048 ... 2046          2
byte          0 ... 63                     -1024 ... 1023          1
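The ranges in Table 5-12 are consistent with an unsigned 6-bit scaled offset for the conditional form and a signed 11-bit scaled offset for the normal form. Under that reading (an inference from the table, not a statement in the text), an encodability check could look like the following sketch. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: size_bytes is 1, 2, 4, or 8 (byte to doubleword).
 * Conditional form: offset / size must fit in 0..63.
 * Normal form: offset / size must fit in -1024..1023. */
bool offset_encodable(int32_t offset, int32_t size_bytes, bool conditional)
{
    if (offset % size_bytes != 0) {
        return false;               /* must be a multiple of the access size */
    }
    int32_t units = offset / size_bytes;
    if (conditional) {
        return units >= 0 && units <= 63;
    }
    return units >= -1024 && units <= 1023;
}
```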
■ Write-back caching stores data in the cache without immediately writing it to external
memory. Cached data that is inconsistent with external memory is referred to as dirty.
The Hexagon processor includes dedicated cache maintenance instructions that push dirty data
out to external memory.
NOTE: The exception to this rule is the dcfetch operation, which never causes a processor exception.
Whenever maintenance operations are performed on the instruction cache, the isync instruction
(Section 5.11) must execute immediately afterwards. This instruction ensures that subsequent
instructions observe the maintenance operations.
■ For a cache hit, the specified cache line is cleared (written with all zeros) and made dirty.
■ For a cache miss, the specified cache line is not fetched from external memory. Instead, the
line is allocated in the data cache, cleared, and made dirty.
This instruction is useful in optimizing write-only data. It allows for the use of write-back pages –
which are the most power and performance efficient – without the need to initially fetch the line
to write. This removes unnecessary read bandwidth and latency.
NOTE: The dczeroa operation has the same exception behavior as write-back stores.
A packet with dczeroa must have slot 1 either empty or containing an ALU32 instruction.
The l2fetch instruction is nonblocking: it initiates a prefetch operation that is performed in the
background by the prefetch engine while the thread continues to execute Hexagon processor
instructions.
The prefetch engine requests all lines in the specified memory area. If the line(s) of interest are
already resident in the L2 cache, the prefetch engine performs no action. If the lines are not in the
L2 cache, the prefetch engine attempts to fetch them.
The prefetch engine makes a best effort to prefetch the requested data, and attempts to perform
prefetching at a lower priority than demand fetches. This prevents the prefetch engine from
adding bus traffic when the system is under a heavy load.
If a program executes an l2fetch instruction while the prefetch operation from a previous l2fetch
is still active, the prefetch engine halts the current prefetch operation.
NOTE: Executing l2fetch with any bit field operand programmed to zero cancels prefetch activity.
The status of the current prefetch operation is maintained in the PFA field of the user status
register. Software can read this field to determine whether a prefetch operation has completed.
With respect to MMU permissions and error checking, the l2fetch instruction behaves similarly to
a load instruction. If the virtual address causes a processor exception, the exception is taken. This
differs from the dcfetch instruction, which is treated as a NOP in the presence of a
translation/protection error.
NOTE: Prefetches are dropped when the generated prefetch address resides on a different page than the
start address. The programmer must use sufficiently large pages to ensure this does not occur.
Figure 5-2 shows two examples of using the l2fetch instruction. The first shows a box prefetch,
where a 2D range of memory is defined within a larger frame. The second example shows a
prefetch for a large linear memory area of size (Lines * 128).
[Figure 5-2: l2fetch operand layout. Box prefetch: Stride in bits 31:16, Width in bits 15:8, and
Height in bits 7:0, describing a prefetch area within a larger frame. Large linear prefetch: a
prefetch area of 128 * Lines bytes.]
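As a sketch of how a box-prefetch operand might be assembled, the following helper packs the three fields into one 32-bit value. It assumes the field positions that the Figure 5-2 labels suggest (Stride in bits 31:16, Width in bits 15:8, Height in bits 7:0); the helper name is hypothetical, and the layout should be checked against the l2fetch instruction description before use.

```c
#include <stdint.h>

/* Assumed layout: stride[31:16], width[15:8], height[7:0]. */
uint32_t l2fetch_box_operand(uint32_t stride, uint32_t width, uint32_t height)
{
    return ((stride & 0xFFFFu) << 16) |
           ((width  & 0xFFu)   << 8)  |
            (height & 0xFFu);
}
```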
Data memory accesses and program memory accesses are treated separately and held in
separate caches. Software should ensure coherency between data and program code if necessary.
For example, with generated or self-modified code, the modified code is placed in the data cache
and can be inconsistent with the program cache. The software must explicitly force modified data
cache lines to memory (either by using a write-through policy, or through explicit cache clean
instructions). Use a barrier instruction to ensure completion of the stores. Finally, invalidate
relevant instruction cache contents so the new instructions can be refetched.
Here is the recommended code sequence to change and execute an instruction:
ICINVA(R1) // Clear code from instruction cache
ISYNC // Ensure that ICINVA is finished
MEMW(R1)=R0 // Write the new instruction
DCCLEANINVA(R1) // Force data out of data cache
SYNCHT // Ensure that it is in memory
JUMPR R1 // Can now execute code at R1
NOTE: The memory-ordering instructions must not be grouped with other instructions in a packet,
otherwise the behavior is undefined.
This code sequence differs from the one used in previous processor versions.
lockMutex:
    R3 = #1
lock_test_spin:
    R1 = memw_locked(R0)            // Do normal test to wait
    P1 = cmp.eq(R1,#0)              // for lock to be available
    if (!P1) jump lock_test_spin
    memw_locked(R0,P0) = R3         // Do store conditional (SC)
    if (!P0) jump lock_test_spin    // Was LL and SC done atomically?
    R1 = #0
    memw(R0) = R1
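The listing above spins on a load-locked/store-conditional pair until it claims the lock word, and its final two instructions write 0 to release the lock. As a rough portable analogue, the same idea can be sketched with C11 atomics. This uses compare-and-swap rather than memw_locked, and the type and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_uint word; } mutex_model;   /* 0 = free, 1 = held */

/* One attempt: succeeds only if the word was 0 and has been set to 1. */
bool mutex_try_lock(mutex_model *m)
{
    unsigned expected = 0;
    return atomic_compare_exchange_strong(&m->word, &expected, 1u);
}

void mutex_lock(mutex_model *m)
{
    while (!mutex_try_lock(m)) {
        /* spin until the lock word is observed free and claimed */
    }
}

void mutex_unlock(mutex_model *m)
{
    atomic_store(&m->word, 0u);
}
```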
Atomic memX_locked operations are supported for external accesses that use the advanced
extensible interface (AXI) bus and support atomic operations. To perform load-locked operations
with external memory, the operating system must define the memory page as uncacheable,
otherwise the processor behavior is undefined.
If a load locked operation is performed on an address that does not support atomic operations,
the behavior is undefined.
For atomic operations on cacheable memory, the page attributes must be set to cacheable and
write-back, otherwise the behavior is undefined. Cacheable memory must be used when threads
must synchronize with each other.
NOTE: External memX_locked operations are not supported on the AHB. If they are performed on the
AHB, the behavior is undefined.
6 Conditional Execution
The Hexagon processor uses a conditional execution model based on compare instructions that
set predicate bits in one of four 8-bit predicate registers (P0 through P3). These predicate bits can
then be used to conditionally execute certain instructions.
Conditional scalar operations examine only the least-significant bit in a predicate register, while
conditional vector operations examine multiple bits in the register.
Branch instructions are the main consumers of the predicate registers.
Qualcomm Hexagon V73 Programmer’s Reference Manual Conditional Execution
NOTE: One of the compare instructions (cmp.eq) includes a variant that stores a binary predicate value
(0 or 1) in a general register instead of a predicate register.
The mux instruction selects either Rs or Rt based on the least-significant bit in Ps. If the
least-significant bit in Ps is 1, Rd is set to Rs; otherwise it is set to Rt.
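The selection rule can be modeled in C as follows. This is a behavioral sketch only; the function name is hypothetical.

```c
#include <stdint.h>

/* Model of Rd = mux(Ps,Rs,Rt): only bit 0 of the predicate is examined. */
uint32_t mux_model(uint8_t ps, uint32_t rs, uint32_t rt)
{
    return (ps & 1u) ? rs : rt;
}
```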
To perform the corresponding OR operation, the following instructions can compute the negation
of an existing compare (using De Morgan’s law):
■ Pd = !cmp.{eq,gt}(Rs, {#s10,Rt} )
■ Pd = !cmp.gtu(Rs, {#u9,Rt} )
■ Pd = !tstbit(Rs, {#u5,Rt} )
■ Pd = !bitsclr(Rs, {#u6,Rt} )
■ Pd = !bitsset(Rs,Rt)
NOTE: A register transfer from a predicate register to a predicate register has the same auto-AND
behavior as a compare instruction.
The following C statement, and the assembly code that the compiler generates from it, show how
to use dot-new predicates.
C statement
    if (R2 == 4)
        R3 = *R4;
    else
        R5 = 5;
Assembly code
    {
        P0 = cmp.eq(R2,#4)
        if (P0.new) R3 = memw(R4)
        if (!P0.new) R5 = #5
    }
In the assembly code, a scalar predicate is generated and then consumed twice within the same
instruction packet.
The following conditions apply to using dot-new predicates:
■ The predicate must be generated by an instruction in the same packet. The assembler
normally enforces this restriction, but if the processor executes a packet that violates this
restriction, the execution result is undefined.
■ A single packet can contain both the dot-new and normal forms of predicates. The normal
form examines the old value in the predicate register, rather than the newly-generated value.
For example:
{
P0 = cmp.eq(R2,#4)
if (P0.new) R3 = memw(R4) // Use newly-generated P0 value
if (P0) R5 = #5 // Use previous P0 value
}
Because predicate values change at runtime, the programmer is responsible for ensuring that
such packets are always valid during program execution. If they are invalid, the processor takes
the following actions:
■ When writing to general registers, an error exception is raised.
■ When writing to predicate or control registers, the result is undefined.
[Figure: vector compare of Rss and Rtt producing the 8-bit predicate Pd (bits 7 to 0), shown with
the value 1 0 1 0 1 0 1 0]
Figure 6-2 shows how a vector halfword compare generates a vector predicate. Two 64-bit
vectors of halfwords are being compared. The result is assigned as a vector predicate to the
destination register Pd.
Because a vector halfword compare yields only four truth values, each truth value is encoded as
two bits in the generated vector predicate.
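[Figure 6-2: vector halfword compare of Rss and Rtt producing the predicate Pd (bits 7 to 0),
shown with the value 1 1 0 0 1 1 0 0]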
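The two-bits-per-halfword encoding described for Figure 6-2 can be modeled in C. The sketch below uses an equality compare as the example operation; the function name is hypothetical.

```c
#include <stdint.h>

/* Model of a halfword vector compare-equal: each of the four halfword
 * results is written to two adjacent predicate bits. */
uint8_t vcmph_eq_model(uint64_t rss, uint64_t rtt)
{
    uint8_t pd = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t a = (uint16_t)(rss >> (16 * i));
        uint16_t b = (uint16_t)(rtt >> (16 * i));
        if (a == b) {
            pd |= (uint8_t)(0x3u << (2 * i));
        }
    }
    return pd;
}
```

When only halfwords 1 and 3 match, the model returns 0xCC, which is the 1 1 0 0 1 1 0 0 pattern shown in the figure.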
[Figure: vector mux selecting between Rss and Rtt to form Rdd]
Changing the order of the source operands in a mux instruction enables formation of both senses
of the result. For example:
R1:0 = vmux(P0,R3:2,R5:4) // Choose bytes from R3:2 if true
R1:0 = vmux(P0,R5:4,R3:2) // Choose bytes from R3:2 if false
NOTE: By replicating the predicate bits generated by word or halfword compares, the vector mux
instruction can select words or halfwords.
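A byte-wise model of the vector mux makes the selection explicit. This is a sketch assuming predicate bit i controls byte i of the result; the function name is hypothetical.

```c
#include <stdint.h>

/* Model of Rdd = vmux(Pu,Rss,Rtt): predicate bit i picks byte i from Rss
 * when set, and from Rtt when clear. */
uint64_t vmux_model(uint8_t pu, uint64_t rss, uint64_t rtt)
{
    uint64_t rdd = 0;
    for (int i = 0; i < 8; i++) {
        uint64_t byte_mask = 0xFFull << (8 * i);
        uint64_t src = ((pu >> i) & 1u) ? rss : rtt;
        rdd |= src & byte_mask;
    }
    return rdd;
}
```

With a predicate whose bits are replicated in pairs (such as 0xCC from a halfword compare), the model selects whole halfwords, as the note above describes.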
Predicate registers can be transferred to and from the general registers either individually or as
register quadruples (Section 2.2.5).
The Hexagon processor includes dedicated registers and instructions to support a call stack for
subroutine execution.
The stack structure follows standard C conventions.
[Figure: stack structure, from higher to lower addresses]
  Caller's stack frame:   Saved LR
                          Saved FP
                          Procedure local data on stack
  Current stack frame:    Saved LR
                          Saved FP                         (FP register points here)
                          Procedure local data on stack    (SP register marks its lower end)
  Unallocated stack
NOTE: The Hexagon processor supports three dedicated stack instructions: allocframe, deallocframe,
and dealloc_return (Section 7.5).
The SP address must always remain 8-byte aligned for the stack instructions to work properly.
NOTE: For leaf functions it is often unnecessary to save FP and LR. In this case, FP contains the frame
pointer of the calling function, not the current function.
NOTE: Stack bounds checking is performed when the processor is in User and Guest modes, but not in
Monitor mode.
NOTE: Each hardware thread has its own instance of the FRAMEKEY register.
NOTE: SP, FP, and LR are aliases of three general registers. These general registers are conventionally
dedicated for use as stack registers.
NOTE: The allocframe instruction stores, and the deallocframe instruction loads, the LR and FP
registers on the stack as a single aligned 64-bit register pair (LR:FP).
Two sets of hardware loop instructions are provided – loop0 and loop1 – to nest hardware loops
one level deep. For example:
// Sum the rows of a 100x200 matrix.
    loop1(outer_start,#100)
outer_start:
    R0 = #0
    loop0(inner_start,#200)
inner_start:
    R3 = memw(R1++#4)
    { R0 = add(R0,R3) }:endloop0
    { memw(R2++#4) = R0 }:endloop1
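For comparison, the same computation written in C is shown below. This is only a reading aid for the nested-loop structure; the function name and the parameterized dimensions are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* C view of the nested loop: one sum per row of a rows x cols matrix
 * stored row-major in `in`. */
void sum_rows(const int32_t *in, int32_t *out, size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; r++) {        /* outer loop (loop1) */
        int32_t sum = 0;
        for (size_t c = 0; c < cols; c++) {    /* inner loop (loop0) */
            sum += *in++;
        }
        *out++ = sum;
    }
}
```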
NOTE: If a program must create loops nested more than one level deep, the two innermost loops can be
implemented as hardware loops, with the remaining outer loops implemented as software
branches.
In this example, the hardware loop (consisting of a single multiply instruction) executes three
times. The loop0 instruction sets register SA0 to the address value of label start, and LC0 to 3.
Loop counts are limited to the range 0 to 1023 when they are expressed as immediate values in
loopN. If the desired loop count exceeds this range, it must be specified as a register value. For
example:
Using loopN:
R1 = #20000;
loop0(start,R1) // LC0=20000, SA0=&start
start:
{ R0 = mpyi(R0,R0) } :endloop0
If a loopN instruction is located too far from its loop start address, the PC-relative offset value that
specifies the start address can exceed the maximum range of the instruction’s start-address
operand. If this occurs, either move the loopN instruction closer to the loop start, or specify the
loop start address as a 32-bit constant (Section 10.9).
For example, using 32-bit constants:
R1 = #20000;
loop0(##start,R1) // LC0=20000, SA0=&start
...
The last instruction in the loop must always be expressed in assembly language as a packet (using
curly braces), even if it is the only instruction in the packet.
Nested hardware loops can specify the same instruction as the end of both the inner and outer
loops. For example:
// Sum the rows of a 100x200 matrix.
// Software pipeline the outer loop.
    p0 = cmp.gt(R0,R0)              // p0 = false
    loop1(outer_start,#100)
outer_start:
    { if (p0) memw(R2++#4) = R0
      p0 = cmp.eq(R0,R0)            // p0 = true
      R0 = #0
      loop0(inner_start,#200) }
inner_start:
    R3 = memw(R1++#4)
    { R0 = add(R0,R3) }:endloop0:endloop1
    memw(R2++#4) = R0
Though endloopN behaves like a regular instruction (by implementing the loop test and branch),
it does not execute in any instruction slot, and does not count as an instruction in the packet.
Therefore a single instruction packet which is marked as a loop end can perform up to six
operations:
■ Four regular instructions (the normal limit for an instruction packet)
■ The endloop0 test and branch
■ The endloop1 test and branch
NOTE: The endloopN instruction is encoded in the instruction packet (Section 10.6).
In this example a hardware loop is set up with the loop count in R1, but if the value in R1 is zero a
software branch skips over the loop body.
After the loop end instruction of a hardware loop is executed, the Hexagon processor examines
the value in the corresponding loop count register:
■ If the value is greater than 1, the processor decrements the loop count register and performs
a zero-cycle branch to the loop start address.
■ If the value is less than or equal to 1, the processor resumes program execution at the
instruction immediately following the loop end instruction.
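The loop-end rule above implies that the loop body always runs at least once, because the count is tested only after the body has executed. A small C model (the function name is hypothetical) makes this visible:

```c
#include <stdint.h>

/* Counts how often a loop body runs for an initial loop count, following
 * the loop-end rule above. The body always runs before the first test. */
uint32_t hw_loop_body_runs(uint32_t loop_count)
{
    uint32_t runs = 0;
    uint32_t lc = loop_count;
    for (;;) {
        runs++;              /* loop body, ending in the loop end packet */
        if (lc > 1) {
            lc--;            /* decrement and branch back to the start */
        } else {
            break;           /* fall through to the next instruction */
        }
    }
    return runs;
}
```

In this model a loop count of 0 still executes the body once, which is why a zero count needs a software branch around the loop.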
NOTE: Because nested hardware loops can share the same loop end instruction, the processor can
examine both loop count registers in a single operation.
foo:
    { R3 = R1
      loop0(.kernel,#98)        // Decrease loop count by 2
    }
    R1 = memw(R0++#4)           // First prologue stage
    { R1 = memw(R0++#4)         // Second prologue stage
      R2 = mpyi(R1,R1)
    }
    .falign
.kernel:
    { R1 = memw(R0++#4)         // Kernel
      R2 = mpyi(R1,R1)
      memw(R3++#4) = R2
    }:endloop0
    { R2 = mpyi(R1,R1)          // First epilogue stage
      memw(R3++#4) = R2
    }
    memw(R3++#4) = R2           // Second epilogue stage
    jumpr lr
In Table 8-2, the kernel section of the pipelined loop performs three iterations of the loop in
parallel:
■ The load for iteration N+2
■ The multiply for iteration N+1
■ The store for iteration N
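The three overlapped stages listed above can also be seen in a C rendering of the same prologue/kernel/epilogue shape. This is a sketch for illustration, not compiler output; the function name is hypothetical, and sequential C statements stand in for the parallel operations of a packet.

```c
#include <stddef.h>
#include <stdint.h>

/* out[i] = in[i] * in[i], written as prologue / kernel / epilogue.
 * Requires n >= 2. */
void square_pipelined(const int32_t *in, int32_t *out, size_t n)
{
    int32_t loaded, product;
    size_t i;

    loaded  = in[0];                    /* first prologue stage */
    product = loaded * loaded;          /* second prologue stage ... */
    loaded  = in[1];                    /* ... together with the next load */

    for (i = 2; i < n; i++) {           /* kernel: n - 2 iterations */
        out[i - 2] = product;           /* store for iteration N */
        product = loaded * loaded;      /* multiply for iteration N+1 */
        loaded  = in[i];                /* load for iteration N+2 */
    }

    out[n - 2] = product;               /* first epilogue stage */
    product = loaded * loaded;
    out[n - 1] = product;               /* second epilogue stage */
}
```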
One drawback to software pipelining is the extra code necessary for the prologue and epilogue
sections of a pipelined loop.
To address this issue, the Hexagon processor provides the spNloop0 instruction, where the “N” in
the instruction name indicates a digit in the range 1 to 3. For example:
P3 = sp2loop0(start,#10) // Set up pipelined loop
The spNloop0 instruction is a variant of the loop0 instruction: it sets up a normal hardware loop
using SA0 and LC0, but also performs the following additional operations:
■ When the spNloop0 instruction executes, it assigns the truth value false to the predicate
register P3.
foo:
    {                                   // load safety assumed
      P3 = sp2loop0(.kernel,#102)       // Set up pipelined loop
      R3 = R1
    }
    .falign
.kernel:
    { R1 = memw(R0++#4)                 // Kernel
      R2 = mpyi(R1,R1)
      if (P3) memw(R3++#4) = R2
    }:endloop0
    jumpr lr
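The effect of predicating the store on P3 can be modeled in C. The sketch below assumes that P3 becomes true once two loop iterations have completed (the surviving text states only that P3 starts out false), and it replaces the loads past the end of the input with 0 instead of relying on load safety. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the sp2loop0 kernel: runs n + 2 times, with stores suppressed
 * until p3 turns true. Assumption: p3 becomes true once two iterations
 * have completed. */
void square_sp2loop(const int32_t *in, int32_t *out, size_t n)
{
    bool p3 = false;
    int32_t loaded = 0, product = 0;
    size_t stored = 0;

    for (size_t k = 0; k < n + 2; k++) {
        /* All three operations read the values from before this packet. */
        int32_t next_loaded  = (k < n) ? in[k] : 0;
        int32_t next_product = loaded * loaded;
        if (p3) {
            out[stored++] = product;
        }
        loaded  = next_loaded;
        product = next_product;
        if (k + 1 >= 2) {
            p3 = true;
        }
    }
}
```

With n = 100 the model runs 102 iterations and performs 100 stores, matching the loop count used in the listing above.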
NOTE: The count value that spNloop0 uses to control the P3 setting is stored in the user status register
USR.LPCFG.
■ The last packet in a hardware loop cannot contain any program flow instructions (including
jumps or calls).
■ The loop end packet in loop0 cannot contain any instruction that changes SA0 or LC0.
Similarly, the loop end packet in loop1 cannot contain any instruction that changes SA1 or
LC1.
■ The loop end packet in spNloop0 cannot contain any instruction that changes P3.
NOTE: SA1 and LC1 can be changed at the end of loop0, while SA0 and LC0 can be changed at the end of
loop1.
8.3.1 Jumps
Jump instructions change the program flow to a target address, which is specified by either a
register or a PC-relative immediate value. Jump instructions can be conditional based on the
value of a predicate expression.
8.3.2 Calls
Call instructions jump to subroutines. The instruction performs a jump to the target address and
also stores the return address in the link register LR.
The forms of call are functionally similar to jump instructions and include both PC-relative and
register indirect in both unconditional and conditional forms.
8.3.3 Returns
Return instructions return from a subroutine. The instruction performs an indirect jump to the
subroutine return address stored in link register LR.
Returns are implemented as jump register indirect instructions, and support both unconditional
and conditional forms.
NOTE: The link register LR is an alias of general register R31. Therefore subroutine returns can be
performed with the instruction jumpr R31.
NOTE: Such instructions use an extra word to store the 32-bit offset (Section 10.9).
The size of a PC-relative branch offset is expressed in assembly language by optionally prefixing
the target label with the symbol “##” or “#”:
■ “##” specifies that the assembler must use a 32-bit offset
■ “#” specifies that the assembler must not use a 32-bit offset
■ No “#” specifies that the assembler uses a 32-bit offset only if necessary
For example:
jump ##label // 32-bit offset
call #label // Non 32-bit offset
jump label // Offset size determined by assembler
Speculative jumps require the programmer to specify a direction hint in the jump instruction,
indicating whether the conditional jump is expected to be taken.
The hint initializes the dynamic branch predictor of the Hexagon processor. Whenever the
predictor is wrong, the speculative jump instruction takes two cycles to execute instead of one
(due to a pipeline stall).
Hints can improve program performance by indicating how speculative jumps are expected to
execute over the course of a program: the more often the specified hint indicates how the
instruction actually executes, the better the performance.
Hints are expressed in assembly language by appending the suffix “:t” or “:nt” to the jump
instruction symbol. For example:
■ jump:t – The jump instruction is most often taken
■ jump:nt – The jump instruction is most often not taken
In addition to dot-new predicates, speculative jumps also accept conditional arithmetic
expressions (=0, !=0, >=0, <=0) involving the general register Rs.
NOTE: The hints :t and :nt interact with the predicate value to determine the instruction cycle count.
The register operands used in a compare jump are limited to R0 through R7 or R16 through R23
(Table 10-3).
The compare and jump instructions that are used in a compare jump are limited to the
instructions listed in Table 8-9. The compare can use predicate P0 or P1, while the jump must
specify the same predicate that is set in the compare.
A compare jump instruction is expressed in assembly source as two independent compare and
jump instructions in a packet. The assembler translates the two instructions into a single
compound instruction.
{
R0 = memw(R2+#8)
if (cmp.eq(R0.new,#0)) jump:nt target
}
NOTE: New-value compare jump instructions are assigned to instruction class NV, which can execute
only in Slot 0. The instruction that assigns the new value must execute in Slot 1, 2, or 3.
The source and target register operands in the register transfer are limited to R0 through R7 or
R16 through R23 (Table 10-3).
The target address in the jump is a scaled 9-bit PC-relative address value (as opposed to the 22-bit
value in the regular unconditional jump instruction).
A register transfer jump instruction is expressed in assembly source as two independent
instructions in a packet. The assembler translates the instructions into a single compound
instruction.
NOTE: If a call is ignored in a dual jump, the link register LR is not changed.
The hintjr instruction indicates that the program is about to execute a jumpr to the address
contained in the specified register.
NOTE: To prevent a stall, the hintjr instruction must execute at least 2 packets before the corresponding
jumpr instruction.
The hintjr instruction is not needed for jumpr instructions used as returns (Section 8.3.3),
because in this case the Hexagon processor automatically predicts the jump targets based on the
most recent nested call instructions.
8.9 Pauses
Pauses suspend the execution of a program for a period of time, and put it into low-power mode.
The program remains suspended for the duration specified in the instruction.
The pause instruction accepts an unsigned 8-bit immediate operand which specifies the pause
duration in terms of cycles. The maximum possible duration is 263 cycles (255+8).
Hexagon processor interrupts cause a program to exit the paused state before its specified
duration has elapsed.
The pause instruction is useful for implementing user-level low-power synchronization operations
(such as spin locks).
8.10 Exceptions
Exceptions are internally-generated disruptions to the program flow.
The Hexagon processor OS handles fatal exceptions by terminating the execution of the
application system. The user is responsible for fixing the problem and recompiling their
applications.
The error messages generated by exceptions include the following information to assist in
locating the problem:
■ Cause code – Hexadecimal value indicating the type of exception
■ User IP – PC value indicating the instruction executed when the exception occurred
■ Bad VA – Virtual address indicating the data accessed when the exception occurred
NOTE: The cause code, user IP, and Bad VA values are stored in the Hexagon processor system control
registers SSR[7:0], ELR, and BADVA respectively.
If multiple exceptions occur simultaneously, the exception with the lowest error code value has
the highest exception priority.
If a packet contains multiple loads, or a load and a store, and both operations have an exception
of any type, all slot 1 exceptions are processed before any slot 0 exception.
The Hexagon processor can collect execution statistics on the applications it executes. The
statistics summarize the various types of Hexagon processor events that occurred while the
application was running.
Execution statistics are collected in hardware or software:
■ Statistics are collected in hardware with the performance monitor unit (PMU), which is
defined as part of the Hexagon processor architecture.
■ Statistics are collected in software using the Hexagon simulator. The simulator statistics are
presented in the same format used by the PMU.
Execution statistics are expressed in terms of processor events. This chapter defines the event
symbols, along with their associated numeric codes.
NOTE: Because the types of execution events vary across the Hexagon processor versions, different
types of statistics are collected for each version. This chapter lists the event symbols defined for
version V73.
10.1 Instructions
Hexagon processor instructions are encoded in a 32-bit instruction word. The instruction word
format varies according to the instruction type.
The instruction words contain two types of bit fields:
■ Common fields appear in every processor instruction, and are defined the same in all
instructions.
■ Instruction-specific fields appear only in some instructions, or vary in definition across the
instruction set.
NOTE: In some cases, instruction-specific fields encode instruction attributes other than the ones
described for the fields in Table 10-1.
Reserved bits
Some instructions contain reserved bits that do not currently encode instruction attributes.
Always set these bits to 0 to ensure compatibility with any future changes in the instruction
encoding.
NOTE: Reserved bits appear as ‘-’ characters in the instruction encoding tables.
10.2 Sub-instructions
To reduce code size, the Hexagon processor supports the encoding of certain pairs of instructions
in a single 32-bit container. Instructions encoded this way are sub-instructions, and the containers
are duplexes (Section 10.3).
Sub-instructions are limited to certain commonly-used instructions:
■ Arithmetic and logical operations
■ Register transfer
■ Loads and stores
■ Stack frame allocation/deallocation
■ Subroutine return
Table 10-2 lists the sub-instructions along with the group identifiers that encode them in
duplexes.
Sub-instructions can access only a subset of the general registers (R0 to R7, R16 to R23).
Table 10-3 lists the sub-instruction register encodings.
10.3 Duplexes
A duplex is encoded as a 32-bit instruction with bits [15:14] set to 00. The sub-instructions that
comprise a duplex are encoded as 13-bit fields in the duplex.
An instruction packet can contain one duplex and up to two other (non-duplex) instructions. The
duplex must always appear as the last word in a packet.
The sub-instructions in a duplex always execute in slot 0 and slot 1.
The duplex ICLASS field values that specify the group of each sub-instruction in a duplex are
shown in Table 10-5.
1 The sub-instruction register and immediate fields are assumed to be 0 when performing this comparison.
For details on encoding the individual class types, see Chapter 11.
[Figure: 32-bit instruction word (bits 31 to 0) with the 2-bit Parse field (P P) in bits 15:14]
The following examples show how to use the Parse field to encode instruction packets:
{ A ; B}
01 11 // Parse fields of instructions A,B
{ A ; B ; C}
01 01 11 // Parse fields of instructions A,B,C
{ A ; B ; C ; D}
01 01 01 11 // Parse fields of instructions A,B,C,D
The following examples show how to use the Parse field to encode loop packets:
{ A B}:endloop0
10 11 // Parse fields of instrs A,B
{ A B C}:endloop0
10 01 11 // Parse fields of instrs A,B,C
{ A B C D}:endloop0
10 01 01 11 // Parse fields of instrs A,B,C,D
{ A B C}:endloop1
01 10 11 // Parse fields of instrs A,B,C
{ A B C D}:endloop1
01 10 01 11 // Parse fields of instrs A,B,C,D
{ A B C}:endloop0:endloop1
10 10 11 // Parse fields of instrs A,B,C
{ A B C D}:endloop0:endloop1
10 10 01 11 // Parse fields of instrs A,B,C,D
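The Parse-field patterns above can be decoded mechanically. The following sketch assumes the Parse field occupies bits [15:14] of each instruction word (the same bits that identify a duplex when set to 00) and infers the loop-end rules from the examples; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Parse field: assumed to be bits [15:14] of each 32-bit instruction word. */
static unsigned parse_bits(uint32_t word)
{
    return (word >> 14) & 0x3u;
}

/* Number of words in the packet starting at words[0], or 0 if no end marker
 * is found within max_words. Parse value 11 ends a packet; 00 marks a
 * duplex, which is always the last word of its packet. */
size_t packet_length(const uint32_t *words, size_t max_words)
{
    for (size_t i = 0; i < max_words; i++) {
        unsigned p = parse_bits(words[i]);
        if (p == 0x3u || p == 0x0u) {
            return i + 1;
        }
    }
    return 0;
}

/* Per the examples: 10 in the first word marks :endloop0, and 10 in the
 * second word marks :endloop1. Both expect a packet of two or more words. */
int packet_ends_loop0(const uint32_t *w) { return parse_bits(w[0]) == 0x2u; }
int packet_ends_loop1(const uint32_t *w) { return parse_bits(w[1]) == 0x2u; }
```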
NOTE: The scaled immediate value in the example above is represented notationally as #s4:2.
Scaled immediate values commonly encode address offsets that apply to data types of varying
size. For example, Table 10-8 shows how the byte offsets used in immediate-with-offset
addressing mode are stored as 11-bit scaled immediate values. This enables the offsets to
span the same range of data elements regardless of the data type.
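Decoding a signed scaled immediate amounts to sign-extending the stored field and then shifting it left by the scale. The sketch below follows the #sB:S notation, where B is the field width in bits and S is the shift; the function name is hypothetical.

```c
#include <stdint.h>

/* Decode a signed scaled immediate written #sB:S: sign-extend the B-bit
 * field, then scale by 2^S. Requires 1 <= bits <= 31. */
int32_t decode_scaled_signed(uint32_t field, unsigned bits, unsigned shift)
{
    uint32_t mask = (1u << bits) - 1u;
    uint32_t sign = 1u << (bits - 1);
    int32_t value = (int32_t)((field & mask) ^ sign) - (int32_t)sign;
    return value * (int32_t)(1u << shift);
}
```

For example, an 11-bit field scaled by 8 spans -8192 to 8184, which matches the normal doubleword offset range.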
A constant extender is encoded as a 32-bit instruction with the 4-bit ICLASS field set to 0 and the
2-bit Parse field set to its usual value (Section 10.5). The remaining 26 bits in the instruction word
store the data bits that are prepended to an operand as small as 6 bits to create a full 32-bit value.
Within a packet, a constant extender must be positioned immediately before the instruction that
it extends: in terms of memory addresses, the extender word must reside at address
(<instr_address> - 4).
The constant extender effectively serves as a prefix for an instruction: it does not execute in a
slot, nor does it consume any slot resources. All packets must contain four or fewer words, and
the constant extender occupies one word.
If the instruction operand to extend is longer than 6 bits, the overlapping bits in the base
instruction must be encoded as zeros. The value in the constant extender always supplies the
upper 26 bits.
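The way the two parts combine can be sketched as follows. The code models only the arithmetic (upper 26 bits from the extender, lower 6 bits from the base instruction), not how those bits are laid out inside the instruction words; the function names are hypothetical.

```c
#include <stdint.h>

/* Build the 32-bit constant: the extender supplies the upper 26 bits and
 * the base instruction supplies the lower 6 bits. */
uint32_t extended_constant(uint32_t ext26, uint32_t base_imm)
{
    return ((ext26 & 0x03FFFFFFu) << 6) | (base_imm & 0x3Fu);
}

/* Split a 32-bit value into the two parts an assembler would encode. */
void split_constant(uint32_t value, uint32_t *ext26, uint32_t *low6)
{
    *ext26 = value >> 6;
    *low6  = value & 0x3Fu;
}
```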
The Regclass field in Table 10-10 lists the values that bits [27:24] of the instruction word must
hold to identify the instruction as one that might include a constant extender.
NOTE: When the base instruction encodes two constant operands, the extended immediate is the one
specified in the table.
Constant extenders appear in disassembly listings as Hexagon instructions with the name
immext.
NOTE: If a constant extender is encoded in a packet for an instruction that does not accept a constant
extender, the execution result is undefined. The assembler normally ensures that only valid
constant extenders are generated.
Two methods exist for encoding a 32-bit absolute address in a load or store instruction:
1. For unconditional load/stores, the GP-relative load/store instruction is used. The assembler
encodes the absolute 32-bit address as follows:
❒ The upper 26 bits are encoded in a constant extender
❒ The lower 6 bits are encoded in the 6 operand bits contained in the GP-relative
instruction
In this case the 32-bit value encoded must be a plain address, and the value stored in the GP
register is ignored.
NOTE: When a constant extender is explicitly specified with a GP-relative load/store, the
processor ignores the value in GP and creates the effective address directly from the
32-bit constant value.
2. For conditional load/store instructions that have their base address encoded only by a 6-bit
immediate operand, a constant extender must be explicitly specified; otherwise, the
execution result is undefined. The assembler ensures that these instructions always include a
constant extender.
This case applies also to instructions that use the absolute-set addressing mode or absolute-
plus-register-offset addressing mode.
The immediate operands of certain instructions use scaled immediates (Section 10.8) to increase
their addressable range. When using constant extenders, scaled immediates are not scaled by the
processor. Instead, the assembler must encode the full 32-bit unscaled value as follows:
■ The upper 26 bits are encoded in the constant extender
■ The lower 6 bits are encoded in the base instruction, in the least-significant bit positions of
the immediate operand field.
■ Any overlapping bits in the base instruction are encoded as zeros.
When a jump/call has a constant extender, the resulting target address is forced to a 32-bit
alignment (bits 1:0 in the address are cleared by hardware). The resulting jump/call operation
never causes an alignment violation.
“ahead” is defined here as the instruction encoded at a lower memory address than the
consumer instruction, not counting empty slots or constant extenders. For example, the following
producer/consumer relationship is encoded with Nt[2:1] set to 01.
...
<producer instruction word>
<consumer constant extender word>
<consumer instruction word>
...
NOTE: Instructions with 64-bit register pair destinations cannot produce new-values. The assembler
flags this case with an error, as the result is undefined.
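The counting rule for "ahead" can be sketched in Python as follows. This is a minimal model under stated assumptions: the packet is represented as a list of hypothetical word kinds ("insn", "immext", "empty") in ascending address order, and the count it returns is assumed to be the value placed in Nt[2:1]. The text above confirms this only for the value 01.

```python
def producer_distance(packet, producer, consumer):
    """Count instruction words from the producer up to the consumer.

    Constant extenders and empty slots are skipped, as the definition of
    "ahead" requires. Indexes refer to positions in the packet list.
    """
    if producer >= consumer:
        raise ValueError("producer must sit at a lower address than the consumer")
    if packet[producer] != "insn" or packet[consumer] != "insn":
        raise ValueError("producer and consumer must both be instruction words")
    return sum(1 for kind in packet[producer:consumer] if kind == "insn")

# The layout from the example: producer, consumer's extender, consumer.
example = ["insn", "immext", "insn"]
distance = producer_distance(example, 0, 2)
```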
This chapter describes the instruction set for version 73 (V73) of the Hexagon processor.
The instructions are listed alphabetically within instruction categories. The following information
is provided for each instruction:
■ Instruction name
■ A brief description of the instruction
■ A high-level functional description (syntax and behavior) with possible operand types
■ Instruction class and slot information for grouping instructions in packets
■ C intrinsic functions that provide access to the instruction
■ Instruction encoding
11.1 ALU32
The ALU32 instruction class includes instructions that perform arithmetic and logical operations
on 32-bit data.
ALU32 instructions are executable on any slot.
Add
Add a source register either to another source register or to a signed 16-bit immediate value.
Store the result in the destination register. Source and destination registers are 32 bits. If the
result overflows 32 bits, it wraps around. Optionally, saturate the result to a signed value between
0x80000000 and 0x7fffffff.
For 64-bit versions of this operation, see the XTYPE add instructions.
Syntax Behavior
Rd=add(Rs,#s16) apply_extension(#s);
Rd=Rs+#s;
Rd=add(Rs,Rt) Rd=Rs+Rt;
Rd=add(Rs,Rt):sat Rd=sat32(Rs+Rt);
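The difference between the wrapping and saturating forms can be sketched in Python. Registers are modeled as unsigned 32-bit integers; the helper names are hypothetical and the sketch does not model the status register.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def add32(rs, rt):
    """Rd=add(Rs,Rt): the sum wraps around on overflow."""
    return (rs + rt) & MASK32

def add32_sat(rs, rt):
    """Rd=add(Rs,Rt):sat: the signed sum is clamped to the 32-bit range."""
    total = to_signed32(rs) + to_signed32(rt)
    total = max(-0x80000000, min(0x7FFFFFFF, total))
    return total & MASK32
```

For example, adding 1 to 0x7fffffff wraps to 0x80000000 in the plain form and stays at 0x7fffffff in the saturating form.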
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse d5
1 0 1 1 i i i i i i i s s s s s P P i i i i i i i i i d d d d d Rd=add(Rs,#s16)
ICLASS P MajOp MinOp s5 Parse t5 d5
1 1 1 1 0 0 1 1 0 0 0 s s s s s P P - t t t t t - - - d d d d d Rd=add(Rs,Rt)
1 1 1 1 0 1 1 0 0 1 0 s s s s s P P - t t t t t - - - d d d d d Rd=add(Rs,Rt):sat
Logical operations
Perform bitwise logical operations (AND, OR, XOR, NOT) either on two source registers or on a
source register and a signed 10-bit immediate value. Store result in destination register. Source
and destination registers are 32 bits.
For 64-bit versions of these operations, see the XTYPE logical instructions.
Syntax Behavior
Rd=and(Rs,#s10) apply_extension(#s);
Rd=Rs&#s;
Rd=and(Rs,Rt) Rd=Rs&Rt;
Rd=and(Rt,~Rs) Rd = (Rt & ~Rs);
Rd=not(Rs) Assembler mapped to: "Rd=sub(#-1,Rs)"
Rd=or(Rs,#s10) apply_extension(#s);
Rd=Rs|#s;
Rd=or(Rs,Rt) Rd=Rs|Rt;
Rd=or(Rt,~Rs) Rd = (Rt | ~Rs);
Rd=xor(Rs,Rt) Rd=Rs^Rt;
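The assembler mapping of not to sub(#-1,Rs) relies on the two's-complement identity -1 - x = ~x. The Python sketch below checks that identity and models and(Rt,~Rs); the function names are hypothetical and registers are modeled as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF

def sub_imm(imm, rs):
    """Rd=sub(#s,Rs): immediate minus register, wrapped to 32 bits."""
    return (imm - rs) & MASK32

def not32(rs):
    """Rd=not(Rs), expressed through its mapping to Rd=sub(#-1,Rs)."""
    return sub_imm(-1, rs)

def and_not(rt, rs):
    """Rd=and(Rt,~Rs): clear in Rt every bit that is set in Rs."""
    return rt & ~rs & MASK32
```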
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse d5
0 1 1 1 0 1 1 0 0 0 i s s s s s P P i i i i i i i i i d d d d d Rd=and(Rs,#s10)
0 1 1 1 0 1 1 0 1 0 i s s s s s P P i i i i i i i i i d d d d d Rd=or(Rs,#s10)
ICLASS P MajOp MinOp s5 Parse t5 d5
1 1 1 1 0 0 0 1 0 0 0 s s s s s P P - t t t t t - - - d d d d d Rd=and(Rs,Rt)
1 1 1 1 0 0 0 1 0 0 1 s s s s s P P - t t t t t - - - d d d d d Rd=or(Rs,Rt)
1 1 1 1 0 0 0 1 0 1 1 s s s s s P P - t t t t t - - - d d d d d Rd=xor(Rs,Rt)
1 1 1 1 0 0 0 1 1 0 0 s s s s s P P - t t t t t - - - d d d d d Rd=and(Rt,~Rs)
1 1 1 1 0 0 0 1 1 0 1 s s s s s P P - t t t t t - - - d d d d d Rd=or(Rt,~Rs)
Negate
Perform arithmetic negation on a source register. Store result in destination register. Source and
destination registers are 32 bits.
For 64-bit and saturating versions of this instruction, see the XTYPE-class negate instructions.
Syntax Behavior
Rd=neg(Rs) Assembler mapped to: "Rd=sub(#0,Rs)"
Class: N/A
Intrinsics
NOP
Perform no operation. This instruction is used for padding and alignment.
Within a packet, it can be placed in any slot (0 through 3).
Syntax Behavior
nop
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp Parse
0 1 1 1 1 1 1 1 - - - - - - - - P P - - - - - - - - - - - - - - nop
Subtract
Subtract a source register from either another source register or a signed 10-bit immediate
value. Store the result in the destination register. Source and destination registers are 32 bits. If
the result underflows 32 bits, it wraps around. Optionally, saturate the result to a signed value
between 0x80000000 and 0x7fffffff.
For 64-bit versions of this operation, see the XTYPE subtract instructions.
Syntax Behavior
Rd=sub(#s10,Rs) apply_extension(#s);
Rd=#s-Rs;
Rd=sub(Rt,Rs) Rd=Rt-Rs;
Rd=sub(Rt,Rs):sat Rd=sat32(Rt - Rs);
Notes
■ If saturation occurs during execution of this instruction (a result is clamped to either
maximum or minimum values), the OVF bit in the status register is set. OVF remains set until
explicitly cleared by a transfer to the status register.
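A minimal Python sketch of the subtract forms follows. It models registers as unsigned 32-bit integers and reports saturation as a boolean that a caller could OR into a sticky overflow flag; the function names and the flag handling are illustrative assumptions, not the processor's implementation.

```python
MASK32 = 0xFFFFFFFF
INT32_MIN, INT32_MAX = -0x80000000, 0x7FFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def sub32(rt, rs):
    """Rd=sub(Rt,Rs): Rt minus Rs, wrapping on underflow."""
    return (rt - rs) & MASK32

def sub32_sat(rt, rs):
    """Rd=sub(Rt,Rs):sat: returns (result, saturated)."""
    diff = to_signed32(rt) - to_signed32(rs)
    clamped = max(INT32_MIN, min(INT32_MAX, diff))
    return clamped & MASK32, clamped != diff

ovf = False                      # stays set once any operation saturates
result, saturated = sub32_sat(0x80000000, 1)
ovf = ovf or saturated
```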
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse d5
0 1 1 1 0 1 1 0 0 1 i s s s s s P P i i i i i i i i i d d d d d Rd=sub(#s10,Rs)
ICLASS P MajOp MinOp s5 Parse t5 d5
1 1 1 1 0 0 1 1 0 0 1 s s s s s P P - t t t t t - - - d d d d d Rd=sub(Rt,Rs)
1 1 1 1 0 1 1 0 1 1 0 s s s s s P P - t t t t t - - - d d d d d Rd=sub(Rt,Rs):sat
Sign extend
Sign-extend the least-significant byte or halfword from the source register and place the 32-bit
result in the destination register.
Figure: Rd=sxth(Rs) and Rd=sxtb(Rs) copy the low halfword or byte of Rs into Rd and fill the
remaining upper bits of Rd by sign extension.
Syntax Behavior
Rd=sxtb(Rs) Rd = sxt8->32(Rs);
Rd=sxth(Rs) Rd = sxt16->32(Rs);
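The two sign-extension behaviors can be sketched in Python as follows, with registers modeled as unsigned 32-bit integers. The function names simply mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF

def sxtb(rs):
    """Rd=sxtb(Rs): sign-extend the least-significant byte to 32 bits."""
    byte = rs & 0xFF
    return (byte - 0x100 if byte & 0x80 else byte) & MASK32

def sxth(rs):
    """Rd=sxth(Rs): sign-extend the least-significant halfword to 32 bits."""
    half = rs & 0xFFFF
    return (half - 0x10000 if half & 0x8000 else half) & MASK32
```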
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse C d5
0 1 1 1 0 0 0 0 1 0 1 s s s s s P P 0 - - - - - - - - d d d d d Rd=sxtb(Rs)
0 1 1 1 0 0 0 0 1 1 1 s s s s s P P 0 - - - - - - - - d d d d d Rd=sxth(Rs)
Transfer immediate
Assign an immediate value to a 32-bit destination register.
Two types of assignment are supported. The first sign-extends a 16-bit signed immediate value to
32 bits. The second assigns a 16-bit unsigned immediate value to either the upper or lower 16 bits
of the destination register, leaving the other 16 bits unchanged.
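Figure: Rd=#s16 sign-extends the 16-bit immediate into Rd. Rx.H=#u16 writes the immediate to the
upper 16 bits and leaves the lower 16 bits unchanged. Rx.L=#u16 writes the immediate to the lower
16 bits and leaves the upper 16 bits unchanged.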
Syntax Behavior
Rd=#s16 apply_extension(#s);
Rd=#s;
Rdd=#s8 if ("#s8<0") {
Assembler mapped to: "Rdd=combine(#-1,#s8)";
} else {
Assembler mapped to: "Rdd=combine(#0,#s8)";
}
Rx.[HL]=#u16 Rx.h[01]=#u;
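The halfword forms allow a full 32-bit value to be built with two transfers. The Python sketch below models the three assignment types on unsigned 32-bit register values; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def transfer_s16(imm):
    """Rd=#s16: sign-extend a signed immediate (-32768..32767) to 32 bits."""
    return imm & MASK32

def set_high(rx, u16):
    """Rx.H=#u16: replace the upper halfword, keep the lower halfword."""
    return ((u16 & 0xFFFF) << 16) | (rx & 0xFFFF)

def set_low(rx, u16):
    """Rx.L=#u16: replace the lower halfword, keep the upper halfword."""
    return (rx & 0xFFFF0000) | (u16 & 0xFFFF)

# Build a 32-bit constant from two halfword transfers.
r0 = set_low(set_high(0, 0xDEAD), 0xBEEF)
```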
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp x5 Parse
0 1 1 1 0 0 0 1 i i 1 x x x x x P P i i i i i i i i i i i i i i Rx.L=#u16
0 1 1 1 0 0 1 0 i i 1 x x x x x P P i i i i i i i i i i i i i i Rx.H=#u16
ICLASS Rs MajOp MinOp Parse d5
0 1 1 1 1 0 0 0 i i - i i i i i P P i i i i i i i i i d d d d d Rd=#s16
Transfer register
Transfer a source register to a destination register. Source and destination registers are either 32
bits or 64 bits.
Syntax Behavior
Rd=Rs Rd=Rs;
Rdd=Rss Assembler mapped to:
"Rdd=combine(Rss.H32,Rss.L32)"
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse C d5
0 1 1 1 0 0 0 0 0 1 1 s s s s s P P 0 - - - - - - - - d d d d d Rd=Rs
Syntax Behavior
Rd=vaddh(Rs,Rt)[:sat] for (i=0;i<2;i++) {
Rd.h[i]=[sat16](Rs.h[i]+Rt.h[i]);
}
Rd=vadduh(Rs,Rt):sat for (i=0;i<2;i++) {
Rd.h[i]=usat16(Rs.uh[i]+Rt.uh[i]);
}
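The per-halfword loops above can be sketched in Python as follows. Registers are modeled as unsigned 32-bit integers holding two halfwords; the helper names are hypothetical, and sat16/usat16 are assumed to clamp to the signed and unsigned 16-bit ranges respectively.

```python
def halves(r):
    """Return [h[0], h[1]] as unsigned 16-bit values."""
    return [r & 0xFFFF, (r >> 16) & 0xFFFF]

def s16(h):
    """Interpret a 16-bit pattern as a signed value."""
    return h - 0x10000 if h & 0x8000 else h

def pack(h0, h1):
    """Assemble a 32-bit register from two halfwords."""
    return ((h1 & 0xFFFF) << 16) | (h0 & 0xFFFF)

def vaddh(rs, rt, sat=False):
    """Rd=vaddh(Rs,Rt)[:sat]: signed halfword add, optionally saturated."""
    out = []
    for a, b in zip(halves(rs), halves(rt)):
        total = s16(a) + s16(b)
        if sat:
            total = max(-0x8000, min(0x7FFF, total))
        out.append(total & 0xFFFF)
    return pack(*out)

def vadduh_sat(rs, rt):
    """Rd=vadduh(Rs,Rt):sat: unsigned halfword add clamped at 0xFFFF."""
    return pack(*[min(0xFFFF, a + b) for a, b in zip(halves(rs), halves(rt))])
```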
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS P MajOp MinOp s5 Parse t5 d5
1 1 1 1 0 1 1 0 0 0 0 s s s s s P P - t t t t t - - - d d d d d Rd=vaddh(Rs,Rt)
1 1 1 1 0 1 1 0 0 0 1 s s s s s P P - t t t t t - - - d d d d d Rd=vaddh(Rs,Rt):sat
1 1 1 1 0 1 1 0 0 1 1 s s s s s P P - t t t t t - - - d d d d d Rd=vadduh(Rs,Rt):sat
Syntax Behavior
Rd=vavgh(Rs,Rt) for (i=0;i<2;i++) {
Rd.h[i]=((Rs.h[i]+Rt.h[i])>>1);
}
Rd=vavgh(Rs,Rt):rnd for (i=0;i<2;i++) {
Rd.h[i]=((Rs.h[i]+Rt.h[i]+1)>>1);
}
Rd=vnavgh(Rt,Rs) for (i=0;i<2;i++) {
Rd.h[i]=((Rt.h[i]-Rs.h[i])>>1);
}
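A Python sketch of the averaging loops follows. It assumes that Rs.h[i] denotes a signed halfword, that the sum is formed at wider precision before the shift, and that the shift is arithmetic; the function names mirror the mnemonics and the helpers are hypothetical.

```python
def s16(h):
    """Interpret a 16-bit pattern as a signed value."""
    return h - 0x10000 if h & 0x8000 else h

def lanes(r):
    """Return the two signed halfwords [h[0], h[1]] of a 32-bit register."""
    return [s16(r & 0xFFFF), s16((r >> 16) & 0xFFFF)]

def pack(h0, h1):
    """Assemble a 32-bit register from two halfword results."""
    return ((h1 & 0xFFFF) << 16) | (h0 & 0xFFFF)

def vavgh(rs, rt, rnd=False):
    """Rd=vavgh(Rs,Rt)[:rnd]: per-halfword average, optionally rounded."""
    bias = 1 if rnd else 0
    return pack(*[(a + b + bias) >> 1 for a, b in zip(lanes(rs), lanes(rt))])

def vnavgh(rt, rs):
    """Rd=vnavgh(Rt,Rs): per-halfword (Rt - Rs) >> 1."""
    return pack(*[(t - s) >> 1 for t, s in zip(lanes(rt), lanes(rs))])
```

The :rnd form adds 1 before the shift, so an odd sum rounds up instead of down.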
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS P MajOp MinOp s5 Parse t5 d5
1 1 1 1 0 1 1 1 - 0 0 s s s s s P P - t t t t t - - - d d d d d Rd=vavgh(Rs,Rt)
1 1 1 1 0 1 1 1 - 0 1 s s s s s P P - t t t t t - - - d d d d d Rd=vavgh(Rs,Rt):rnd
1 1 1 1 0 1 1 1 - 1 1 s s s s s P P - t t t t t - - - d d d d d Rd=vnavgh(Rt,Rs)
Syntax Behavior
Rd=vsubh(Rt,Rs)[:sat] for (i=0;i<2;i++) {
Rd.h[i]=[sat16](Rt.h[i]-Rs.h[i]);
}
Rd=vsubuh(Rt,Rs):sat for (i=0;i<2;i++) {
Rd.h[i]=usat16(Rt.uh[i]-Rs.uh[i]);
}
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS P MajOp MinOp s5 Parse t5 d5
1 1 1 1 0 1 1 0 1 0 0 s s s s s P P - t t t t t - - - d d d d d Rd=vsubh(Rt,Rs)
1 1 1 1 0 1 1 0 1 0 1 s s s s s P P - t t t t t - - - d d d d d Rd=vsubh(Rt,Rs):sat
1 1 1 1 0 1 1 0 1 1 1 s s s s s P P - t t t t t - - - d d d d d Rd=vsubuh(Rt,Rs):sat
Zero extend
Zero-extend the least significant byte or halfword from Rs and place the 32-bit result in Rd.
Figure: Rd=zxth(Rs) fills the upper 16 bits of Rd with 0x0000. Rd=zxtb(Rs) fills the upper 24 bits
of Rd with 0x000000.
Syntax Behavior
Rd=zxtb(Rs) Assembler mapped to: "Rd=and(Rs,#255)"
Rd=zxth(Rs) Rd = zxt16->32(Rs);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse C d5
0 1 1 1 0 0 0 0 1 1 0 s s s s s P P 0 - - - - - - - - d d d d d Rd=zxth(Rs)
Figure: Rd=combine(Rt.[HL],Rs.[HL]). A mux on each source register selects the high or low
halfword that is written to Rd.
Syntax Behavior
Rd=combine(Rt.[HL],Rs.[HL]) Rd = (Rt.uh[01]<<16) | Rs.uh[01];
Rdd=combine(#s8,#S8) apply_extension(#s);
Rdd.w[0]=#S;
Rdd.w[1]=#s;
Rdd=combine(#s8,#U6) apply_extension(#U);
Rdd.w[0]=#U;
Rdd.w[1]=#s;
Rdd=combine(#s8,Rs) apply_extension(#s);
Rdd.w[0]=Rs;
Rdd.w[1]=#s;
Rdd=combine(Rs,#s8) apply_extension(#s);
Rdd.w[0]=#s;
Rdd.w[1]=Rs;
Rdd=combine(Rs,Rt) Rdd.w[0]=Rt;
Rdd.w[1]=Rs;
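The operand ordering of combine is easy to get backwards, so a short Python sketch may help. Register pairs are modeled as unsigned 64-bit integers; the function names and the boolean halfword selectors are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def combine_words(rs, rt):
    """Rdd=combine(Rs,Rt): Rs becomes word 1 (high), Rt becomes word 0 (low)."""
    return ((rs & MASK32) << 32) | (rt & MASK32)

def combine_halves(rt, rs, t_high, s_high):
    """Rd=combine(Rt.[HL],Rs.[HL]): the Rt halfword lands in the upper 16 bits."""
    t = (rt >> 16) & 0xFFFF if t_high else rt & 0xFFFF
    s = (rs >> 16) & 0xFFFF if s_high else rs & 0xFFFF
    return (t << 16) | s
```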
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse d5
0 1 1 1 0 0 1 1 - 0 0 s s s s s P P 1 i i i i i i i i d d d d d Rdd=combine(Rs,#s8)
0 1 1 1 0 0 1 1 - 0 1 s s s s s P P 1 i i i i i i i i d d d d d Rdd=combine(#s8,Rs)
ICLASS Rs MajOp MinOp Parse d5
0 1 1 1 1 1 0 0 0 I I I I I I I P P I i i i i i i i i d d d d d Rdd=combine(#s8,#S8)
0 1 1 1 1 1 0 0 1 - - I I I I I P P I i i i i i i i i d d d d d Rdd=combine(#s8,#U6)
ICLASS P MajOp MinOp s5 Parse t5 d5
1 1 1 1 0 0 1 1 1 0 0 s s s s s P P - t t t t t - - - d d d d d Rd=combine(Rt.H,Rs.H)
1 1 1 1 0 0 1 1 1 0 1 s s s s s P P - t t t t t - - - d d d d d Rd=combine(Rt.H,Rs.L)
1 1 1 1 0 0 1 1 1 1 0 s s s s s P P - t t t t t - - - d d d d d Rd=combine(Rt.L,Rs.H)
1 1 1 1 0 0 1 1 1 1 1 s s s s s P P - t t t t t - - - d d d d d Rd=combine(Rt.L,Rs.L)
1 1 1 1 0 1 0 1 0 - - s s s s s P P - t t t t t - - - d d d d d Rdd=combine(Rs,Rt)
Mux
Select between two source registers based on the least-significant bit of a predicate register. If the
bit is 1, transfer the first source register to the destination register; otherwise, transfer the second
source register. Source and destination registers are 32 bits.
In a variant of mux, signed 8-bit immediate values are used instead of registers for either or both
source operands.
For 64-bit versions of this instruction, see the XTYPE vmux (Vector mux) instruction.
Syntax Behavior
Rd=mux(Pu,#s8,#S8) PREDUSE_TIMING;
apply_extension(#s);
Rd = (Pu[0] ? #s : #S);
Rd=mux(Pu,#s8,Rs) PREDUSE_TIMING;
apply_extension(#s);
Rd = (Pu[0] ? #s : Rs);
Rd=mux(Pu,Rs,#s8) PREDUSE_TIMING;
apply_extension(#s);
Rd = (Pu[0] ? Rs : #s);
Rd=mux(Pu,Rs,Rt) PREDUSE_TIMING;
Rd = (Pu[0] ? Rs : Rt);
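Only bit 0 of the predicate takes part in the selection. The Python sketch below models that behavior for register or immediate operands alike; the function name mirrors the mnemonic and the operands are plain integers.

```python
MASK32 = 0xFFFFFFFF

def mux(pu, first, second):
    """Rd=mux(Pu,first,second): pick first when Pu[0] is 1, else second."""
    return (first if pu & 1 else second) & MASK32
```

A predicate value such as 0xfe therefore selects the second operand, even though most of its bits are set.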
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp u2 s5 Parse d5
0 1 1 1 0 0 1 1 0 u u s s s s s P P 0 i i i i i i i i d d d d d Rd=mux(Pu,Rs,#s8)
0 1 1 1 0 0 1 1 1 u u s s s s s P P 0 i i i i i i i i d d d d d Rd=mux(Pu,#s8,Rs)
ICLASS Rs u1 Parse d5
0 1 1 1 1 0 1 u u I I I I I I I P P I i i i i i i i i d d d d d Rd=mux(Pu,#s8,#S8)
ICLASS P MajOp s5 Parse t5 u2 d5
1 1 1 1 0 1 0 0 - - - s s s s s P P - t t t t t - u u d d d d d Rd=mux(Pu,Rs,Rt)
Shift word by 16
ASLH performs an arithmetic left shift of the 32-bit source register by 16 bits (one halfword). The
lower 16 bits of the destination are zero-filled.
ASRH performs an arithmetic right shift of the 32-bit source register by 16 bits (one halfword).
The upper 16 bits of the destination are sign-extended.
Syntax Behavior
Rd=aslh(Rs) Rd=Rs<<16;
Rd=asrh(Rs) Rd=Rs>>16;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse C d5
0 1 1 1 0 0 0 0 0 0 0 s s s s s P P 0 - - - - - - - - d d d d d Rd=aslh(Rs)
0 1 1 1 0 0 0 0 0 0 1 s s s s s P P 0 - - - - - - - - d d d d d Rd=asrh(Rs)
Syntax Behavior
Rdd=packhl(Rs,Rt) Rdd.h[0]=Rt.h[0];
Rdd.h[1]=Rs.h[0];
Rdd.h[2]=Rt.h[1];
Rdd.h[3]=Rs.h[1];
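The halfword interleaving performed by packhl can be sketched in Python as follows, with the register pair modeled as an unsigned 64-bit integer. The function name mirrors the mnemonic.

```python
def packhl(rs, rt):
    """Rdd=packhl(Rs,Rt): interleave the halfwords of Rs and Rt."""
    h = [
        rt & 0xFFFF,           # Rdd.h[0] = Rt.h[0]
        rs & 0xFFFF,           # Rdd.h[1] = Rs.h[0]
        (rt >> 16) & 0xFFFF,   # Rdd.h[2] = Rt.h[1]
        (rs >> 16) & 0xFFFF,   # Rdd.h[3] = Rs.h[1]
    ]
    return sum(value << (16 * i) for i, value in enumerate(h))
```

The low word of the result therefore holds the two low halfwords of the sources, and the high word holds the two high halfwords.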
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS P MajOp MinOp s5 Parse t5 d5
1 1 1 1 0 1 0 1 1 - - s s s s s P P - t t t t t - - - d d d d d Rdd=packhl(Rs,Rt)
Conditional add
If the least-significant bit of predicate Pu is set, add a 32-bit source register to either another
register or an immediate value. The result is placed in a 32-bit destination register. If the predicate
is false, the instruction does nothing.
Syntax Behavior
if ([!]Pu[.new]) if([!]Pu[.new][0]){
Rd=add(Rs,#s8) apply_extension(#s);
Rd=Rs+#s;
} else {
NOP;
}
if ([!]Pu[.new]) if([!]Pu[.new][0]){
Rd=add(Rs,Rt) Rd=Rs+Rt;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
0 1 1 1 0 1 0 0 1 u u s s s s s P P 1 i i i i i i i i d d d d d if (!Pu.new) Rd=add(Rs,#s8)
The asrh instruction performs an arithmetic right shift of the 32-bit source register by 16 bits (one
halfword). The upper 16 bits of the destination are sign-extended.
Syntax Behavior
if ([!]Pu[.new]) Rd=aslh(Rs) if([!]Pu[.new][0]){
Rd=Rs<<16;
} else {
NOP;
}
if ([!]Pu[.new]) Rd=asrh(Rs) if([!]Pu[.new][0]){
Rd=Rs>>16;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse C S dn u2 d5
0 1 1 1 0 0 0 0 0 0 0 s s s s s P P 1 - 0 0 u u - - - d d d d d if (Pu) Rd=aslh(Rs)
0 1 1 1 0 0 0 0 0 0 0 s s s s s P P 1 - 0 1 u u - - - d d d d d if (Pu.new) Rd=aslh(Rs)
0 1 1 1 0 0 0 0 0 0 0 s s s s s P P 1 - 1 0 u u - - - d d d d d if (!Pu) Rd=aslh(Rs)
0 1 1 1 0 0 0 0 0 0 0 s s s s s P P 1 - 1 1 u u - - - d d d d d if (!Pu.new) Rd=aslh(Rs)
0 1 1 1 0 0 0 0 0 0 1 s s s s s P P 1 - 0 0 u u - - - d d d d d if (Pu) Rd=asrh(Rs)
0 1 1 1 0 0 0 0 0 0 1 s s s s s P P 1 - 0 1 u u - - - d d d d d if (Pu.new) Rd=asrh(Rs)
0 1 1 1 0 0 0 0 0 0 1 s s s s s P P 1 - 1 0 u u - - - d d d d d if (!Pu) Rd=asrh(Rs)
0 1 1 1 0 0 0 0 0 0 1 s s s s s P P 1 - 1 1 u u - - - d d d d d if (!Pu.new) Rd=asrh(Rs)
Conditional combine
If the least-significant bit of predicate Pu is set, the most-significant word of destination Rdd is
taken from the first source register Rs, while the least-significant word is taken from the second
source register Rt. If the predicate is false, this instruction does nothing.
Syntax Behavior
if ([!]Pu[.new]) if ([!]Pu[.new][0]) {
Rdd=combine(Rs,Rt) Rdd.w[0]=Rt;
Rdd.w[1]=Rs;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 1 1 1 0 1 0 0 0 s s s s s P P 0 t t t t t 1 u u d d d d d if (!Pu) Rdd=combine(Rs,Rt)
1 1 1 1 1 1 0 1 0 0 0 s s s s s P P 1 t t t t t 0 u u d d d d d if (Pu.new) Rdd=combine(Rs,Rt)
1 1 1 1 1 1 0 1 0 0 0 s s s s s P P 1 t t t t t 1 u u d d d d d if (!Pu.new) Rdd=combine(Rs,Rt)
Syntax Behavior
if ([!]Pu[.new]) if([!]Pu[.new][0]){
Rd=and(Rs,Rt) Rd=Rs&Rt;
} else {
NOP;
}
if ([!]Pu[.new]) if([!]Pu[.new][0]){
Rd=or(Rs,Rt) Rd=Rs|Rt;
} else {
NOP;
}
if ([!]Pu[.new]) if([!]Pu[.new][0]){
Rd=xor(Rs,Rt) Rd=Rs^Rt;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS P MajOp MinOp s5 Parse DN t5 PS u2 d5
1 1 1 1 1 0 0 1 - 0 0 s s s s s P P 0 t t t t t 0 u u d d d d d if (Pu) Rd=and(Rs,Rt)
1 1 1 1 1 0 0 1 - 0 0 s s s s s P P 0 t t t t t 1 u u d d d d d if (!Pu) Rd=and(Rs,Rt)
1 1 1 1 1 0 0 1 - 0 0 s s s s s P P 1 t t t t t 0 u u d d d d d if (Pu.new) Rd=and(Rs,Rt)
1 1 1 1 1 0 0 1 - 0 0 s s s s s P P 1 t t t t t 1 u u d d d d d if (!Pu.new) Rd=and(Rs,Rt)
1 1 1 1 1 0 0 1 - 0 1 s s s s s P P 0 t t t t t 0 u u d d d d d if (Pu) Rd=or(Rs,Rt)
1 1 1 1 1 0 0 1 - 0 1 s s s s s P P 0 t t t t t 1 u u d d d d d if (!Pu) Rd=or(Rs,Rt)
1 1 1 1 1 0 0 1 - 0 1 s s s s s P P 1 t t t t t 0 u u d d d d d if (Pu.new) Rd=or(Rs,Rt)
1 1 1 1 1 0 0 1 - 0 1 s s s s s P P 1 t t t t t 1 u u d d d d d if (!Pu.new) Rd=or(Rs,Rt)
1 1 1 1 1 0 0 1 - 1 1 s s s s s P P 0 t t t t t 0 u u d d d d d if (Pu) Rd=xor(Rs,Rt)
1 1 1 1 1 0 0 1 - 1 1 s s s s s P P 0 t t t t t 1 u u d d d d d if (!Pu) Rd=xor(Rs,Rt)
1 1 1 1 1 0 0 1 - 1 1 s s s s s P P 1 t t t t t 0 u u d d d d d if (Pu.new) Rd=xor(Rs,Rt)
1 1 1 1 1 0 0 1 - 1 1 s s s s s P P 1 t t t t t 1 u u d d d d d if (!Pu.new) Rd=xor(Rs,Rt)
Conditional subtract
If the least-significant bit of predicate Pu is set, subtract a 32-bit source register Rt from register
Rs. The result is placed in a 32-bit destination register. If the predicate is false, the instruction
does nothing.
Syntax Behavior
if ([!]Pu[.new]) if([!]Pu[.new][0]){
Rd=sub(Rt,Rs) Rd=Rt-Rs;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Syntax Behavior
if ([!]Pu[.new]) Rd=sxtb(Rs) if([!]Pu[.new][0]){
Rd=sxt8->32(Rs);
} else {
NOP;
}
if ([!]Pu[.new]) Rd=sxth(Rs) if([!]Pu[.new][0]){
Rd=sxt16->32(Rs);
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse C S dn u2 d5
0 1 1 1 0 0 0 0 1 0 1 s s s s s P P 1 - 0 0 u u - - - d d d d d if (Pu) Rd=sxtb(Rs)
0 1 1 1 0 0 0 0 1 0 1 s s s s s P P 1 - 0 1 u u - - - d d d d d if (Pu.new) Rd=sxtb(Rs)
0 1 1 1 0 0 0 0 1 0 1 s s s s s P P 1 - 1 0 u u - - - d d d d d if (!Pu) Rd=sxtb(Rs)
0 1 1 1 0 0 0 0 1 0 1 s s s s s P P 1 - 1 1 u u - - - d d d d d if (!Pu.new) Rd=sxtb(Rs)
0 1 1 1 0 0 0 0 1 1 1 s s s s s P P 1 - 0 0 u u - - - d d d d d if (Pu) Rd=sxth(Rs)
0 1 1 1 0 0 0 0 1 1 1 s s s s s P P 1 - 0 1 u u - - - d d d d d if (Pu.new) Rd=sxth(Rs)
0 1 1 1 0 0 0 0 1 1 1 s s s s s P P 1 - 1 0 u u - - - d d d d d if (!Pu) Rd=sxth(Rs)
0 1 1 1 0 0 0 0 1 1 1 s s s s s P P 1 - 1 1 u u - - - d d d d d if (!Pu.new) Rd=sxth(Rs)
Conditional transfer
If the LSB of predicate Pu is set, transfer register Rs or a signed immediate into destination Rd. If
the predicate is false, this instruction does nothing.
Syntax Behavior
if ([!]Pu[.new]) apply_extension(#s);
Rd=#s12 if ([!]Pu[.new][0]) Rd=#s;
else NOP;
if ([!]Pu[.new]) Rd=Rs Assembler mapped to: "if ([!]Pu[.new]) Rd=add(Rs,#0)"
if ([!]Pu[.new]) Assembler mapped to: "if ([!]Pu[.new])
Rdd=Rss Rdd=combine(Rss.H32,Rss.L32)"
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp PS u2 Parse DN d5
0 1 1 1 1 1 1 0 0 u u 0 i i i i P P 0 i i i i i i i i d d d d d if (Pu) Rd=#s12
0 1 1 1 1 1 1 0 0 u u 0 i i i i P P 1 i i i i i i i i d d d d d if (Pu.new) Rd=#s12
0 1 1 1 1 1 1 0 1 u u 0 i i i i P P 0 i i i i i i i i d d d d d if (!Pu) Rd=#s12
0 1 1 1 1 1 1 0 1 u u 0 i i i i P P 1 i i i i i i i i d d d d d if (!Pu.new) Rd=#s12
Syntax Behavior
if ([!]Pu[.new]) Rd=zxtb(Rs) if([!]Pu[.new][0]){
Rd=zxt8->32(Rs);
} else {
NOP;
}
if ([!]Pu[.new]) Rd=zxth(Rs) if([!]Pu[.new][0]){
Rd=zxt16->32(Rs);
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse C S dn u2 d5
0 1 1 1 0 0 0 0 1 0 0 s s s s s P P 1 - 0 0 u u - - - d d d d d if (Pu) Rd=zxtb(Rs)
0 1 1 1 0 0 0 0 1 0 0 s s s s s P P 1 - 0 1 u u - - - d d d d d if (Pu.new) Rd=zxtb(Rs)
0 1 1 1 0 0 0 0 1 0 0 s s s s s P P 1 - 1 0 u u - - - d d d d d if (!Pu) Rd=zxtb(Rs)
0 1 1 1 0 0 0 0 1 0 0 s s s s s P P 1 - 1 1 u u - - - d d d d d if (!Pu.new) Rd=zxtb(Rs)
0 1 1 1 0 0 0 0 1 1 0 s s s s s P P 1 - 0 0 u u - - - d d d d d if (Pu) Rd=zxth(Rs)
0 1 1 1 0 0 0 0 1 1 0 s s s s s P P 1 - 0 1 u u - - - d d d d d if (Pu.new) Rd=zxth(Rs)
0 1 1 1 0 0 0 0 1 1 0 s s s s s P P 1 - 1 0 u u - - - d d d d d if (!Pu) Rd=zxth(Rs)
0 1 1 1 0 0 0 0 1 1 0 s s s s s P P 1 - 1 1 u u - - - d d d d d if (!Pu.new) Rd=zxth(Rs)
Compare
The register form compares two 32-bit registers for equality, signed greater-than, or unsigned
greater-than. The immediate form compares a register against a signed or unsigned immediate value.
The 8-bit predicate register Pd is set to all 1's or all 0's depending on the result. For 64-bit
versions of this instruction, see the XTYPE compare instructions.
Syntax Behavior
Pd=[!]cmp.eq(Rs,#s10) apply_extension(#s);
Pd=Rs[!]=#s ? 0xff : 0x00;
Pd=[!]cmp.eq(Rs,Rt) Pd=Rs[!]=Rt ? 0xff : 0x00;
Pd=[!]cmp.gt(Rs,#s10) apply_extension(#s);
Pd=[!](Rs>#s) ? 0xff : 0x00;
Pd=[!]cmp.gt(Rs,Rt) Pd=[!](Rs>Rt) ? 0xff : 0x00;
Pd=[!]cmp.gtu(Rs,#u9) apply_extension(#u);
Pd=[!](Rs.uw[0]>#u.uw[0]) ? 0xff : 0x00;
Pd=[!]cmp.gtu(Rs,Rt) Pd=[!](Rs.uw[0]>Rt.uw[0]) ? 0xff : 0x00;
Pd=cmp.ge(Rs,#s8) Assembler mapped to: "Pd=cmp.gt(Rs,#s8-1)"
Pd=cmp.geu(Rs,#u8) if ("#u8==0") {
Assembler mapped to:
"Pd=cmp.eq(Rs,Rs)";
} else {
Assembler mapped to:
"Pd=cmp.gtu(Rs,#u8-1)";
}
Pd=cmp.lt(Rs,Rt) Assembler mapped to: "Pd=cmp.gt(Rt,Rs)"
Pd=cmp.ltu(Rs,Rt) Assembler mapped to: "Pd=cmp.gtu(Rt,Rs)"
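The compare forms and their assembler mappings can be sketched in Python. The sketch assumes, as the mnemonics suggest, that cmp.gt is a strict signed greater-than, that cmp.gtu is its unsigned counterpart, and that the ! prefix inverts the result. Registers are unsigned 32-bit integers and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def pred(flag, negate=False):
    """Return the 8-bit predicate value: all 1's or all 0's."""
    return 0xFF if flag != negate else 0x00

def cmp_eq(rs, rt, negate=False):
    return pred((rs & MASK32) == (rt & MASK32), negate)

def cmp_gt(rs, rt, negate=False):
    return pred(to_signed32(rs) > to_signed32(rt), negate)

def cmp_gtu(rs, rt, negate=False):
    return pred((rs & MASK32) > (rt & MASK32), negate)

def cmp_ge_imm(rs, imm):
    """Pd=cmp.ge(Rs,#s8), mapped to Pd=cmp.gt(Rs,#s8-1)."""
    return cmp_gt(rs, imm - 1)

def cmp_lt(rs, rt):
    """Pd=cmp.lt(Rs,Rt), mapped to Pd=cmp.gt(Rt,Rs)."""
    return cmp_gt(rt, rs)
```

The same bit pattern can compare differently in the two orderings: 0xffffffff is -1 for cmp.gt but the largest value for cmp.gtu.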
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse d2
0 1 1 1 0 1 0 1 0 0 i s s s s s P P i i i i i i i i i 0 0 0 d d Pd=cmp.eq(Rs,#s10)
0 1 1 1 0 1 0 1 0 0 i s s s s s P P i i i i i i i i i 1 0 0 d d Pd=!cmp.eq(Rs,#s10)
0 1 1 1 0 1 0 1 0 1 i s s s s s P P i i i i i i i i i 0 0 0 d d Pd=cmp.gt(Rs,#s10)
0 1 1 1 0 1 0 1 0 1 i s s s s s P P i i i i i i i i i 1 0 0 d d Pd=!cmp.gt(Rs,#s10)
0 1 1 1 0 1 0 1 1 0 0 s s s s s P P i i i i i i i i i 0 0 0 d d Pd=cmp.gtu(Rs,#u9)
0 1 1 1 0 1 0 1 1 0 0 s s s s s P P i i i i i i i i i 1 0 0 d d Pd=!cmp.gtu(Rs,#u9)
ICLASS P MajOp MinOp s5 Parse t5 d2
1 1 1 1 0 0 1 0 - 0 0 s s s s s P P - t t t t t - - - 0 0 0 d d Pd=cmp.eq(Rs,Rt)
1 1 1 1 0 0 1 0 - 0 0 s s s s s P P - t t t t t - - - 1 0 0 d d Pd=!cmp.eq(Rs,Rt)
1 1 1 1 0 0 1 0 - 1 0 s s s s s P P - t t t t t - - - 0 0 0 d d Pd=cmp.gt(Rs,Rt)
1 1 1 1 0 0 1 0 - 1 0 s s s s s P P - t t t t t - - - 1 0 0 d d Pd=!cmp.gt(Rs,Rt)
1 1 1 1 0 0 1 0 - 1 1 s s s s s P P - t t t t t - - - 0 0 0 d d Pd=cmp.gtu(Rs,Rt)
1 1 1 1 0 0 1 0 - 1 1 s s s s s P P - t t t t t - - - 1 0 0 d d Pd=!cmp.gtu(Rs,Rt)
Syntax Behavior
Rd=[!]cmp.eq(Rs,#s8) apply_extension(#s);
Rd=(Rs[!]=#s);
Rd=[!]cmp.eq(Rs,Rt) Rd=(Rs[!]=Rt);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Rs MajOp MinOp s5 Parse d5
0 1 1 1 0 0 1 1 - 1 0 s s s s s P P 1 i i i i i i i i d d d d d Rd=cmp.eq(Rs,#s8)
0 1 1 1 0 0 1 1 - 1 1 s s s s s P P 1 i i i i i i i i d d d d d Rd=!cmp.eq(Rs,#s8)
ICLASS P MajOp MinOp s5 Parse t5 d5
1 1 1 1 0 0 1 1 0 1 0 s s s s s P P - t t t t t - - - d d d d d Rd=cmp.eq(Rs,Rt)
1 1 1 1 0 0 1 1 0 1 1 s s s s s P P - t t t t t - - - d d d d d Rd=!cmp.eq(Rs,Rt)
11.2 CR
The CR instruction class includes instructions which manage control registers, including hardware
looping, modulo addressing, and status flags.
CR instructions are executable on slot 3.
Class: N/A
Notes
■ This instruction cannot be grouped in a packet with any program flow instructions.
■ The Next PC value is the address immediately following the last instruction in the packet
containing this instruction.
■ The PC value is the address of the start of the packet.
Syntax Behavior
Pd=[!]fastcorner9(Ps,Pt) PREDUSE_TIMING;
tmp.h[0]=(Ps<<8)|Pt;
tmp.h[1]=(Ps<<8)|Pt;
for (i = 1; i < 9; i++) {
tmp &= tmp >> 1;
}
Pd = tmp == 0 ? 0xff : 0x00;
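The shift-and-AND loop above detects a run of nine consecutive 1 bits in the 16-bit pattern formed from Ps and Pt, treated as a ring because the pattern is duplicated in both halfwords of tmp. The Python sketch below models it under one loudly stated assumption: the tmp == 0 test in the behavior column is taken to describe the negated (!fastcorner9) form, so the plain form reports 0xff when a run is found.

```python
def fastcorner9(ps, pt, negate=False):
    """Model of Pd=[!]fastcorner9(Ps,Pt) on 8-bit predicate values."""
    half = (((ps & 0xFF) << 8) | (pt & 0xFF)) & 0xFFFF
    tmp = (half << 16) | half      # tmp.h[0] and tmp.h[1] hold the same pattern
    for _ in range(1, 9):          # eight shift-and-AND steps
        tmp &= tmp >> 1            # bit k survives only if bits k..k+8 were all 1
    found = tmp != 0               # assumption: plain form is true when a run exists
    return 0xFF if found != negate else 0x00
```

After eight steps a bit remains set only where nine adjacent bits were originally set, which is why the loop runs from 1 to 8.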
Notes
■ This instruction may execute on either slot 2 or slot 3, even though it is a CR-type instruction.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm s2 Parse t2 d2
0 1 1 0 1 0 1 1 0 0 0 0 - - s s P P 1 - - - t t 1 - - 1 - - d d Pd=fastcorner9(Ps,Pt)
0 1 1 0 1 0 1 1 0 0 0 1 - - s s P P 1 - - - t t 1 - - 1 - - d d Pd=!fastcorner9(Ps,Pt)
Syntax Behavior
Pd=all8(Ps) PREDUSE_TIMING;
Pd = (Ps == 0xff ? 0xff : 0x00);
Pd=any8(Ps) PREDUSE_TIMING;
Pd = (Ps ? 0xff : 0x00);
Notes
■ This instruction may execute on either slot 2 or slot 3, even though it is a CR-type instruction.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm s2 Parse d2
0 1 1 0 1 0 1 1 1 0 0 0 - - s s P P 0 - - - - - - - - - - - d d Pd=any8(Ps)
0 1 1 0 1 0 1 1 1 0 1 0 - - s s P P 0 - - - - - - - - - - - d d Pd=all8(Ps)
Looping instructions
loopN is a single instruction that sets up a hardware loop. The N in the instruction name
indicates the set of loop registers to use. Loop0 is the innermost loop, while loop1 is the outer
loop. The loopN instruction first sets the start address (SA) register: the PC-relative immediate
is added to the PC and the sum is stored in SA. It then sets the loop count (LC) register to either
an unsigned immediate or a register value.
Syntax Behavior
loop0(#r7:2,#U10) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA0=PC+#r;
LC0=#U;
USR.LPCFG=0;
loop0(#r7:2,Rs) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA0=PC+#r;
LC0=Rs;
USR.LPCFG=0;
loop1(#r7:2,#U10) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA1=PC+#r;
LC1=#U;
loop1(#r7:2,Rs) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA1=PC+#r;
LC1=Rs;
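The register updates performed by a loopN setup can be sketched in Python. The sketch assumes PCALIGN_MASK is 0x3, which matches the #r7:2 notation (the low two bits of the offset are not encoded) but is not stated in the behavior column. The function name is hypothetical, and USR.LPCFG is not modeled.

```python
MASK32 = 0xFFFFFFFF
PCALIGN_MASK = 0x3   # assumption: offsets are multiples of 4 bytes

def loop_setup(pc, offset, count):
    """Return (SA, LC) as set by loopN(#r7:2, count)."""
    offset &= ~PCALIGN_MASK          # drop the low two bits of the offset
    sa = (pc + offset) & MASK32      # start address is PC-relative
    lc = count & MASK32              # loop count from immediate or register
    return sa, lc

sa0, lc0 = loop_setup(0x1000, 0x10, 8)
```

A negative offset places the loop start before the packet that contains the loopN instruction.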
Class: CR (slot 3)
Notes
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm s5 Parse
0 1 1 0 0 0 0 0 0 0 0 s s s s s P P - i i i i i - - - i i - - - loop0(#r7:2,Rs)
0 1 1 0 0 0 0 0 0 0 1 s s s s s P P - i i i i i - - - i i - - - loop1(#r7:2,Rs)
ICLASS sm Parse
0 1 1 0 1 0 0 1 0 0 0 I I I I I P P - i i i i i I I I i i - I I loop0(#r7:2,#U10)
0 1 1 0 1 0 0 1 0 0 1 I I I I I P P - i i i i i I I I i i - I I loop1(#r7:2,#U10)
Add to PC
Add an immediate value to the program counter (PC) and place the result in a destination register.
This instruction is typically used with a constant extender to add a 32-bit immediate value to PC.
Syntax Behavior
Rd=add(pc,#u6) Rd=PC+apply_extension(#u);
Class: CR (slot 3)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm Parse d5
0 1 1 0 1 0 1 0 0 1 0 0 1 0 0 1 P P - i i i i i i - - d d d d d Rd=add(pc,#u6)
Syntax Behavior
p3=sp1loop0(#r7:2,#U10) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA0=PC+#r;
LC0=#U;
USR.LPCFG=1;
P3=0;
p3=sp1loop0(#r7:2,Rs) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA0=PC+#r;
LC0=Rs;
USR.LPCFG=1;
P3=0;
p3=sp2loop0(#r7:2,#U10) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA0=PC+#r;
LC0=#U;
USR.LPCFG=2;
P3=0;
p3=sp2loop0(#r7:2,Rs) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA0=PC+#r;
LC0=Rs;
USR.LPCFG=2;
P3=0;
p3=sp3loop0(#r7:2,#U10) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA0=PC+#r;
LC0=#U;
USR.LPCFG=3;
P3=0;
p3=sp3loop0(#r7:2,Rs) apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
SA0=PC+#r;
LC0=Rs;
USR.LPCFG=3;
P3=0;
Class: CR (slot 3)
Notes
■ The predicate generated by this instruction cannot be used as a .new predicate, nor can it be
automatically ANDed with another predicate.
■ This instruction cannot execute in the last address of a hardware loop.
■ The Next PC value is the address immediately following the last instruction in the packet
containing this instruction.
■ The PC value is the address of the start of the packet.
■ A PC-relative address is formed by taking the decoded immediate value and adding it to the
current PC value.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm s5 Parse
0 1 1 0 0 0 0 0 1 0 1 s s s s s P P - i i i i i - - - i i - - - p3=sp1loop0(#r7:2,Rs)
0 1 1 0 0 0 0 0 1 1 0 s s s s s P P - i i i i i - - - i i - - - p3=sp2loop0(#r7:2,Rs)
0 1 1 0 0 0 0 0 1 1 1 s s s s s P P - i i i i i - - - i i - - - p3=sp3loop0(#r7:2,Rs)
ICLASS sm Parse
0 1 1 0 1 0 0 1 1 0 1 I I I I I P P - i i i i i I I I i i - I I p3=sp1loop0(#r7:2,#U10)
0 1 1 0 1 0 0 1 1 1 0 I I I I I P P - i i i i i I I I i i - I I p3=sp2loop0(#r7:2,#U10)
0 1 1 0 1 0 0 1 1 1 1 I I I I I P P - i i i i i I I I i i - I I p3=sp3loop0(#r7:2,#U10)
Syntax Behavior
Pd=Ps Assembler mapped to: "Pd=or(Ps,Ps)"
Pd=and(Ps,and(Pt,[!]Pu)) PREDUSE_TIMING;
Pd = Ps & Pt & ([~]Pu);
Pd=and(Ps,or(Pt,[!]Pu)) PREDUSE_TIMING;
Pd = Ps & (Pt | ([~]Pu));
Pd=and(Pt,[!]Ps) PREDUSE_TIMING;
Pd=Pt & ([~]Ps);
Pd=not(Ps) PREDUSE_TIMING;
Pd=~Ps;
Pd=or(Ps,and(Pt,[!]Pu)) PREDUSE_TIMING;
Pd = Ps | (Pt & ([~]Pu));
Pd=or(Ps,or(Pt,[!]Pu)) PREDUSE_TIMING;
Pd = Ps | Pt | ([~]Pu);
Pd=or(Pt,[!]Ps) PREDUSE_TIMING;
Pd=Pt | ([~]Ps);
Pd=xor(Ps,Pt) PREDUSE_TIMING;
Pd=Ps ^ Pt;
Notes
■ This instruction may execute on either slot 2 or slot 3, even though it is a CR-type instruction.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm s2 Parse t2 d2
0 1 1 0 1 0 1 1 0 0 0 0 - - s s P P 0 - - - t t - - - - - - d d Pd=and(Pt,Ps)
ICLASS sm s2 Parse t2 u2 d2
0 1 1 0 1 0 1 1 0 0 0 1 - - s s P P 0 - - - t t u u - - - - d d Pd=and(Ps,and(Pt,Pu))
ICLASS sm s2 Parse t2 d2
0 1 1 0 1 0 1 1 0 0 1 0 - - s s P P 0 - - - t t - - - - - - d d Pd=or(Pt,Ps)
ICLASS sm s2 Parse t2 u2 d2
0 1 1 0 1 0 1 1 0 0 1 1 - - s s P P 0 - - - t t u u - - - - d d Pd=and(Ps,or(Pt,Pu))
ICLASS sm s2 Parse t2 d2
0 1 1 0 1 0 1 1 0 1 0 0 - - s s P P 0 - - - t t - - - - - - d d Pd=xor(Ps,Pt)
ICLASS sm s2 Parse t2 u2 d2
0 1 1 0 1 0 1 1 0 1 0 1 - - s s P P 0 - - - t t u u - - - - d d Pd=or(Ps,and(Pt,Pu))
ICLASS sm s2 Parse t2 d2
0 1 1 0 1 0 1 1 0 1 1 0 - - s s P P 0 - - - t t - - - - - - d d Pd=and(Pt,!Ps)
ICLASS sm s2 Parse t2 u2 d2
0 1 1 0 1 0 1 1 0 1 1 1 - - s s P P 0 - - - t t u u - - - - d d Pd=or(Ps,or(Pt,Pu))
0 1 1 0 1 0 1 1 1 0 0 1 - - s s P P 0 - - - t t u u - - - - d d Pd=and(Ps,and(Pt,!Pu))
0 1 1 0 1 0 1 1 1 0 1 1 - - s s P P 0 - - - t t u u - - - - d d Pd=and(Ps,or(Pt,!Pu))
ICLASS sm s2 Parse d2
0 1 1 0 1 0 1 1 1 1 0 0 - - s s P P 0 - - - - - - - - - - - d d Pd=not(Ps)
ICLASS sm s2 Parse t2 u2 d2
0 1 1 0 1 0 1 1 1 1 0 1 - - s s P P 0 - - - t t u u - - - - d d Pd=or(Ps,and(Pt,!Pu))
ICLASS sm s2 Parse t2 d2
0 1 1 0 1 0 1 1 1 1 1 0 - - s s P P 0 - - - t t - - - - - - d d Pd=or(Pt,!Ps)
ICLASS sm s2 Parse t2 u2 d2
0 1 1 0 1 0 1 1 1 1 1 1 - - s s P P 0 - - - t t u u - - - - d d Pd=or(Ps,or(Pt,!Pu))
Register field encoding   Register     Register field encoding   Register
1                         LC0          9                         PC
2                         SA1          10                        UGP
3                         LC1          11                        GP
4                         P3:0         12                        CS0
5                         Reserved     13                        CS1
6                         M0           14                        UPCYCLELO
7                         M1           15                        UPCYCLEHI
Figure 11-1 User control registers and their register field encodings
Syntax Behavior
Cd=Rs Cd=Rs;
Cdd=Rss Cdd=Rss;
Rd=Cs Rd=Cs;
Rdd=Css Rdd=Css;
Class: CR (slot 3)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm s5 Parse d5
0 1 1 0 0 0 1 0 0 0 1 s s s s s P P - - - - - - - - - d d d d d Cd=Rs
0 1 1 0 0 0 1 1 0 0 1 s s s s s P P - - - - - - - - - d d d d d Cdd=Rss
0 1 1 0 1 0 0 0 0 0 0 s s s s s P P - - - - - - - - - d d d d d Rdd=Css
0 1 1 0 1 0 1 0 0 0 0 s s s s s P P - - - - - - - - - d d d d d Rd=Cs
11.3 JR
The JR instruction class includes instructions to change the program flow to a new location
contained in a register.
JR instructions are executable on slot 2.
Class: JR (slot 2)
Notes
■ This instruction can conditionally execute based on the value of a predicate register. If the
instruction is preceded by 'if Pn', the instruction only executes if the least-significant bit of the
predicate register is 1. Similarly, if the instruction is preceded by 'if !Pn', the instruction
executes only if the least-significant bit of Pn is 0.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse
0 1 0 1 0 0 0 0 1 0 1 s s s s s P P - - - - - - - - - - - - - - callr Rs
0 1 0 1 0 0 0 0 1 1 0 s s s s s P P - - - - - - - - - - - - - - callrh Rs
ICLASS s5 Parse u2
0 1 0 1 0 0 0 1 0 0 0 s s s s s P P - - - - u u - - - - - - - - if (Pu) callr Rs
0 1 0 1 0 0 0 1 0 0 1 s s s s s P P - - - - u u - - - - - - - - if (!Pu) callr Rs
Syntax Behavior
callrh Rs LR=NPC;
PC=Rs;
Class: JR (slot 2)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse
0 1 0 1 0 0 0 0 1 1 0 s s s s s P P - - - - - - - - - - - - - - callrh Rs
Syntax Behavior
hintjr(Rs) ;
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse
0 1 0 1 0 0 1 0 1 0 1 s s s s s P P - - - - - - - - - - - - - - hintjr(Rs)
Syntax Behavior
if ([!]Pu) jumpr Rs Assembler mapped to: "if ([!]Pu) jumpr:nt Rs"
if ([!]Pu[.new]) jumpr:<hint> Rs if ([!]Pu[.new][0]) {
PC=Rs;
}
jumpr Rs PC=Rs;
jumprh Rs PC=Rs;
Class: JR (slot 2)
Notes
■ This instruction can conditionally execute based on the value of a predicate register. If the
instruction is preceded by 'if Pn', the instruction only executes if the least-significant bit of the
predicate register is 1. Similarly, if the instruction is preceded by 'if !Pn', the instruction
executes only if the least-significant bit of Pn is 0.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse
0 1 0 1 0 0 1 0 1 0 0 s s s s s P P - - - - - - - - - - - - - - jumpr Rs
0 1 0 1 0 0 1 0 1 1 0 s s s s s P P - - - - - - - - - - - - - - jumprh Rs
ICLASS s5 Parse u2
0 1 0 1 0 0 1 1 0 1 0 s s s s s P P - 0 0 - u u - - - - - - - - if (Pu) jumpr:nt Rs
0 1 0 1 0 0 1 1 0 1 0 s s s s s P P - 0 1 - u u - - - - - - - - if (Pu.new) jumpr:nt Rs
0 1 0 1 0 0 1 1 0 1 0 s s s s s P P - 1 0 - u u - - - - - - - - if (Pu) jumpr:t Rs
0 1 0 1 0 0 1 1 0 1 0 s s s s s P P - 1 1 - u u - - - - - - - - if (Pu.new) jumpr:t Rs
0 1 0 1 0 0 1 1 0 1 1 s s s s s P P - 0 0 - u u - - - - - - - - if (!Pu) jumpr:nt Rs
0 1 0 1 0 0 1 1 0 1 1 s s s s s P P - 0 1 - u u - - - - - - - - if (!Pu.new) jumpr:nt Rs
0 1 0 1 0 0 1 1 0 1 1 s s s s s P P - 1 0 - u u - - - - - - - - if (!Pu) jumpr:t Rs
0 1 0 1 0 0 1 1 0 1 1 s s s s s P P - 1 1 - u u - - - - - - - - if (!Pu.new) jumpr:t Rs
Syntax Behavior
jumprh Rs PC=Rs;
Class: JR (slot 2)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse
0 1 0 1 0 0 1 0 1 1 0 s s s s s P P - - - - - - - - - - - - - - jumprh Rs
11.4 J
The J instruction class includes branch instructions (jumps and calls) that obtain the target
address from a (PC-relative) immediate address value.
J instructions are executable on slot 2 and slot 3.
Call subroutine
Change the program flow to a subroutine. This instruction first transfers the next program
counter (NPC) value into the link register, and then jumps to the target address.
This instruction can appear in slots 2 or 3.
Syntax Behavior
call #r22:2 apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
LR=NPC;
PC=PC+#r;
if ([!]Pu) call #r15:2 apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
if ([!]Pu[0]) {
LR=NPC;
PC=PC+#r;
}
Notes
■ This instruction can conditionally execute based on the value of a predicate register. If the
instruction is preceded by 'if Pn', the instruction only executes if the least-significant bit of the
predicate register is 1. Similarly, if the instruction is preceded by 'if !Pn', the instruction is
executed only if the least-significant bit of Pn is 0.
■ The Next PC value is the address immediately following the last instruction in the packet
containing this instruction.
■ The PC value is the address of the start of the packet.
■ A PC-relative address is formed by taking the decoded immediate value and adding it to the
current PC value.
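The notes above can be sketched as a small model of the unconditional `call #r22:2` behavior. This is a hedged illustration, not a definitive implementation: the function name `call`, the 32-bit wrap-around, and the value `PCALIGN_MASK = 0x3` (4-byte packet alignment) are assumptions made for the example, and `offset` stands for the already decoded, sign-extended byte offset (#r).

```python
PCALIGN_MASK = 0x3    # assumption: targets are 4-byte aligned
MASK32 = 0xFFFFFFFF   # assumption: 32-bit program counter


def call(pc, npc, offset):
    """Model `call #r22:2`; returns (new_pc, new_lr).

    pc     -- address of the start of the current packet
    npc    -- address immediately following the current packet
    offset -- decoded, sign-extended PC-relative byte offset (#r)
    """
    offset &= ~PCALIGN_MASK          # #r = #r & ~PCALIGN_MASK
    lr = npc & MASK32                # LR = NPC
    new_pc = (pc + offset) & MASK32  # PC = PC + #r
    return new_pc, lr
```

Note that the link register receives the address after the whole packet, while the target is computed from the address of the start of the packet.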
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Parse
0 1 0 1 1 0 1 i i i i i i i i i P P i i i i i i i i i i i i i 0 call #r22:2
ICLASS Parse DN u2
Syntax Behavior
p[01]=cmp.eq(Rs,#-1); if ([!]p[01].new) jump:<hint> #r9:2 P[01]=(Rs==-1) ? 0xff : 0x00
if ([!]P[01].new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
p[01]=cmp.eq(Rs,#U5); if ([!]p[01].new) jump:<hint> #r9:2 P[01]=(Rs==#U) ? 0xff : 0x00
if ([!]P[01].new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
p[01]=cmp.eq(Rs,Rt); if ([!]p[01].new) jump:<hint> #r9:2 P[01]=(Rs==Rt) ? 0xff : 0x00
if ([!]P[01].new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
p[01]=cmp.gt(Rs,#-1); if ([!]p[01].new) jump:<hint> #r9:2 P[01]=(Rs>-1) ? 0xff : 0x00
if ([!]P[01].new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
p[01]=cmp.gt(Rs,#U5); if ([!]p[01].new) jump:<hint> #r9:2 P[01]=(Rs>#U) ? 0xff : 0x00
if ([!]P[01].new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
p[01]=cmp.gt(Rs,Rt); if ([!]p[01].new) jump:<hint> #r9:2 P[01]=(Rs>Rt) ? 0xff : 0x00
if ([!]P[01].new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
p[01]=cmp.gtu(Rs,#U5); if ([!]p[01].new) jump:<hint> #r9:2 P[01]=(Rs.uw[0]>#U) ? 0xff : 0x00
if ([!]P[01].new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
p[01]=cmp.gtu(Rs,Rt); if ([!]p[01].new) jump:<hint> #r9:2 P[01]=(Rs.uw[0]>Rt) ? 0xff : 0x00
if ([!]P[01].new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
p[01]=tstbit(Rs,#0); if ([!]p[01].new) jump:<hint> #r9:2 P[01]=(Rs & 1) ? 0xff : 0x00
if ([!]P[01].new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
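The compare-and-jump behavior above sets the predicate and tests its newly generated bit 0 in the same packet. The following Python sketch models the `cmp.eq` form only. It is a minimal illustration under stated assumptions: the function name, the `None` return for a jump that is not taken, and `PCALIGN_MASK = 0x3` are choices made for this example and do not come from the manual.

```python
PCALIGN_MASK = 0x3   # assumption: targets are 4-byte aligned
MASK32 = 0xFFFFFFFF  # assumption: 32-bit program counter


def cmp_eq_jump(rs, imm, pc, offset, negate=False):
    """Model `p0=cmp.eq(Rs,#U5); if ([!]p0.new) jump:<hint> #r9:2`.

    Returns (predicate, next_pc); next_pc is None when the jump
    is not taken. The :t/:nt hint does not change the result.
    """
    pred = 0xFF if rs == imm else 0x00   # P = (Rs==#U) ? 0xff : 0x00
    bit0 = pred & 1                      # only bit 0 of the new value is tested
    wanted = 0 if negate else 1
    if bit0 != wanted:
        return pred, None
    offset &= ~PCALIGN_MASK              # #r = #r & ~PCALIGN_MASK
    return pred, (pc + offset) & MASK32  # PC = PC + #r
```

The predicate register is written whether or not the jump is taken, which is why the function always returns the new predicate value.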
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s4 Parse
0 0 0 1 0 0 0 1 1 0 i i s s s s P P 0 - - - 0 0 i i i i i i i - p0=cmp.eq(Rs,#-1); if (p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 1 1 0 i i s s s s P P 0 - - - 0 1 i i i i i i i - p0=cmp.gt(Rs,#-1); if (p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 1 1 0 i i s s s s P P 0 - - - 1 1 i i i i i i i - p0=tstbit(Rs,#0); if (p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 1 1 0 i i s s s s P P 1 - - - 0 0 i i i i i i i - p0=cmp.eq(Rs,#-1); if (p0.new) jump:t #r9:2
0 0 0 1 0 0 0 1 1 0 i i s s s s P P 1 - - - 0 1 i i i i i i i - p0=cmp.gt(Rs,#-1); if (p0.new) jump:t #r9:2
0 0 0 1 0 0 0 1 1 0 i i s s s s P P 1 - - - 1 1 i i i i i i i - p0=tstbit(Rs,#0); if (p0.new) jump:t #r9:2
0 0 0 1 0 0 0 1 1 1 i i s s s s P P 0 - - - 0 0 i i i i i i i - p0=cmp.eq(Rs,#-1); if (!p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 1 1 1 i i s s s s P P 0 - - - 0 1 i i i i i i i - p0=cmp.gt(Rs,#-1); if (!p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 1 1 1 i i s s s s P P 0 - - - 1 1 i i i i i i i - p0=tstbit(Rs,#0); if (!p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 1 1 1 i i s s s s P P 1 - - - 0 0 i i i i i i i - p0=cmp.eq(Rs,#-1); if (!p0.new) jump:t #r9:2
0 0 0 1 0 0 0 1 1 1 i i s s s s P P 1 - - - 0 1 i i i i i i i - p0=cmp.gt(Rs,#-1); if (!p0.new) jump:t #r9:2
0 0 0 1 0 0 0 1 1 1 i i s s s s P P 1 - - - 1 1 i i i i i i i - p0=tstbit(Rs,#0); if (!p0.new) jump:t #r9:2
0 0 0 1 0 0 0 0 0 0 i i s s s s P P 0 I I I I I i i i i i i i - p0=cmp.eq(Rs,#U5); if (p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 0 0 0 i i s s s s P P 1 I I I I I i i i i i i i - p0=cmp.eq(Rs,#U5); if (p0.new) jump:t #r9:2
0 0 0 1 0 0 0 0 0 1 i i s s s s P P 0 I I I I I i i i i i i i - p0=cmp.eq(Rs,#U5); if (!p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 0 0 1 i i s s s s P P 1 I I I I I i i i i i i i - p0=cmp.eq(Rs,#U5); if (!p0.new) jump:t #r9:2
0 0 0 1 0 0 0 0 1 0 i i s s s s P P 0 I I I I I i i i i i i i - p0=cmp.gt(Rs,#U5); if (p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 0 1 0 i i s s s s P P 1 I I I I I i i i i i i i - p0=cmp.gt(Rs,#U5); if (p0.new) jump:t #r9:2
0 0 0 1 0 0 0 0 1 1 i i s s s s P P 0 I I I I I i i i i i i i - p0=cmp.gt(Rs,#U5); if (!p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 0 1 1 i i s s s s P P 1 I I I I I i i i i i i i - p0=cmp.gt(Rs,#U5); if (!p0.new) jump:t #r9:2
0 0 0 1 0 0 0 1 0 0 i i s s s s P P 0 I I I I I i i i i i i i - p0=cmp.gtu(Rs,#U5); if (p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 1 0 0 i i s s s s P P 1 I I I I I i i i i i i i - p0=cmp.gtu(Rs,#U5); if (p0.new) jump:t #r9:2
0 0 0 1 0 0 0 1 0 1 i i s s s s P P 0 I I I I I i i i i i i i - p0=cmp.gtu(Rs,#U5); if (!p0.new) jump:nt #r9:2
0 0 0 1 0 0 0 1 0 1 i i s s s s P P 1 I I I I I i i i i i i i - p0=cmp.gtu(Rs,#U5); if (!p0.new) jump:t #r9:2
0 0 0 1 0 0 1 1 1 0 i i s s s s P P 0 - - - 0 0 i i i i i i i - p1=cmp.eq(Rs,#-1); if (p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 1 1 0 i i s s s s P P 0 - - - 0 1 i i i i i i i - p1=cmp.gt(Rs,#-1); if (p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 1 1 0 i i s s s s P P 0 - - - 1 1 i i i i i i i - p1=tstbit(Rs,#0); if (p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 1 1 0 i i s s s s P P 1 - - - 0 0 i i i i i i i - p1=cmp.eq(Rs,#-1); if (p1.new) jump:t #r9:2
0 0 0 1 0 0 1 1 1 0 i i s s s s P P 1 - - - 0 1 i i i i i i i - p1=cmp.gt(Rs,#-1); if (p1.new) jump:t #r9:2
0 0 0 1 0 0 1 1 1 0 i i s s s s P P 1 - - - 1 1 i i i i i i i - p1=tstbit(Rs,#0); if (p1.new) jump:t #r9:2
0 0 0 1 0 0 1 1 1 1 i i s s s s P P 0 - - - 0 0 i i i i i i i - p1=cmp.eq(Rs,#-1); if (!p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 1 1 1 i i s s s s P P 0 - - - 0 1 i i i i i i i - p1=cmp.gt(Rs,#-1); if (!p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 1 1 1 i i s s s s P P 0 - - - 1 1 i i i i i i i - p1=tstbit(Rs,#0); if (!p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 1 1 1 i i s s s s P P 1 - - - 0 0 i i i i i i i - p1=cmp.eq(Rs,#-1); if (!p1.new) jump:t #r9:2
0 0 0 1 0 0 1 1 1 1 i i s s s s P P 1 - - - 0 1 i i i i i i i - p1=cmp.gt(Rs,#-1); if (!p1.new) jump:t #r9:2
0 0 0 1 0 0 1 1 1 1 i i s s s s P P 1 - - - 1 1 i i i i i i i - p1=tstbit(Rs,#0); if (!p1.new) jump:t #r9:2
0 0 0 1 0 0 1 0 0 0 i i s s s s P P 0 I I I I I i i i i i i i - p1=cmp.eq(Rs,#U5); if (p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 0 0 0 i i s s s s P P 1 I I I I I i i i i i i i - p1=cmp.eq(Rs,#U5); if (p1.new) jump:t #r9:2
0 0 0 1 0 0 1 0 0 1 i i s s s s P P 0 I I I I I i i i i i i i - p1=cmp.eq(Rs,#U5); if (!p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 0 0 1 i i s s s s P P 1 I I I I I i i i i i i i - p1=cmp.eq(Rs,#U5); if (!p1.new) jump:t #r9:2
0 0 0 1 0 0 1 0 1 0 i i s s s s P P 0 I I I I I i i i i i i i - p1=cmp.gt(Rs,#U5); if (p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 0 1 0 i i s s s s P P 1 I I I I I i i i i i i i - p1=cmp.gt(Rs,#U5); if (p1.new) jump:t #r9:2
0 0 0 1 0 0 1 0 1 1 i i s s s s P P 0 I I I I I i i i i i i i - p1=cmp.gt(Rs,#U5); if (!p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 0 1 1 i i s s s s P P 1 I I I I I i i i i i i i - p1=cmp.gt(Rs,#U5); if (!p1.new) jump:t #r9:2
0 0 0 1 0 0 1 1 0 0 i i s s s s P P 0 I I I I I i i i i i i i - p1=cmp.gtu(Rs,#U5); if (p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 1 0 0 i i s s s s P P 1 I I I I I i i i i i i i - p1=cmp.gtu(Rs,#U5); if (p1.new) jump:t #r9:2
0 0 0 1 0 0 1 1 0 1 i i s s s s P P 0 I I I I I i i i i i i i - p1=cmp.gtu(Rs,#U5); if (!p1.new) jump:nt #r9:2
0 0 0 1 0 0 1 1 0 1 i i s s s s P P 1 I I I I I i i i i i i i - p1=cmp.gtu(Rs,#U5); if (!p1.new) jump:t #r9:2
ICLASS s4 Parse t4
0 0 0 1 0 1 0 0 0 0 i i s s s s P P 0 0 t t t t i i i i i i i - p0=cmp.eq(Rs,Rt); if (p0.new) jump:nt #r9:2
0 0 0 1 0 1 0 0 0 0 i i s s s s P P 0 1 t t t t i i i i i i i - p1=cmp.eq(Rs,Rt); if (p1.new) jump:nt #r9:2
0 0 0 1 0 1 0 0 0 0 i i s s s s P P 1 0 t t t t i i i i i i i - p0=cmp.eq(Rs,Rt); if (p0.new) jump:t #r9:2
0 0 0 1 0 1 0 0 0 0 i i s s s s P P 1 1 t t t t i i i i i i i - p1=cmp.eq(Rs,Rt); if (p1.new) jump:t #r9:2
0 0 0 1 0 1 0 0 0 1 i i s s s s P P 0 0 t t t t i i i i i i i - p0=cmp.eq(Rs,Rt); if (!p0.new) jump:nt #r9:2
0 0 0 1 0 1 0 0 0 1 i i s s s s P P 0 1 t t t t i i i i i i i - p1=cmp.eq(Rs,Rt); if (!p1.new) jump:nt #r9:2
0 0 0 1 0 1 0 0 0 1 i i s s s s P P 1 0 t t t t i i i i i i i - p0=cmp.eq(Rs,Rt); if (!p0.new) jump:t #r9:2
0 0 0 1 0 1 0 0 0 1 i i s s s s P P 1 1 t t t t i i i i i i i - p1=cmp.eq(Rs,Rt); if (!p1.new) jump:t #r9:2
0 0 0 1 0 1 0 0 1 0 i i s s s s P P 0 0 t t t t i i i i i i i - p0=cmp.gt(Rs,Rt); if (p0.new) jump:nt #r9:2
0 0 0 1 0 1 0 0 1 0 i i s s s s P P 0 1 t t t t i i i i i i i - p1=cmp.gt(Rs,Rt); if (p1.new) jump:nt #r9:2
0 0 0 1 0 1 0 0 1 0 i i s s s s P P 1 0 t t t t i i i i i i i - p0=cmp.gt(Rs,Rt); if (p0.new) jump:t #r9:2
0 0 0 1 0 1 0 0 1 0 i i s s s s P P 1 1 t t t t i i i i i i i - p1=cmp.gt(Rs,Rt); if (p1.new) jump:t #r9:2
0 0 0 1 0 1 0 0 1 1 i i s s s s P P 0 0 t t t t i i i i i i i - p0=cmp.gt(Rs,Rt); if (!p0.new) jump:nt #r9:2
0 0 0 1 0 1 0 0 1 1 i i s s s s P P 0 1 t t t t i i i i i i i - p1=cmp.gt(Rs,Rt); if (!p1.new) jump:nt #r9:2
0 0 0 1 0 1 0 0 1 1 i i s s s s P P 1 0 t t t t i i i i i i i - p0=cmp.gt(Rs,Rt); if (!p0.new) jump:t #r9:2
0 0 0 1 0 1 0 0 1 1 i i s s s s P P 1 1 t t t t i i i i i i i - p1=cmp.gt(Rs,Rt); if (!p1.new) jump:t #r9:2
0 0 0 1 0 1 0 1 0 0 i i s s s s P P 0 0 t t t t i i i i i i i - p0=cmp.gtu(Rs,Rt); if (p0.new) jump:nt #r9:2
0 0 0 1 0 1 0 1 0 0 i i s s s s P P 0 1 t t t t i i i i i i i - p1=cmp.gtu(Rs,Rt); if (p1.new) jump:nt #r9:2
0 0 0 1 0 1 0 1 0 0 i i s s s s P P 1 0 t t t t i i i i i i i - p0=cmp.gtu(Rs,Rt); if (p0.new) jump:t #r9:2
0 0 0 1 0 1 0 1 0 0 i i s s s s P P 1 1 t t t t i i i i i i i - p1=cmp.gtu(Rs,Rt); if (p1.new) jump:t #r9:2
0 0 0 1 0 1 0 1 0 1 i i s s s s P P 0 0 t t t t i i i i i i i - p0=cmp.gtu(Rs,Rt); if (!p0.new) jump:nt #r9:2
0 0 0 1 0 1 0 1 0 1 i i s s s s P P 0 1 t t t t i i i i i i i - p1=cmp.gtu(Rs,Rt); if (!p1.new) jump:nt #r9:2
0 0 0 1 0 1 0 1 0 1 i i s s s s P P 1 0 t t t t i i i i i i i - p0=cmp.gtu(Rs,Rt); if (!p0.new) jump:t #r9:2
0 0 0 1 0 1 0 1 0 1 i i s s s s P P 1 1 t t t t i i i i i i i - p1=cmp.gtu(Rs,Rt); if (!p1.new) jump:t #r9:2
Jump to address
Change the program flow to a target address. This instruction changes the program counter to a
target address that is relative to the PC address. The offset from the current PC address is
contained in the instruction encoding.
A speculated jump instruction includes a hint ("taken" or "not taken") that specifies the expected
value of the conditional expression. If the actual generated value of the predicate differs from this
expected value, the jump instruction incurs a performance penalty.
This instruction can appear in slots 2 or 3.
Syntax Behavior
if ([!]Pu) jump #r15:2 Assembler mapped to: "if ([!]Pu) jump:nt #r15:2"
if ([!]Pu) jump:<hint> #r15:2 if ([!]Pu[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
jump #r22:2 apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
Notes
■ This instruction can conditionally execute based on the value of a predicate register. If the
instruction is preceded by 'if Pn', the instruction only executes if the least-significant bit of the
predicate register is 1. Similarly, if the instruction is preceded by 'if !Pn', the instruction
executes only if the least-significant bit of Pn is 0.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Parse
0 1 0 1 1 0 0 i i i i i i i i i P P i i i i i i i i i i i i i - jump #r22:2
ICLASS Parse PT DN u2
Syntax Behavior
if ([!]Pu.new) jump:<hint> #r15:2 if ([!]Pu.new[0]) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
Notes
■ This instruction can conditionally execute based on the value of a predicate register. If the
instruction is preceded by 'if Pn', the instruction only executes if the least-significant bit of the
predicate register is 1. Similarly, if the instruction is preceded by 'if !Pn', the instruction
executes only if the least-significant bit of Pn is 0.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Parse PT DN u2
Syntax Behavior
if (Rs!=#0) jump:nt #r13:2 if (Rs != 0) {
PC=PC+#r;
}
if (Rs!=#0) jump:t #r13:2 if (Rs != 0) {
PC=PC+#r;
}
if (Rs<=#0) jump:nt #r13:2 if (Rs<=0) {
PC=PC+#r;
}
if (Rs<=#0) jump:t #r13:2 if (Rs<=0) {
PC=PC+#r;
}
if (Rs==#0) jump:nt #r13:2 if (Rs == 0) {
PC=PC+#r;
}
if (Rs==#0) jump:t #r13:2 if (Rs == 0) {
PC=PC+#r;
}
if (Rs>=#0) jump:nt #r13:2 if (Rs>=0) {
PC=PC+#r;
}
if (Rs>=#0) jump:t #r13:2 if (Rs>=0) {
PC=PC+#r;
}
Class: J (slot 3)
Notes
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm s5 Parse
0 1 1 0 0 0 0 1 0 0 i s s s s s P P i 0 i i i i i i i i i i i - if (Rs!=#0) jump:nt #r13:2
0 1 1 0 0 0 0 1 0 0 i s s s s s P P i 1 i i i i i i i i i i i - if (Rs!=#0) jump:t #r13:2
0 1 1 0 0 0 0 1 0 1 i s s s s s P P i 0 i i i i i i i i i i i - if (Rs>=#0) jump:nt #r13:2
0 1 1 0 0 0 0 1 0 1 i s s s s s P P i 1 i i i i i i i i i i i - if (Rs>=#0) jump:t #r13:2
0 1 1 0 0 0 0 1 1 0 i s s s s s P P i 0 i i i i i i i i i i i - if (Rs==#0) jump:nt #r13:2
0 1 1 0 0 0 0 1 1 0 i s s s s s P P i 1 i i i i i i i i i i i - if (Rs==#0) jump:t #r13:2
0 1 1 0 0 0 0 1 1 1 i s s s s s P P i 0 i i i i i i i i i i i - if (Rs<=#0) jump:nt #r13:2
0 1 1 0 0 0 0 1 1 1 i s s s s s P P i 1 i i i i i i i i i i i - if (Rs<=#0) jump:t #r13:2
Syntax Behavior
Rd=#U6 ; jump #r9:2 apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
Rd=#U;
PC=PC+#r;
Rd=Rs ; jump #r9:2 apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
Rd=Rs;
PC=PC+#r;
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS d4 Parse
0 0 0 1 0 1 1 0 - - i i d d d d P P I I I I I I i i i i i i i - Rd=#U6 ; jump #r9:2
ICLASS s4 Parse d4
0 0 0 1 0 1 1 1 - - i i s s s s P P - - d d d d i i i i i i i - Rd=Rs ; jump #r9:2
11.5 LD
The LD instruction class includes load instructions, which are used to load values into registers.
LD instructions are executable on slot 0 and slot 1.
Load doubleword
Load a 64-bit doubleword from memory and place in a destination register pair.
Syntax Behavior
Rdd=memd(Re=#U6) apply_extension(#U);
EA=#U;
Rdd = *EA;
Re=#U;
Rdd=memd(Rs+#s11:3) apply_extension(#s);
EA=Rs+#s;
Rdd = *EA;
Rdd=memd(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
Rdd = *EA;
Rdd=memd(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
Rdd = *EA;
Rdd=memd(Rx++#s4:3) EA=Rx;
Rx=Rx+#s;
Rdd = *EA;
Rdd=memd(Rx++#s4:3:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
Rdd = *EA;
Rdd=memd(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<3,MuV);
Rdd = *EA;
Rdd=memd(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
Rdd = *EA;
Rdd=memd(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
Rdd = *EA;
Rdd=memd(gp+#u16:3) apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
Rdd = *EA;
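The bit-reversed addressing form in the table above, `EA=Rx.h[1] | brev(Rx.h[0])`, can be read as keeping the upper halfword of Rx in place and reversing the bit order of the lower halfword. The Python sketch below follows that reading. It is an interpretation for illustration, and the helper names `brev16`, `brev_ea`, and `post_increment` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit address registers


def brev16(x):
    """Reverse the bit order of a 16-bit value."""
    result = 0
    for _ in range(16):
        result = (result << 1) | (x & 1)
        x >>= 1
    return result


def brev_ea(rx):
    """Effective address for Rx++Mu:brev.

    The upper halfword (Rx.h[1]) stays in bits 31:16; the lower
    halfword (Rx.h[0]) is bit-reversed into bits 15:0.
    """
    high = rx & 0xFFFF0000
    low = rx & 0x0000FFFF
    return high | brev16(low)


def post_increment(rx, mu):
    """Rx=Rx+MuV: the pointer itself advances linearly."""
    return (rx + mu) & MASK32
```

Only the effective address is bit-reversed. The pointer register is still updated by a plain addition, as the behavior column shows.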
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 1 0 1 0 1 1 0 s s s s s P P i t t t t t i - - d d d d d Rdd=memd(Rs+Rt<<#u2)
ICLASS Type UN Parse d5
0 1 0 0 1 i i 1 1 1 0 i i i i i P P i i i i i i i i i d d d d d Rdd=memd(gp+#u16:3)
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 i i 1 1 1 0 s s s s s P P i i i i i i i i i d d d d d Rdd=memd(Rs+#s11:3)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 0 0 1 1 1 0 x x x x x P P u 0 - - 0 i i i i d d d d d Rdd=memd(Rx++#s4:3:circ(Mu))
1 0 0 1 1 0 0 1 1 1 0 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rdd=memd(Rx++I:circ(Mu))
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 1 1 1 0 e e e e e P P 0 1 I I I I - I I d d d d d Rdd=memd(Re=#U6)
ICLASS Amode Type UN x5 Parse d5
1 0 0 1 1 0 1 1 1 1 0 x x x x x P P 0 0 - - - i i i i d d d d d Rdd=memd(Rx++#s4:3)
ICLASS Amode Type UN t5 Parse d5
1 0 0 1 1 1 0 1 1 1 0 t t t t t P P i 1 I I I I i I I d d d d d Rdd=memd(Rt<<#u2+#U6)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 1 0 1 1 1 0 x x x x x P P u 0 - - - - 0 - - d d d d d Rdd=memd(Rx++Mu)
1 0 0 1 1 1 1 1 1 1 0 x x x x x P P u 0 - - - - 0 - - d d d d d Rdd=memd(Rx++Mu:brev)
Load-acquire doubleword
Load a 64-bit doubleword from memory and place in a destination register pair. The load-acquire
memory operation is observed before any following memory operations (in program order) have
been observed at the local point of serialization. A different order may be observed at the global
point of serialization (see Ordering and Synchronization).
Syntax Behavior
Rdd=memd_aq(Rs) EA=Rs;
Rdd = *EA;
Class: LD (slot 0)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 0 1 0 0 0 0 s s s s s P P 0 1 1 - - - 0 0 0 d d d d d Rdd=memd_aq(Rs)
Syntax Behavior
if ([!]Pt[.new]) Rdd=memd(#u6) apply_extension(#u);
EA=#u;
if ([!]Pt[.new][0]) {
Rdd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rdd=memd(Rs+#u6:3) apply_extension(#u);
EA=Rs+#u;
if ([!]Pt[.new][0]) {
Rdd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rdd=memd(Rx++#s4:3) EA=Rx;
if([!]Pt[.new][0]){
Rx=Rx+#s;
Rdd = *EA;
} else {
NOP;
}
if ([!]Pv[.new]) Rdd=memd(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
if ([!]Pv[.new][0]) {
Rdd = *EA;
} else {
NOP;
}
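One detail of the predicated post-increment form above is that the pointer update sits inside the conditional, so a false predicate leaves both Rdd and Rx unchanged. The Python sketch below models that case. It is illustrative only: the function name, the use of a `bytes` object for memory, and the `None` result for the NOP case are assumptions made for the example, and the doubleword is read in little-endian byte order.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit address registers


def pred_load_postinc(memory, pred, rx, s, negate=False):
    """Model `if ([!]Pt) Rdd=memd(Rx++#s4:3)`.

    Returns (value, new_rx). When the predicate test fails the
    instruction acts as a NOP: value is None and Rx is unchanged.
    """
    ea = rx                              # EA = Rx
    bit0 = pred & 1                      # only bit 0 of Pt is tested
    wanted = 0 if negate else 1
    if bit0 != wanted:
        return None, rx                  # NOP
    new_rx = (rx + s) & MASK32           # Rx = Rx + #s
    value = int.from_bytes(memory[ea:ea + 8], "little")  # Rdd = *EA
    return value, new_rx
```

For example, with `mem = bytes(range(16))`, a true predicate loads the first eight bytes and advances the pointer by 8, while a false predicate returns `(None, 0)`.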
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 0 0 0 0 1 1 0 s s s s s P P i t t t t t i v v d d d d d if (Pv) Rdd=memd(Rs+Rt<<#u2)
0 0 1 1 0 0 0 1 1 1 0 s s s s s P P i t t t t t i v v d d d d d if (!Pv) Rdd=memd(Rs+Rt<<#u2)
0 0 1 1 0 0 1 0 1 1 0 s s s s s P P i t t t t t i v v d d d d d if (Pv.new) Rdd=memd(Rs+Rt<<#u2)
0 0 1 1 0 0 1 1 1 1 0 s s s s s P P i t t t t t i v v d d d d d if (!Pv.new) Rdd=memd(Rs+Rt<<#u2)
ICLASS Sense PredNew Type UN s5 Parse t2 d5
0 1 0 0 0 0 0 1 1 1 0 s s s s s P P 0 t t i i i i i i d d d d d if (Pt) Rdd=memd(Rs+#u6:3)
0 1 0 0 0 0 1 1 1 1 0 s s s s s P P 0 t t i i i i i i d d d d d if (Pt.new) Rdd=memd(Rs+#u6:3)
0 1 0 0 0 1 0 1 1 1 0 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt) Rdd=memd(Rs+#u6:3)
0 1 0 0 0 1 1 1 1 1 0 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt.new) Rdd=memd(Rs+#u6:3)
ICLASS Amode Type UN x5 Parse t2 d5
1 0 0 1 1 0 1 1 1 1 0 x x x x x P P 1 0 0 t t i i i i d d d d d if (Pt) Rdd=memd(Rx++#s4:3)
1 0 0 1 1 0 1 1 1 1 0 x x x x x P P 1 0 1 t t i i i i d d d d d if (!Pt) Rdd=memd(Rx++#s4:3)
1 0 0 1 1 0 1 1 1 1 0 x x x x x P P 1 1 0 t t i i i i d d d d d if (Pt.new) Rdd=memd(Rx++#s4:3)
1 0 0 1 1 0 1 1 1 1 0 x x x x x P P 1 1 1 t t i i i i d d d d d if (!Pt.new) Rdd=memd(Rx++#s4:3)
ICLASS Amode Type UN Parse t2 d5
1 0 0 1 1 1 1 1 1 1 0 i i i i i P P 1 0 0 t t i 1 - - d d d d d if (Pt) Rdd=memd(#u6)
1 0 0 1 1 1 1 1 1 1 0 i i i i i P P 1 0 1 t t i 1 - - d d d d d if (!Pt) Rdd=memd(#u6)
1 0 0 1 1 1 1 1 1 1 0 i i i i i P P 1 1 0 t t i 1 - - d d d d d if (Pt.new) Rdd=memd(#u6)
1 0 0 1 1 1 1 1 1 1 0 i i i i i P P 1 1 1 t t i 1 - - d d d d d if (!Pt.new) Rdd=memd(#u6)
Load byte
Load a signed byte from memory. The byte at the effective address in memory is placed in the
least-significant 8 bits of the destination register. The destination register is then sign-extended
from 8 bits to 32.
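The sign extension described above can be sketched in a few lines of Python. This is a minimal model rather than a statement about the hardware implementation. The helper names are hypothetical, and memory is represented by a `bytes` object for the sake of the example.

```python
def load_signed_byte(memory, ea):
    """Model Rd=memb(...): read one byte and sign-extend it.

    Returns the value as a signed Python integer.
    """
    byte = memory[ea]                 # unsigned value 0..255
    if byte & 0x80:                   # bit 7 is the sign bit
        return byte - 0x100
    return byte


def as_register(value):
    """Show the 32-bit register image of a signed value."""
    return value & 0xFFFFFFFF
```

For instance, the byte 0x80 loads as -128, whose 32-bit register image is 0xFFFFFF80.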
Syntax Behavior
Rd=memb(Re=#U6) apply_extension(#U);
EA=#U;
Rd = *EA;
Re=#U;
Rd=memb(Rs+#s11:0) apply_extension(#s);
EA=Rs+#s;
Rd = *EA;
Rd=memb(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
Rd = *EA;
Rd=memb(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
Rd = *EA;
Rd=memb(Rx++#s4:0) EA=Rx;
Rx=Rx+#s;
Rd = *EA;
Rd=memb(Rx++#s4:0:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
Rd = *EA;
Rd=memb(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<0,MuV);
Rd = *EA;
Rd=memb(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
Rd = *EA;
Rd=memb(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
Rd = *EA;
Rd=memb(gp+#u16:0) apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
Rd = *EA;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 1 0 1 0 0 0 0 s s s s s P P i t t t t t i - - d d d d d Rd=memb(Rs+Rt<<#u2)
ICLASS Type UN Parse d5
0 1 0 0 1 i i 1 0 0 0 i i i i i P P i i i i i i i i i d d d d d Rd=memb(gp+#u16:0)
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 i i 1 0 0 0 s s s s s P P i i i i i i i i i d d d d d Rd=memb(Rs+#s11:0)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 0 0 1 0 0 0 x x x x x P P u 0 - - 0 i i i i d d d d d Rd=memb(Rx++#s4:0:circ(Mu))
1 0 0 1 1 0 0 1 0 0 0 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rd=memb(Rx++I:circ(Mu))
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 1 0 0 0 e e e e e P P 0 1 I I I I - I I d d d d d Rd=memb(Re=#U6)
Syntax Behavior
if ([!]Pt[.new]) Rd=memb(#u6) apply_extension(#u);
EA=#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memb(Rs+#u6:0) apply_extension(#u);
EA=Rs+#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memb(Rx++#s4:0) EA=Rx;
if([!]Pt[.new][0]){
Rx=Rx+#s;
Rd = *EA;
} else {
NOP;
}
if ([!]Pv[.new]) Rd=memb(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
if ([!]Pv[.new][0]) {
Rd = *EA;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 0 0 0 0 0 0 0 s s s s s P P i t t t t t i v v d d d d d if (Pv) Rd=memb(Rs+Rt<<#u2)
0 0 1 1 0 0 0 1 0 0 0 s s s s s P P i t t t t t i v v d d d d d if (!Pv) Rd=memb(Rs+Rt<<#u2)
0 0 1 1 0 0 1 0 0 0 0 s s s s s P P i t t t t t i v v d d d d d if (Pv.new) Rd=memb(Rs+Rt<<#u2)
0 0 1 1 0 0 1 1 0 0 0 s s s s s P P i t t t t t i v v d d d d d if (!Pv.new) Rd=memb(Rs+Rt<<#u2)
ICLASS Sense PredNew Type UN s5 Parse t2 d5
0 1 0 0 0 0 0 1 0 0 0 s s s s s P P 0 t t i i i i i i d d d d d if (Pt) Rd=memb(Rs+#u6:0)
0 1 0 0 0 0 1 1 0 0 0 s s s s s P P 0 t t i i i i i i d d d d d if (Pt.new) Rd=memb(Rs+#u6:0)
0 1 0 0 0 1 0 1 0 0 0 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt) Rd=memb(Rs+#u6:0)
0 1 0 0 0 1 1 1 0 0 0 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt.new) Rd=memb(Rs+#u6:0)
ICLASS Amode Type UN x5 Parse t2 d5
1 0 0 1 1 0 1 1 0 0 0 x x x x x P P 1 0 0 t t i i i i d d d d d if (Pt) Rd=memb(Rx++#s4:0)
1 0 0 1 1 0 1 1 0 0 0 x x x x x P P 1 0 1 t t i i i i d d d d d if (!Pt) Rd=memb(Rx++#s4:0)
1 0 0 1 1 0 1 1 0 0 0 x x x x x P P 1 1 0 t t i i i i d d d d d if (Pt.new) Rd=memb(Rx++#s4:0)
1 0 0 1 1 0 1 1 0 0 0 x x x x x P P 1 1 1 t t i i i i d d d d d if (!Pt.new) Rd=memb(Rx++#s4:0)
ICLASS Amode Type UN Parse t2 d5
1 0 0 1 1 1 1 1 0 0 0 i i i i i P P 1 0 0 t t i 1 - - d d d d d if (Pt) Rd=memb(#u6)
1 0 0 1 1 1 1 1 0 0 0 i i i i i P P 1 0 1 t t i 1 - - d d d d d if (!Pt) Rd=memb(#u6)
1 0 0 1 1 1 1 1 0 0 0 i i i i i P P 1 1 0 t t i 1 - - d d d d d if (Pt.new) Rd=memb(#u6)
1 0 0 1 1 1 1 1 0 0 0 i i i i i P P 1 1 1 t t i 1 - - d d d d d if (!Pt.new) Rd=memb(#u6)
Syntax Behavior
Ryy=memb_fifo(Re=#U6) apply_extension(#U);
EA=#U;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>8)|(tmpV<<56);
}
Re=#U;
Ryy=memb_fifo(Rs) Assembler mapped to: "Ryy=memb_fifo(Rs+#0)"
Ryy=memb_fifo(Rs+#s11:0) apply_extension(#s);
EA=Rs+#s;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>8)|(tmpV<<56);
}
Ryy=memb_fifo(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>8)|(tmpV<<56);
}
Ryy=memb_fifo(Rx++#s4:0) EA=Rx;
Rx=Rx+#s;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>8)|(tmpV<<56);
}
Ryy=memb_fifo(Rx++#s4:0:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>8)|(tmpV<<56);
}
Ryy=memb_fifo(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<0,MuV);
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>8)|(tmpV<<56);
}
Ryy=memb_fifo(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>8)|(tmpV<<56);
}
Ryy=memb_fifo(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>8)|(tmpV<<56);
}
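The behavior shared by every addressing form above is `Ryy = (((size8u_t)Ryy)>>8)|(tmpV<<56)`: the 64-bit register pair shifts right by one byte and the loaded byte enters at the most-significant end. The Python sketch below models only that update. The function name and the explicit 64-bit mask are assumptions made for the illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # Ryy is a 64-bit register pair


def memb_fifo(ryy, byte):
    """Model the register update of Ryy=memb_fifo(...).

    The existing contents shift right by 8 bits (unsigned) and
    the new byte is placed in bits 63:56.
    """
    shifted = (ryy & MASK64) >> 8
    return shifted | ((byte & 0xFF) << 56)


def load_stream(byte_values):
    """Apply memb_fifo once per byte, starting from Ryy = 0."""
    ryy = 0
    for b in byte_values:
        ryy = memb_fifo(ryy, b)
    return ryy
```

After eight loads the first byte loaded has moved down to bits 7:0, so `load_stream([1, 2, 3, 4, 5, 6, 7, 8])` yields 0x0807060504030201.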
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse y5
1 0 0 1 0 i i 0 1 0 0 s s s s s P P i i i i i i i i i y y y y y Ryy=memb_fifo(Rs+#s11:0)
ICLASS Amode Type UN x5 Parse u1 y5
1 0 0 1 1 0 0 0 1 0 0 x x x x x P P u 0 - - 0 i i i i y y y y y Ryy=memb_fifo(Rx++#s4:0:circ(Mu))
1 0 0 1 1 0 0 0 1 0 0 x x x x x P P u 0 - - 1 - 0 - - y y y y y Ryy=memb_fifo(Rx++I:circ(Mu))
1 0 0 1 1 0 1 0 1 0 0 x x x x x P P 0 0 - - - i i i i y y y y y Ryy=memb_fifo(Rx++#s4:0)
ICLASS Amode Type UN t5 Parse y5
1 0 0 1 1 1 0 0 1 0 0 t t t t t P P i 1 I I I I i I I y y y y y Ryy=memb_fifo(Rt<<#u2+#U6)
ICLASS Amode Type UN x5 Parse u1 y5
1 0 0 1 1 1 0 0 1 0 0 x x x x x P P u 0 - - - - 0 - - y y y y y Ryy=memb_fifo(Rx++Mu)
1 0 0 1 1 1 1 0 1 0 0 x x x x x P P u 0 - - - - 0 - - y y y y y Ryy=memb_fifo(Rx++Mu:brev)
Syntax Behavior
Ryy=memh_fifo(Re=#U6) apply_extension(#U);
EA=#U;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>16)|(tmpV<<48);
}
Re=#U;
Ryy=memh_fifo(Rs) Assembler mapped to: "Ryy=memh_fifo(Rs+#0)"
Ryy=memh_fifo(Rs+#s11:1) apply_extension(#s);
EA=Rs+#s;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>16)|(tmpV<<48);
}
Ryy=memh_fifo(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>16)|(tmpV<<48);
}
Ryy=memh_fifo(Rx++#s4:1) EA=Rx;
Rx=Rx+#s;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>16)|(tmpV<<48);
}
Ryy=memh_fifo(Rx++#s4:1:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>16)|(tmpV<<48);
}
Ryy=memh_fifo(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<1,MuV);
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>16)|(tmpV<<48);
}
Ryy=memh_fifo(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>16)|(tmpV<<48);
}
Ryy=memh_fifo(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
{
tmpV = *EA;
Ryy = (((size8u_t)Ryy)>>16)|(tmpV<<48);
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse y5
1 0 0 1 0 i i 0 0 1 0 s s s s s P P i i i i i i i i i y y y y y Ryy=memh_fifo(Rs+#s11:1)
ICLASS Amode Type UN x5 Parse u1 y5
1 0 0 1 1 0 0 0 0 1 0 x x x x x P P u 0 - - 0 i i i i y y y y y Ryy=memh_fifo(Rx++#s4:1:circ(Mu))
1 0 0 1 1 0 0 0 0 1 0 x x x x x P P u 0 - - 1 - 0 - - y y y y y Ryy=memh_fifo(Rx++I:circ(Mu))
1 0 0 1 1 0 1 0 0 1 0 x x x x x P P 0 0 - - - i i i i y y y y y Ryy=memh_fifo(Rx++#s4:1)
ICLASS Amode Type UN t5 Parse y5
1 0 0 1 1 1 0 0 0 1 0 t t t t t P P i 1 I I I I i I I y y y y y Ryy=memh_fifo(Rt<<#u2+#U6)
ICLASS Amode Type UN x5 Parse u1 y5
1 0 0 1 1 1 0 0 0 1 0 x x x x x P P u 0 - - - - 0 - - y y y y y Ryy=memh_fifo(Rx++Mu)
1 0 0 1 1 1 1 0 0 1 0 x x x x x P P u 0 - - - - 0 - - y y y y y Ryy=memh_fifo(Rx++Mu:brev)
Load halfword
Load a signed halfword from memory. The 16-bit halfword at the effective address in memory is
placed in the least-significant 16 bits of the destination register. The destination register is then
sign-extended from 16 bits to 32.
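The sign extension described above can be sketched as follows. This is a minimal Python model, not the hardware definition: memory is a hypothetical bytes object, little-endian byte order is assumed, and the helper names are illustrative.

```python
def sign_extend_16(halfword: int) -> int:
    """Interpret the low 16 bits as a signed value."""
    value = halfword & 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def memh(memory: bytes, ea: int) -> int:
    """Model of Rd=memh(...): load 16 bits at EA, sign-extend to 32 bits."""
    # Assumption: little-endian memory, EA is halfword aligned.
    raw = int.from_bytes(memory[ea:ea + 2], "little")
    return sign_extend_16(raw) & 0xFFFFFFFF
```

A halfword with bit 15 set fills the upper 16 bits of the destination with ones; otherwise the upper bits are zero.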
Syntax Behavior
Rd=memh(Re=#U6) apply_extension(#U);
EA=#U;
Rd = *EA;
Re=#U;
Rd=memh(Rs+#s11:1) apply_extension(#s);
EA=Rs+#s;
Rd = *EA;
Rd=memh(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
Rd = *EA;
Rd=memh(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
Rd = *EA;
Rd=memh(Rx++#s4:1) EA=Rx;
Rx=Rx+#s;
Rd = *EA;
Rd=memh(Rx++#s4:1:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
Rd = *EA;
Rd=memh(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<1,MuV);
Rd = *EA;
Rd=memh(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
Rd = *EA;
Rd=memh(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
Rd = *EA;
Rd=memh(gp+#u16:1) apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
Rd = *EA;
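The bit-reversed form in the table above computes EA=Rx.h[1] | brev(Rx.h[0]). A small Python sketch of one reading of that expression follows: the upper halfword of Rx is kept in place and the lower halfword is bit-reversed. The function names are hypothetical, and the interpretation of Rx.h[1] as remaining in bits 31:16 is an assumption.

```python
def brev16(value: int) -> int:
    """Reverse the order of the low 16 bits of value."""
    result = 0
    for bit in range(16):
        if value & (1 << bit):
            result |= 1 << (15 - bit)
    return result


def brev_effective_address(rx: int) -> int:
    """Model of EA = Rx.h[1] | brev(Rx.h[0])."""
    # Assumption: Rx.h[1] stays in bits 31:16 of the effective address.
    return (rx & 0xFFFF0000) | brev16(rx & 0xFFFF)
```

Because only the low halfword is reversed, stepping Rx by small increments visits addresses in bit-reversed order within a 64 KB window.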
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 1 0 1 0 0 1 0 s s s s s P P i t t t t t i - - d d d d d Rd=memh(Rs+Rt<<#u2)
ICLASS Type UN Parse d5
0 1 0 0 1 i i 1 0 1 0 i i i i i P P i i i i i i i i i d d d d d Rd=memh(gp+#u16:1)
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 i i 1 0 1 0 s s s s s P P i i i i i i i i i d d d d d Rd=memh(Rs+#s11:1)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 0 0 1 0 1 0 x x x x x P P u 0 - - 0 i i i i d d d d d Rd=memh(Rx++#s4:1:circ(Mu))
1 0 0 1 1 0 0 1 0 1 0 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rd=memh(Rx++I:circ(Mu))
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 1 0 1 0 e e e e e P P 0 1 I I I I - I I d d d d d Rd=memh(Re=#U6)
Syntax Behavior
if ([!]Pt[.new]) Rd=memh(#u6) apply_extension(#u);
EA=#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memh(Rs+#u6:1) apply_extension(#u);
EA=Rs+#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memh(Rx++#s4:1) EA=Rx;
if ([!]Pt[.new][0]) {
Rx=Rx+#s;
Rd = *EA;
} else {
NOP;
}
if ([!]Pv[.new]) Rd=memh(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
if ([!]Pv[.new][0]) {
Rd = *EA;
} else {
NOP;
}
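The conditional forms above test bit 0 of the predicate register and otherwise behave as a NOP. The following Python sketch models the post-increment variant, where both the load and the update of Rx are skipped when the condition fails. The function names and the tuple return value are illustrative assumptions; little-endian memory is assumed.

```python
def predicate_true(pt: int, negate: bool = False) -> bool:
    """Evaluate [!]Pt[0]: only bit 0 of the predicate is tested."""
    bit0 = pt & 1
    return bit0 == 0 if negate else bit0 == 1


def cond_memh_postinc(memory: bytes, rx: int, rd: int, pt: int, s: int,
                      negate: bool = False):
    """Model of: if ([!]Pt) Rd=memh(Rx++#s). Returns (new_rd, new_rx)."""
    ea = rx
    if not predicate_true(pt, negate):
        return rd, rx  # NOP: neither Rd nor Rx changes
    raw = int.from_bytes(memory[ea:ea + 2], "little")
    value = raw - 0x10000 if raw & 0x8000 else raw  # sign-extend halfword
    return value & 0xFFFFFFFF, (rx + s) & 0xFFFFFFFF
```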
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 0 0 0 0 0 1 0 s s s s s P P i t t t t t i v v d d d d d if (Pv) Rd=memh(Rs+Rt<<#u2)
0 0 1 1 0 0 0 1 0 1 0 s s s s s P P i t t t t t i v v d d d d d if (!Pv) Rd=memh(Rs+Rt<<#u2)
0 0 1 1 0 0 1 0 0 1 0 s s s s s P P i t t t t t i v v d d d d d if (Pv.new) Rd=memh(Rs+Rt<<#u2)
0 0 1 1 0 0 1 1 0 1 0 s s s s s P P i t t t t t i v v d d d d d if (!Pv.new) Rd=memh(Rs+Rt<<#u2)
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Sense PredNew Type UN s5 Parse t2 d5
0 1 0 0 0 0 0 1 0 1 0 s s s s s P P 0 t t i i i i i i d d d d d if (Pt) Rd=memh(Rs+#u6:1)
0 1 0 0 0 0 1 1 0 1 0 s s s s s P P 0 t t i i i i i i d d d d d if (Pt.new) Rd=memh(Rs+#u6:1)
0 1 0 0 0 1 0 1 0 1 0 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt) Rd=memh(Rs+#u6:1)
0 1 0 0 0 1 1 1 0 1 0 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt.new) Rd=memh(Rs+#u6:1)
ICLASS Amode Type UN x5 Parse t2 d5
1 0 0 1 1 0 1 1 0 1 0 x x x x x P P 1 0 0 t t i i i i d d d d d if (Pt) Rd=memh(Rx++#s4:1)
1 0 0 1 1 0 1 1 0 1 0 x x x x x P P 1 0 1 t t i i i i d d d d d if (!Pt) Rd=memh(Rx++#s4:1)
1 0 0 1 1 0 1 1 0 1 0 x x x x x P P 1 1 0 t t i i i i d d d d d if (Pt.new) Rd=memh(Rx++#s4:1)
1 0 0 1 1 0 1 1 0 1 0 x x x x x P P 1 1 1 t t i i i i d d d d d if (!Pt.new) Rd=memh(Rx++#s4:1)
ICLASS Amode Type UN Parse t2 d5
1 0 0 1 1 1 1 1 0 1 0 i i i i i P P 1 0 0 t t i 1 - - d d d d d if (Pt) Rd=memh(#u6)
1 0 0 1 1 1 1 1 0 1 0 i i i i i P P 1 0 1 t t i 1 - - d d d d d if (!Pt) Rd=memh(#u6)
1 0 0 1 1 1 1 1 0 1 0 i i i i i P P 1 1 0 t t i 1 - - d d d d d if (Pt.new) Rd=memh(#u6)
1 0 0 1 1 1 1 1 0 1 0 i i i i i P P 1 1 1 t t i 1 - - d d d d d if (!Pt.new) Rd=memh(#u6)
Memory copy
Copy Mu + 1 (length) bytes from the address in Rt (source base) to the address in Rs (destination
base). The source base, destination base, and length values must be aligned to the L2 cache-line
size. Behavior is undefined for non-aligned values and for source or destination buffers partially in
illegal space. The accesses by the memcpy instruction are noncoherent with the cache-hierarchy
of the Q6.
In addition to normal translation exceptions, a coprocessor memory exception occurs if any of the
following are true:
■ Source or destination base address in illegal space
■ Source or destination buffer crosses a page boundary
■ Source base address is NOT in AXI space
■ Destination base address is NOT in VTCM
This instruction is only available on cores with VTCM.
Syntax Behavior
Rdd=pmemcpy(Rx,Rtt)
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN t5 Parse x5 d5
1 0 0 1 1 0 0 1 1 1 1 t t t t t P P 0 x x x x x 0 0 0 d d d d d Rdd=pmemcpy(Rx,Rtt)
Class: CR (slot 3)
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm t5 Parse s5 d5
0 1 1 0 1 1 1 1 1 1 1 t t t t t P P 0 s s s s s 0 1 0 d d d d d Rd=movlen(Rs,Rtt)
Syntax Behavior
Rd=memub(Re=#U6) apply_extension(#U);
EA=#U;
Rd = *EA;
Re=#U;
Rd=memub(Rs+#s11:0) apply_extension(#s);
EA=Rs+#s;
Rd = *EA;
Rd=memub(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
Rd = *EA;
Rd=memub(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
Rd = *EA;
Rd=memub(Rx++#s4:0) EA=Rx;
Rx=Rx+#s;
Rd = *EA;
Rd=memub(Rx++#s4:0:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
Rd = *EA;
Rd=memub(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<0,MuV);
Rd = *EA;
Rd=memub(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
Rd = *EA;
Rd=memub(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
Rd = *EA;
Rd=memub(gp+#u16:0) apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
Rd = *EA;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 1 0 1 0 0 0 1 s s s s s P P i t t t t t i - - d d d d d Rd=memub(Rs+Rt<<#u2)
ICLASS Type UN Parse d5
0 1 0 0 1 i i 1 0 0 1 i i i i i P P i i i i i i i i i d d d d d Rd=memub(gp+#u16:0)
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 i i 1 0 0 1 s s s s s P P i i i i i i i i i d d d d d Rd=memub(Rs+#s11:0)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 0 0 1 0 0 1 x x x x x P P u 0 - - 0 i i i i d d d d d Rd=memub(Rx++#s4:0:circ(Mu))
1 0 0 1 1 0 0 1 0 0 1 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rd=memub(Rx++I:circ(Mu))
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 1 0 0 1 e e e e e P P 0 1 I I I I - I I d d d d d Rd=memub(Re=#U6)
ICLASS Amode Type UN x5 Parse d5
1 0 0 1 1 0 1 1 0 0 1 x x x x x P P 0 0 - - - i i i i d d d d d Rd=memub(Rx++#s4:0)
ICLASS Amode Type UN t5 Parse d5
1 0 0 1 1 1 0 1 0 0 1 t t t t t P P i 1 I I I I i I I d d d d d Rd=memub(Rt<<#u2+#U6)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 1 0 1 0 0 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rd=memub(Rx++Mu)
1 0 0 1 1 1 1 1 0 0 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rd=memub(Rx++Mu:brev)
Syntax Behavior
if ([!]Pt[.new]) Rd=memub(#u6) apply_extension(#u);
EA=#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memub(Rs+#u6:0) apply_extension(#u);
EA=Rs+#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memub(Rx++#s4:0) EA=Rx;
if ([!]Pt[.new][0]) {
Rx=Rx+#s;
Rd = *EA;
} else {
NOP;
}
if ([!]Pv[.new]) Rd=memub(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
if ([!]Pv[.new][0]) {
Rd = *EA;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 0 0 0 0 0 0 1 s s s s s P P i t t t t t i v v d d d d d if (Pv) Rd=memub(Rs+Rt<<#u2)
0 0 1 1 0 0 0 1 0 0 1 s s s s s P P i t t t t t i v v d d d d d if (!Pv) Rd=memub(Rs+Rt<<#u2)
0 0 1 1 0 0 1 0 0 0 1 s s s s s P P i t t t t t i v v d d d d d if (Pv.new) Rd=memub(Rs+Rt<<#u2)
0 0 1 1 0 0 1 1 0 0 1 s s s s s P P i t t t t t i v v d d d d d if (!Pv.new) Rd=memub(Rs+Rt<<#u2)
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Sense PredNew Type UN s5 Parse t2 d5
0 1 0 0 0 0 0 1 0 0 1 s s s s s P P 0 t t i i i i i i d d d d d if (Pt) Rd=memub(Rs+#u6:0)
0 1 0 0 0 0 1 1 0 0 1 s s s s s P P 0 t t i i i i i i d d d d d if (Pt.new) Rd=memub(Rs+#u6:0)
0 1 0 0 0 1 0 1 0 0 1 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt) Rd=memub(Rs+#u6:0)
0 1 0 0 0 1 1 1 0 0 1 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt.new) Rd=memub(Rs+#u6:0)
ICLASS Amode Type UN x5 Parse t2 d5
1 0 0 1 1 0 1 1 0 0 1 x x x x x P P 1 0 0 t t i i i i d d d d d if (Pt) Rd=memub(Rx++#s4:0)
1 0 0 1 1 0 1 1 0 0 1 x x x x x P P 1 0 1 t t i i i i d d d d d if (!Pt) Rd=memub(Rx++#s4:0)
1 0 0 1 1 0 1 1 0 0 1 x x x x x P P 1 1 0 t t i i i i d d d d d if (Pt.new) Rd=memub(Rx++#s4:0)
1 0 0 1 1 0 1 1 0 0 1 x x x x x P P 1 1 1 t t i i i i d d d d d if (!Pt.new) Rd=memub(Rx++#s4:0)
ICLASS Amode Type UN Parse t2 d5
1 0 0 1 1 1 1 1 0 0 1 i i i i i P P 1 0 0 t t i 1 - - d d d d d if (Pt) Rd=memub(#u6)
1 0 0 1 1 1 1 1 0 0 1 i i i i i P P 1 0 1 t t i 1 - - d d d d d if (!Pt) Rd=memub(#u6)
1 0 0 1 1 1 1 1 0 0 1 i i i i i P P 1 1 0 t t i 1 - - d d d d d if (Pt.new) Rd=memub(#u6)
1 0 0 1 1 1 1 1 0 0 1 i i i i i P P 1 1 1 t t i 1 - - d d d d d if (!Pt.new) Rd=memub(#u6)
Syntax Behavior
Rd=memuh(Re=#U6) apply_extension(#U);
EA=#U;
Rd = *EA;
Re=#U;
Rd=memuh(Rs+#s11:1) apply_extension(#s);
EA=Rs+#s;
Rd = *EA;
Rd=memuh(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
Rd = *EA;
Rd=memuh(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
Rd = *EA;
Rd=memuh(Rx++#s4:1) EA=Rx;
Rx=Rx+#s;
Rd = *EA;
Rd=memuh(Rx++#s4:1:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
Rd = *EA;
Rd=memuh(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<1,MuV);
Rd = *EA;
Rd=memuh(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
Rd = *EA;
Rd=memuh(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
Rd = *EA;
Rd=memuh(gp+#u16:1) apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
Rd = *EA;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 1 0 1 0 0 1 1 s s s s s P P i t t t t t i - - d d d d d Rd=memuh(Rs+Rt<<#u2)
ICLASS Type UN Parse d5
0 1 0 0 1 i i 1 0 1 1 i i i i i P P i i i i i i i i i d d d d d Rd=memuh(gp+#u16:1)
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 i i 1 0 1 1 s s s s s P P i i i i i i i i i d d d d d Rd=memuh(Rs+#s11:1)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 0 0 1 0 1 1 x x x x x P P u 0 - - 0 i i i i d d d d d Rd=memuh(Rx++#s4:1:circ(Mu))
1 0 0 1 1 0 0 1 0 1 1 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rd=memuh(Rx++I:circ(Mu))
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 1 0 1 1 e e e e e P P 0 1 I I I I - I I d d d d d Rd=memuh(Re=#U6)
ICLASS Amode Type UN x5 Parse d5
1 0 0 1 1 0 1 1 0 1 1 x x x x x P P 0 0 - - - i i i i d d d d d Rd=memuh(Rx++#s4:1)
ICLASS Amode Type UN t5 Parse d5
1 0 0 1 1 1 0 1 0 1 1 t t t t t P P i 1 I I I I i I I d d d d d Rd=memuh(Rt<<#u2+#U6)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 1 0 1 0 1 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rd=memuh(Rx++Mu)
1 0 0 1 1 1 1 1 0 1 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rd=memuh(Rx++Mu:brev)
Syntax Behavior
if ([!]Pt[.new]) Rd=memuh(#u6) apply_extension(#u);
EA=#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memuh(Rs+#u6:1) apply_extension(#u);
EA=Rs+#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memuh(Rx++#s4:1) EA=Rx;
if ([!]Pt[.new][0]) {
Rx=Rx+#s;
Rd = *EA;
} else {
NOP;
}
if ([!]Pv[.new]) Rd=memuh(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
if ([!]Pv[.new][0]) {
Rd = *EA;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 0 0 0 0 0 1 1 s s s s s P P i t t t t t i v v d d d d d if (Pv) Rd=memuh(Rs+Rt<<#u2)
0 0 1 1 0 0 0 1 0 1 1 s s s s s P P i t t t t t i v v d d d d d if (!Pv) Rd=memuh(Rs+Rt<<#u2)
0 0 1 1 0 0 1 0 0 1 1 s s s s s P P i t t t t t i v v d d d d d if (Pv.new) Rd=memuh(Rs+Rt<<#u2)
0 0 1 1 0 0 1 1 0 1 1 s s s s s P P i t t t t t i v v d d d d d if (!Pv.new) Rd=memuh(Rs+Rt<<#u2)
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Sense PredNew Type UN s5 Parse t2 d5
0 1 0 0 0 0 0 1 0 1 1 s s s s s P P 0 t t i i i i i i d d d d d if (Pt) Rd=memuh(Rs+#u6:1)
0 1 0 0 0 0 1 1 0 1 1 s s s s s P P 0 t t i i i i i i d d d d d if (Pt.new) Rd=memuh(Rs+#u6:1)
0 1 0 0 0 1 0 1 0 1 1 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt) Rd=memuh(Rs+#u6:1)
0 1 0 0 0 1 1 1 0 1 1 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt.new) Rd=memuh(Rs+#u6:1)
ICLASS Amode Type UN x5 Parse t2 d5
1 0 0 1 1 0 1 1 0 1 1 x x x x x P P 1 0 0 t t i i i i d d d d d if (Pt) Rd=memuh(Rx++#s4:1)
1 0 0 1 1 0 1 1 0 1 1 x x x x x P P 1 0 1 t t i i i i d d d d d if (!Pt) Rd=memuh(Rx++#s4:1)
1 0 0 1 1 0 1 1 0 1 1 x x x x x P P 1 1 0 t t i i i i d d d d d if (Pt.new) Rd=memuh(Rx++#s4:1)
1 0 0 1 1 0 1 1 0 1 1 x x x x x P P 1 1 1 t t i i i i d d d d d if (!Pt.new) Rd=memuh(Rx++#s4:1)
ICLASS Amode Type UN Parse t2 d5
1 0 0 1 1 1 1 1 0 1 1 i i i i i P P 1 0 0 t t i 1 - - d d d d d if (Pt) Rd=memuh(#u6)
1 0 0 1 1 1 1 1 0 1 1 i i i i i P P 1 0 1 t t i 1 - - d d d d d if (!Pt) Rd=memuh(#u6)
1 0 0 1 1 1 1 1 0 1 1 i i i i i P P 1 1 0 t t i 1 - - d d d d d if (Pt.new) Rd=memuh(#u6)
1 0 0 1 1 1 1 1 0 1 1 i i i i i P P 1 1 1 t t i 1 - - d d d d d if (!Pt.new) Rd=memuh(#u6)
Load word
Load a 32-bit word from memory and place in a destination register.
Syntax Behavior
Rd=memw(Re=#U6) apply_extension(#U);
EA=#U;
Rd = *EA;
Re=#U;
Rd=memw(Rs+#s11:2) apply_extension(#s);
EA=Rs+#s;
Rd = *EA;
Rd=memw(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
Rd = *EA;
Rd=memw(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
Rd = *EA;
Rd=memw(Rx++#s4:2) EA=Rx;
Rx=Rx+#s;
Rd = *EA;
Rd=memw(Rx++#s4:2:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
Rd = *EA;
Rd=memw(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<2,MuV);
Rd = *EA;
Rd=memw(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
Rd = *EA;
Rd=memw(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
Rd = *EA;
Rd=memw(gp+#u16:2) apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
Rd = *EA;
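The GP-relative form in the table above uses EA=(Constant_extended ? (0) : GP)+#u. The following Python sketch shows that selection, plus a range check for an unextended #u16:2 offset under the usual reading of the notation (a 16-bit unsigned field scaled by 4). Both function names are hypothetical, and the range check is an assumption rather than a statement from this section.

```python
def gp_relative_ea(gp: int, offset: int, constant_extended: bool = False) -> int:
    """Model of EA = (Constant_extended ? 0 : GP) + #u."""
    base = 0 if constant_extended else gp
    return (base + offset) & 0xFFFFFFFF


def fits_u16_2(offset: int) -> bool:
    """Assumed encodability test for #u16:2: multiple of 4, 16-bit field."""
    return offset % 4 == 0 and 0 <= offset <= (0xFFFF << 2)
```

When the immediate is constant-extended, the GP base is ignored and the immediate acts as an absolute address.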
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 1 0 1 0 1 0 0 s s s s s P P i t t t t t i - - d d d d d Rd=memw(Rs+Rt<<#u2)
ICLASS Type UN Parse d5
0 1 0 0 1 i i 1 1 0 0 i i i i i P P i i i i i i i i i d d d d d Rd=memw(gp+#u16:2)
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 i i 1 1 0 0 s s s s s P P i i i i i i i i i d d d d d Rd=memw(Rs+#s11:2)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 0 0 1 1 0 0 x x x x x P P u 0 - - 0 i i i i d d d d d Rd=memw(Rx++#s4:2:circ(Mu))
1 0 0 1 1 0 0 1 1 0 0 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rd=memw(Rx++I:circ(Mu))
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 1 1 0 0 e e e e e P P 0 1 I I I I - I I d d d d d Rd=memw(Re=#U6)
Load-acquire word
Load a 32-bit word from memory and place in a destination register. The load-acquire memory
operation is observed before any following memory operations (in program order) are observed
at the local point of serialization. A different order can be observed at the global point of
serialization (see Ordering and Synchronization).
Syntax Behavior
Rd=memw_aq(Rs) EA=Rs;
Rd = *EA;
Class: LD (slot 0)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 0 1 0 0 0 0 s s s s s P P 0 0 1 - - - 0 0 0 d d d d d Rd=memw_aq(Rs)
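Each encoding row lists the 32 instruction bits from bit 31 down to bit 0: 0 and 1 are fixed bits, - is a don't-care bit, and letters mark operand fields. A short Python sketch of matching an instruction word against such a row follows. The function name is hypothetical, and the example word is constructed by hand for illustration; it is not taken from the manual.

```python
def match_encoding(pattern: str, word: int):
    """Match a 32-bit word against an encoding row.

    Returns a dict of field values keyed by field letter, or None if a
    fixed bit does not match. Field bits are gathered MSB first.
    """
    symbols = pattern.split()
    if len(symbols) != 32:
        raise ValueError("pattern must list exactly 32 bit positions")
    fields = {}
    for index, symbol in enumerate(symbols):
        bit = (word >> (31 - index)) & 1
        if symbol in ("0", "1"):
            if bit != int(symbol):
                return None
        elif symbol != "-":
            fields[symbol] = (fields.get(symbol, 0) << 1) | bit
    return fields


# Row for Rd=memw_aq(Rs), copied from the encoding table above.
MEMW_AQ = "1 0 0 1 0 0 1 0 0 0 0 s s s s s P P 0 0 1 - - - 0 0 0 d d d d d"
```

For example, `match_encoding(MEMW_AQ, 0x9205C803)` yields s=5, d=3, and parse bits P=3 for this hand-built word.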
Syntax Behavior
if ([!]Pt[.new]) Rd=memw(#u6) apply_extension(#u);
EA=#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memw(Rs+#u6:2) apply_extension(#u);
EA=Rs+#u;
if ([!]Pt[.new][0]) {
Rd = *EA;
} else {
NOP;
}
if ([!]Pt[.new]) Rd=memw(Rx++#s4:2) EA=Rx;
if([!]Pt[.new][0]){
Rx=Rx+#s;
Rd = *EA;
} else {
NOP;
}
if ([!]Pv[.new]) Rd=memw(Rs+Rt<<#u2) EA=Rs+(Rt<<#u);
if ([!]Pv[.new][0]) {
Rd = *EA;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5 d5
0 0 1 1 0 0 0 0 1 0 0 s s s s s P P i t t t t t i v v d d d d d if (Pv) Rd=memw(Rs+Rt<<#u2)
0 0 1 1 0 0 0 1 1 0 0 s s s s s P P i t t t t t i v v d d d d d if (!Pv) Rd=memw(Rs+Rt<<#u2)
0 0 1 1 0 0 1 0 1 0 0 s s s s s P P i t t t t t i v v d d d d d if (Pv.new) Rd=memw(Rs+Rt<<#u2)
0 0 1 1 0 0 1 1 1 0 0 s s s s s P P i t t t t t i v v d d d d d if (!Pv.new) Rd=memw(Rs+Rt<<#u2)
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Sense PredNew Type UN s5 Parse t2 d5
0 1 0 0 0 0 0 1 1 0 0 s s s s s P P 0 t t i i i i i i d d d d d if (Pt) Rd=memw(Rs+#u6:2)
0 1 0 0 0 0 1 1 1 0 0 s s s s s P P 0 t t i i i i i i d d d d d if (Pt.new) Rd=memw(Rs+#u6:2)
0 1 0 0 0 1 0 1 1 0 0 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt) Rd=memw(Rs+#u6:2)
0 1 0 0 0 1 1 1 1 0 0 s s s s s P P 0 t t i i i i i i d d d d d if (!Pt.new) Rd=memw(Rs+#u6:2)
ICLASS Amode Type UN x5 Parse t2 d5
1 0 0 1 1 0 1 1 1 0 0 x x x x x P P 1 0 0 t t i i i i d d d d d if (Pt) Rd=memw(Rx++#s4:2)
1 0 0 1 1 0 1 1 1 0 0 x x x x x P P 1 0 1 t t i i i i d d d d d if (!Pt) Rd=memw(Rx++#s4:2)
1 0 0 1 1 0 1 1 1 0 0 x x x x x P P 1 1 0 t t i i i i d d d d d if (Pt.new) Rd=memw(Rx++#s4:2)
1 0 0 1 1 0 1 1 1 0 0 x x x x x P P 1 1 1 t t i i i i d d d d d if (!Pt.new) Rd=memw(Rx++#s4:2)
ICLASS Amode Type UN Parse t2 d5
1 0 0 1 1 1 1 1 1 0 0 i i i i i P P 1 0 0 t t i 1 - - d d d d d if (Pt) Rd=memw(#u6)
1 0 0 1 1 1 1 1 1 0 0 i i i i i P P 1 0 1 t t i 1 - - d d d d d if (!Pt) Rd=memw(#u6)
1 0 0 1 1 1 1 1 1 0 0 i i i i i P P 1 1 0 t t i 1 - - d d d d d if (Pt.new) Rd=memw(#u6)
1 0 0 1 1 1 1 1 1 0 0 i i i i i P P 1 1 1 t t i 1 - - d d d d d if (!Pt.new) Rd=memw(#u6)
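[Figure: stack frame layout. From higher address to lower address: saved LR, saved FP, and procedure local data on stack for the previous frame; then the current stack frame with saved LR, saved FP (FP register), and procedure local data on stack (SP register); then unallocated stack.]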
Syntax Behavior
Rdd=deallocframe(Rs):raw EA=Rs;
tmp = *EA;
Rdd = frame_unscramble(tmp);
SP=EA+8;
deallocframe Assembler mapped to: "r31:30=deallocframe(r30):raw"
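The deallocframe behavior above can be sketched in Python as follows. This is a hedged model: the stack is a hypothetical little-endian bytes object, the function name and tuple return value are illustrative, and frame_unscramble is passed in as a parameter (identity by default) because its definition is not given in this section.

```python
def deallocframe(memory: bytes, rs: int, unscramble=lambda value: value):
    """Model of Rdd=deallocframe(Rs):raw. Returns (fp, lr, sp).

    EA=Rs; tmp=*EA (64 bits); Rdd=frame_unscramble(tmp); SP=EA+8.
    For r31:30, the low word is FP (r30) and the high word is LR (r31).
    """
    ea = rs
    tmp = int.from_bytes(memory[ea:ea + 8], "little")
    rdd = unscramble(tmp)  # assumption: identity unless a real definition is supplied
    fp = rdd & 0xFFFFFFFF
    lr = (rdd >> 32) & 0xFFFFFFFF
    sp = (ea + 8) & 0xFFFFFFFF
    return fp, lr, sp
```

The model matches the figure's layout: the saved FP sits at the address held in the FP register, with the saved LR in the word above it.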
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 0 0 0 0 0 0 s s s s s P P 0 - - - - - - - - d d d d d Rdd=deallocframe(Rs):raw
Syntax Behavior
Rdd=dealloc_return(Rs):raw EA=Rs;
tmp = *EA;
Rdd = frame_unscramble(tmp);
SP=EA+8;
PC=Rdd.w[1];
dealloc_return Assembler mapped to: "r31:30=dealloc_return(r30):raw"
if ([!]Pv) Rdd=dealloc_return(Rs):raw EA=Rs;
if ([!]Pv[0]) {
tmp = *EA;
Rdd = frame_unscramble(tmp);
SP=EA+8;
PC=Rdd.w[1];
} else {
NOP;
}
if ([!]Pv) dealloc_return Assembler mapped to: "if ([!]Pv) r31:30=dealloc_return(r30):raw"
if ([!]Pv.new) Rdd=dealloc_return(Rs):nt:raw EA=Rs;
if ([!]Pv.new[0]) {
tmp = *EA;
Rdd = frame_unscramble(tmp);
SP=EA+8;
PC=Rdd.w[1];
} else {
NOP;
}
if ([!]Pv.new) Rdd=dealloc_return(Rs):t:raw EA=Rs;
if ([!]Pv.new[0]) {
tmp = *EA;
Rdd = frame_unscramble(tmp);
SP=EA+8;
PC=Rdd.w[1];
} else {
NOP;
}
if ([!]Pv.new) dealloc_return:nt Assembler mapped to: "if ([!]Pv.new) r31:30=dealloc_return(r30):nt:raw"
if ([!]Pv.new) dealloc_return:t Assembler mapped to: "if ([!]Pv.new) r31:30=dealloc_return(r30):t:raw"
Class: LD (slot 0)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 1 1 0 0 0 0 s s s s s P P 0 0 0 0 - - - - - d d d d d Rdd=dealloc_return(Rs):raw
1 0 0 1 0 1 1 0 0 0 0 s s s s s P P 0 0 1 0 v v - - - d d d d d if (Pv.new) Rdd=dealloc_return(Rs):nt:raw
1 0 0 1 0 1 1 0 0 0 0 s s s s s P P 0 1 0 0 v v - - - d d d d d if (Pv) Rdd=dealloc_return(Rs):raw
1 0 0 1 0 1 1 0 0 0 0 s s s s s P P 0 1 1 0 v v - - - d d d d d if (Pv.new) Rdd=dealloc_return(Rs):t:raw
1 0 0 1 0 1 1 0 0 0 0 s s s s s P P 1 0 1 0 v v - - - d d d d d if (!Pv.new) Rdd=dealloc_return(Rs):nt:raw
1 0 0 1 0 1 1 0 0 0 0 s s s s s P P 1 1 0 0 v v - - - d d d d d if (!Pv) Rdd=dealloc_return(Rs):raw
1 0 0 1 0 1 1 0 0 0 0 s s s s s P P 1 1 1 0 v v - - - d d d d d if (!Pv.new) Rdd=dealloc_return(Rs):t:raw
[Figure: Rd=memubh(amode), bytes read from memory (mem) are unpacked into Rd.]
Syntax Behavior
Rd=membh(Re=#U6) apply_extension(#U);
EA=#U;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.b[i];
}
}
Re=#U;
Rd=membh(Rs) Assembler mapped to: "Rd=membh(Rs+#0)"
Rd=membh(Rs+#s11:1) apply_extension(#s);
EA=Rs+#s;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.b[i];
}
}
Rd=membh(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.b[i];
}
}
Rd=membh(Rx++#s4:1) EA=Rx;
Rx=Rx+#s;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.b[i];
}
}
Rd=membh(Rx++#s4:1:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.b[i];
}
}
Rd=membh(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<1,MuV);
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.b[i];
}
}
Rd=membh(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.b[i];
}
}
Rd=membh(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.b[i];
}
}
Rd=memubh(Re=#U6) apply_extension(#U);
EA=#U;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.ub[i];
}
}
Re=#U;
Rd=memubh(Rs+#s11:1) apply_extension(#s);
EA=Rs+#s;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.ub[i];
}
}
Rd=memubh(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.ub[i];
}
}
Rd=memubh(Rx++#s4:1) EA=Rx;
Rx=Rx+#s;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.ub[i];
}
}
Rd=memubh(Rx++#s4:1:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.ub[i];
}
}
Rd=memubh(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<1,MuV);
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.ub[i];
}
}
Rd=memubh(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.ub[i];
}
}
Rd=memubh(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
{
tmpV = *EA;
for (i=0;i<2;i++) {
Rd.h[i]=tmpV.ub[i];
}
}
Rdd=membh(Re=#U6) apply_extension(#U);
EA=#U;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.b[i];
}
}
Re=#U;
Rdd=membh(Rs) Assembler mapped to: "Rdd=membh(Rs+#0)"
Rdd=membh(Rs+#s11:2) apply_extension(#s);
EA=Rs+#s;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.b[i];
}
}
Rdd=membh(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.b[i];
}
}
Rdd=membh(Rx++#s4:2) EA=Rx;
Rx=Rx+#s;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.b[i];
}
}
Rdd=membh(Rx++#s4:2:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.b[i];
}
}
Rdd=membh(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<2,MuV);
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.b[i];
}
}
Rdd=membh(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.b[i];
}
}
Rdd=membh(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.b[i];
}
}
Rdd=memubh(Re=#U6) apply_extension(#U);
EA=#U;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.ub[i];
}
}
Re=#U;
Rdd=memubh(Rs+#s11:2) apply_extension(#s);
EA=Rs+#s;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.ub[i];
}
}
Rdd=memubh(Rt<<#u2+#U6) apply_extension(#U);
EA=#U+(Rt<<#u);
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.ub[i];
}
}
Rdd=memubh(Rx++#s4:2) EA=Rx;
Rx=Rx+#s;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.ub[i];
}
}
Rdd=memubh(Rx++#s4:2:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,#s,MuV);
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.ub[i];
}
}
Rdd=memubh(Rx++I:circ(Mu)) EA=Rx;
Rx=circ_add(Rx,I<<2,MuV);
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.ub[i];
}
}
Rdd=memubh(Rx++Mu) EA=Rx;
Rx=Rx+MuV;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.ub[i];
}
}
Rdd=memubh(Rx++Mu:brev) EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
{
tmpV = *EA;
for (i=0;i<4;i++) {
Rdd.h[i]=tmpV.ub[i];
}
}
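The unpack loops in the tables above copy each loaded byte into its own halfword of the destination: membh sign-extends each byte (tmpV.b[i]) and memubh zero-extends it (tmpV.ub[i]). A minimal Python sketch follows, assuming little-endian memory held in a hypothetical bytes object; the helper names are illustrative.

```python
def unpack_bytes_to_halfwords(data: bytes, signed: bool) -> int:
    """Place byte i of data into halfword i of the result."""
    result = 0
    for i, byte in enumerate(data):
        value = byte - 0x100 if signed and byte & 0x80 else byte
        result |= (value & 0xFFFF) << (16 * i)
    return result


def membh(memory: bytes, ea: int) -> int:
    """Rd=membh(...): two bytes, each sign-extended to 16 bits."""
    return unpack_bytes_to_halfwords(memory[ea:ea + 2], signed=True)


def memubh(memory: bytes, ea: int) -> int:
    """Rd=memubh(...): two bytes, each zero-extended to 16 bits."""
    return unpack_bytes_to_halfwords(memory[ea:ea + 2], signed=False)


def membh_pair(memory: bytes, ea: int) -> int:
    """Rdd=membh(...): four bytes, each sign-extended to 16 bits."""
    return unpack_bytes_to_halfwords(memory[ea:ea + 4], signed=True)


def memubh_pair(memory: bytes, ea: int) -> int:
    """Rdd=memubh(...): four bytes, each zero-extended to 16 bits."""
    return unpack_bytes_to_halfwords(memory[ea:ea + 4], signed=False)
```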
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 i i 0 0 0 1 s s s s s P P i i i i i i i i i d d d d d Rd=membh(Rs+#s11:1)
1 0 0 1 0 i i 0 0 1 1 s s s s s P P i i i i i i i i i d d d d d Rd=memubh(Rs+#s11:1)
1 0 0 1 0 i i 0 1 0 1 s s s s s P P i i i i i i i i i d d d d d Rdd=memubh(Rs+#s11:2)
1 0 0 1 0 i i 0 1 1 1 s s s s s P P i i i i i i i i i d d d d d Rdd=membh(Rs+#s11:2)
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 0 0 0 0 0 1 x x x x x P P u 0 - - 0 i i i i d d d d d Rd=membh(Rx++#s4:1:circ(Mu))
1 0 0 1 1 0 0 0 0 0 1 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rd=membh(Rx++I:circ(Mu))
1 0 0 1 1 0 0 0 0 1 1 x x x x x P P u 0 - - 0 i i i i d d d d d Rd=memubh(Rx++#s4:1:circ(Mu))
1 0 0 1 1 0 0 0 0 1 1 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rd=memubh(Rx++I:circ(Mu))
1 0 0 1 1 0 0 0 1 0 1 x x x x x P P u 0 - - 0 i i i i d d d d d Rdd=memubh(Rx++#s4:2:circ(Mu))
1 0 0 1 1 0 0 0 1 0 1 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rdd=memubh(Rx++I:circ(Mu))
1 0 0 1 1 0 0 0 1 1 1 x x x x x P P u 0 - - 0 i i i i d d d d d Rdd=membh(Rx++#s4:2:circ(Mu))
1 0 0 1 1 0 0 0 1 1 1 x x x x x P P u 0 - - 1 - 0 - - d d d d d Rdd=membh(Rx++I:circ(Mu))
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 0 0 0 1 e e e e e P P 0 1 I I I I - I I d d d d d Rd=membh(Re=#U6)
ICLASS Amode Type UN x5 Parse d5
1 0 0 1 1 0 1 0 0 0 1 x x x x x P P 0 0 - - - i i i i d d d d d Rd=membh(Rx++#s4:1)
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 0 0 1 1 e e e e e P P 0 1 I I I I - I I d d d d d Rd=memubh(Re=#U6)
ICLASS Amode Type UN x5 Parse d5
1 0 0 1 1 0 1 0 0 1 1 x x x x x P P 0 0 - - - i i i i d d d d d Rd=memubh(Rx++#s4:1)
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 0 1 0 1 e e e e e P P 0 1 I I I I - I I d d d d d Rdd=memubh(Re=#U6)
ICLASS Amode Type UN x5 Parse d5
1 0 0 1 1 0 1 0 1 0 1 x x x x x P P 0 0 - - - i i i i d d d d d Rdd=memubh(Rx++#s4:2)
ICLASS Amode Type UN e5 Parse d5
1 0 0 1 1 0 1 0 1 1 1 e e e e e P P 0 1 I I I I - I I d d d d d Rdd=membh(Re=#U6)
ICLASS Amode Type UN x5 Parse d5
1 0 0 1 1 0 1 0 1 1 1 x x x x x P P 0 0 - - - i i i i d d d d d Rdd=membh(Rx++#s4:2)
ICLASS Amode Type UN t5 Parse d5
1 0 0 1 1 1 0 0 0 0 1 t t t t t P P i 1 I I I I i I I d d d d d Rd=membh(Rt<<#u2+#U6)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 1 0 0 0 0 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rd=membh(Rx++Mu)
1 0 0 1 1 1 0 0 0 1 1 t t t t t P P i 1 I I I I i I I d d d d d Rd=memubh(Rt<<#u2+#U6)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 1 0 0 0 1 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rd=memubh(Rx++Mu)
ICLASS Amode Type UN t5 Parse d5
1 0 0 1 1 1 0 0 1 0 1 t t t t t P P i 1 I I I I i I I d d d d d Rdd=memubh(Rt<<#u2+#U6)
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 1 0 0 1 0 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rdd=memubh(Rx++Mu)
ICLASS Amode Type UN t5 Parse d5
1 0 0 1 1 1 0 0 1 1 1 t t t t t P P i 1 I I I I i I I d d d d d Rdd=membh(Rt<<#u2+#U6)
ICLASS Amode Type UN x5 Parse u1 d5
1 0 0 1 1 1 0 0 1 1 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rdd=membh(Rx++Mu)
1 0 0 1 1 1 1 0 0 0 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rd=membh(Rx++Mu:brev)
1 0 0 1 1 1 1 0 0 1 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rd=memubh(Rx++Mu:brev)
1 0 0 1 1 1 1 0 1 0 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rdd=memubh(Rx++Mu:brev)
1 0 0 1 1 1 1 0 1 1 1 x x x x x P P u 0 - - - - 0 - - d d d d d Rdd=membh(Rx++Mu:brev)
11.6 MEMOP
The MEMOP instruction class includes simple operations on values in memory.
MEMOP instructions are executable on slot 0.
Syntax Behavior
memb(Rs+#u6:0)=clrbit(#U5) apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp &= (~(1<<#U));
*EA = tmp;
memb(Rs+#u6:0)=setbit(#U5) apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp |= (1<<#U);
*EA = tmp;
memb(Rs+#u6:0)[+-]=#U5 apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp [+-]= #U;
*EA = tmp;
memb(Rs+#u6:0)[+-|&]=Rt apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp [+-|&]= Rt;
*EA = tmp;
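The byte MEMOP forms above all follow the same read-modify-write pattern: load from EA=Rs+#u, apply the operation, and store the result back. A minimal Python sketch follows. The function name and the string operation selector are hypothetical, memory is a bytearray, and the result is assumed to be truncated to 8 bits on the store.

```python
def memop_byte(memory: bytearray, rs: int, u: int, op: str, operand: int) -> None:
    """Model of memb(Rs+#u6:0) <op> <operand> as a read-modify-write."""
    ea = rs + u
    tmp = memory[ea]
    if op == "clrbit":
        tmp &= ~(1 << operand)
    elif op == "setbit":
        tmp |= 1 << operand
    elif op == "+=":
        tmp += operand
    elif op == "-=":
        tmp -= operand
    elif op == "|=":
        tmp |= operand
    elif op == "&=":
        tmp &= operand
    else:
        raise ValueError("unsupported MEMOP: " + op)
    memory[ea] = tmp & 0xFF  # assumption: only the low 8 bits are stored
```

The halfword and word forms differ only in access size and in the scaling of the #u6 offset.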
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5
0 0 1 1 1 1 1 0 - 0 0 s s s s s P P 0 i i i i i i 0 0 t t t t t memb(Rs+#u6:0)+=Rt
0 0 1 1 1 1 1 0 - 0 0 s s s s s P P 0 i i i i i i 0 1 t t t t t memb(Rs+#u6:0)-=Rt
0 0 1 1 1 1 1 0 - 0 0 s s s s s P P 0 i i i i i i 1 0 t t t t t memb(Rs+#u6:0)&=Rt
0 0 1 1 1 1 1 0 - 0 0 s s s s s P P 0 i i i i i i 1 1 t t t t t memb(Rs+#u6:0)|=Rt
ICLASS s5 Parse
0 0 1 1 1 1 1 1 - 0 0 s s s s s P P 0 i i i i i i 0 0 I I I I I memb(Rs+#u6:0)+=#U5
0 0 1 1 1 1 1 1 - 0 0 s s s s s P P 0 i i i i i i 0 1 I I I I I memb(Rs+#u6:0)-=#U5
0 0 1 1 1 1 1 1 - 0 0 s s s s s P P 0 i i i i i i 1 0 I I I I I memb(Rs+#u6:0)=clrbit(#U5)
0 0 1 1 1 1 1 1 - 0 0 s s s s s P P 0 i i i i i i 1 1 I I I I I memb(Rs+#u6:0)=setbit(#U5)
Syntax Behavior
memh(Rs+#u6:1)=clrbit(#U5) apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp &= (~(1<<#U));
*EA = tmp;
memh(Rs+#u6:1)=setbit(#U5) apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp |= (1<<#U);
*EA = tmp;
memh(Rs+#u6:1)[+-]=#U5 apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp [+-]= #U;
*EA = tmp;
memh(Rs+#u6:1)[+-|&]=Rt apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp [+-|&]= Rt;
*EA = tmp;
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5
0 0 1 1 1 1 1 0 - 0 1 s s s s s P P 0 i i i i i i 0 0 t t t t t memh(Rs+#u6:1)+=Rt
0 0 1 1 1 1 1 0 - 0 1 s s s s s P P 0 i i i i i i 0 1 t t t t t memh(Rs+#u6:1)-=Rt
0 0 1 1 1 1 1 0 - 0 1 s s s s s P P 0 i i i i i i 1 0 t t t t t memh(Rs+#u6:1)&=Rt
0 0 1 1 1 1 1 0 - 0 1 s s s s s P P 0 i i i i i i 1 1 t t t t t memh(Rs+#u6:1)|=Rt
ICLASS s5 Parse
0 0 1 1 1 1 1 1 - 0 1 s s s s s P P 0 i i i i i i 0 0 I I I I I memh(Rs+#u6:1)+=#U5
0 0 1 1 1 1 1 1 - 0 1 s s s s s P P 0 i i i i i i 0 1 I I I I I memh(Rs+#u6:1)-=#U5
0 0 1 1 1 1 1 1 - 0 1 s s s s s P P 0 i i i i i i 1 0 I I I I I memh(Rs+#u6:1)=clrbit(#U5)
0 0 1 1 1 1 1 1 - 0 1 s s s s s P P 0 i i i i i i 1 1 I I I I I memh(Rs+#u6:1)=setbit(#U5)
Syntax Behavior
memw(Rs+#u6:2)=clrbit(#U5) apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp &= (~(1<<#U));
*EA = tmp;
memw(Rs+#u6:2)=setbit(#U5) apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp |= (1<<#U);
*EA = tmp;
memw(Rs+#u6:2)[+-]=#U5 apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp [+-]= #U;
*EA = tmp;
memw(Rs+#u6:2)[+-|&]=Rt apply_extension(#u);
EA=Rs+#u;
tmp = *EA;
tmp [+-|&]= Rt;
*EA = tmp;
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse t5
0 0 1 1 1 1 1 0 - 1 0 s s s s s P P 0 i i i i i i 0 0 t t t t t memw(Rs+#u6:2)+=Rt
0 0 1 1 1 1 1 0 - 1 0 s s s s s P P 0 i i i i i i 0 1 t t t t t memw(Rs+#u6:2)-=Rt
0 0 1 1 1 1 1 0 - 1 0 s s s s s P P 0 i i i i i i 1 0 t t t t t memw(Rs+#u6:2)&=Rt
0 0 1 1 1 1 1 0 - 1 0 s s s s s P P 0 i i i i i i 1 1 t t t t t memw(Rs+#u6:2)|=Rt
ICLASS s5 Parse
0 0 1 1 1 1 1 1 - 1 0 s s s s s P P 0 i i i i i i 0 0 I I I I I memw(Rs+#u6:2)+=#U5
0 0 1 1 1 1 1 1 - 1 0 s s s s s P P 0 i i i i i i 0 1 I I I I I memw(Rs+#u6:2)-=#U5
0 0 1 1 1 1 1 1 - 1 0 s s s s s P P 0 i i i i i i 1 0 I I I I I memw(Rs+#u6:2)=clrbit(#U5)
0 0 1 1 1 1 1 1 - 1 0 s s s s s P P 0 i i i i i i 1 1 I I I I I memw(Rs+#u6:2)=setbit(#U5)
11.7 NV
The NV instruction class includes instructions that take the register source operand from another
instruction in the same packet.
NV instructions are executable on slot 0.
11.7.1 NV J
The NV J instruction subclass includes jump instructions that take the register source operand
from another instruction in the same packet.
Syntax Behavior
if ([!]cmp.eq(Ns.new,#-1)) jump:<hint> #r9:2 if ((Ns.new[!]=(-1))) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]cmp.eq(Ns.new,#U5)) jump:<hint> #r9:2 if ((Ns.new[!]=(#U))) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]cmp.eq(Ns.new,Rt)) jump:<hint> #r9:2 if ((Ns.new[!]=Rt)) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]cmp.gt(Ns.new,#-1)) jump:<hint> #r9:2 if ([!](Ns.new>(-1))) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]cmp.gt(Ns.new,#U5)) jump:<hint> #r9:2 if ([!](Ns.new>(#U))) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]cmp.gt(Ns.new,Rt)) jump:<hint> #r9:2 if ([!](Ns.new>Rt)) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]cmp.gt(Rt,Ns.new)) jump:<hint> #r9:2 if ([!](Rt>Ns.new)) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]cmp.gtu(Ns.new,#U5)) jump:<hint> #r9:2 if ([!](Ns.new.uw[0]>(#U))) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]cmp.gtu(Ns.new,Rt)) jump:<hint> #r9:2 if ([!](Ns.new.uw[0]>Rt.uw[0])) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]cmp.gtu(Rt,Ns.new)) jump:<hint> #r9:2 if ([!](Rt.uw[0]>Ns.new.uw[0])) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
if ([!]tstbit(Ns.new,#0)) jump:<hint> #r9:2 if ([!]((Ns.new) & 1)) {
apply_extension(#r);
#r=#r & ~PCALIGN_MASK;
PC=PC+#r;
}
Class: NV (slots 0)
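The compare-and-jump behavior above can be modelled in a few lines of C. This is a sketch under stated assumptions: the enum and function names are hypothetical, PCALIGN_MASK is assumed to be 0x3 (4-byte instruction alignment), and constant extension of #r is not modelled. It mainly shows how the signed (cmp.gt) and unsigned (cmp.gtu) forms differ and how the target address is formed.

```c
#include <assert.h>
#include <stdint.h>

#define PCALIGN_MASK UINT32_C(0x3)   /* assumption: 4-byte instruction alignment */

typedef enum { NVJ_EQ, NVJ_GT, NVJ_GTU } NvjCmp;   /* hypothetical names */

/* Evaluate the branch condition; `negate` models the optional [!]. */
static int nvj_condition(NvjCmp op, int negate, uint32_t ns_new, uint32_t rt)
{
    int result = 0;
    assert(op == NVJ_EQ || op == NVJ_GT || op == NVJ_GTU);
    switch (op) {
    case NVJ_EQ:  result = (ns_new == rt); break;
    case NVJ_GT:  result = ((int32_t)ns_new > (int32_t)rt); break; /* signed */
    case NVJ_GTU: result = (ns_new > rt); break;                   /* unsigned */
    }
    return negate ? !result : result;
}

/* Target of a taken jump: #r=#r & ~PCALIGN_MASK; PC=PC+#r; */
static uint32_t nvj_target(uint32_t pc, int32_t r)
{
    r &= ~(int32_t)PCALIGN_MASK;
    return pc + (uint32_t)r;
}
```

With Ns.new holding 0xFFFFFFFF and Rt holding 1, `nvj_condition(NVJ_GT, ...)` is false because the value is -1 when read as signed, while `nvj_condition(NVJ_GTU, ...)` is true.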
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s3 Parse t5
0 0 1 0 0 0 0 0 0 0 i i - s s s P P 0 t t t t t i i i i i i i - if (cmp.eq(Ns.new,Rt)) jump:nt #r9:2
0 0 1 0 0 0 0 0 0 0 i i - s s s P P 1 t t t t t i i i i i i i - if (cmp.eq(Ns.new,Rt)) jump:t #r9:2
0 0 1 0 0 0 0 0 0 1 i i - s s s P P 0 t t t t t i i i i i i i - if (!cmp.eq(Ns.new,Rt)) jump:nt #r9:2
0 0 1 0 0 0 0 0 0 1 i i - s s s P P 1 t t t t t i i i i i i i - if (!cmp.eq(Ns.new,Rt)) jump:t #r9:2
0 0 1 0 0 0 0 0 1 0 i i - s s s P P 0 t t t t t i i i i i i i - if (cmp.gt(Ns.new,Rt)) jump:nt #r9:2
0 0 1 0 0 0 0 0 1 0 i i - s s s P P 1 t t t t t i i i i i i i - if (cmp.gt(Ns.new,Rt)) jump:t #r9:2
0 0 1 0 0 0 0 0 1 1 i i - s s s P P 0 t t t t t i i i i i i i - if (!cmp.gt(Ns.new,Rt)) jump:nt #r9:2
0 0 1 0 0 0 0 0 1 1 i i - s s s P P 1 t t t t t i i i i i i i - if (!cmp.gt(Ns.new,Rt)) jump:t #r9:2
0 0 1 0 0 0 0 1 0 0 i i - s s s P P 0 t t t t t i i i i i i i - if (cmp.gtu(Ns.new,Rt)) jump:nt #r9:2
0 0 1 0 0 0 0 1 0 0 i i - s s s P P 1 t t t t t i i i i i i i - if (cmp.gtu(Ns.new,Rt)) jump:t #r9:2
0 0 1 0 0 0 0 1 0 1 i i - s s s P P 0 t t t t t i i i i i i i - if (!cmp.gtu(Ns.new,Rt)) jump:nt #r9:2
0 0 1 0 0 0 0 1 0 1 i i - s s s P P 1 t t t t t i i i i i i i - if (!cmp.gtu(Ns.new,Rt)) jump:t #r9:2
0 0 1 0 0 0 0 1 1 0 i i - s s s P P 0 t t t t t i i i i i i i - if (cmp.gt(Rt,Ns.new)) jump:nt #r9:2
0 0 1 0 0 0 0 1 1 0 i i - s s s P P 1 t t t t t i i i i i i i - if (cmp.gt(Rt,Ns.new)) jump:t #r9:2
0 0 1 0 0 0 0 1 1 1 i i - s s s P P 0 t t t t t i i i i i i i - if (!cmp.gt(Rt,Ns.new)) jump:nt #r9:2
0 0 1 0 0 0 0 1 1 1 i i - s s s P P 1 t t t t t i i i i i i i - if (!cmp.gt(Rt,Ns.new)) jump:t #r9:2
0 0 1 0 0 0 1 0 0 0 i i - s s s P P 0 t t t t t i i i i i i i - if (cmp.gtu(Rt,Ns.new)) jump:nt #r9:2
0 0 1 0 0 0 1 0 0 0 i i - s s s P P 1 t t t t t i i i i i i i - if (cmp.gtu(Rt,Ns.new)) jump:t #r9:2
0 0 1 0 0 0 1 0 0 1 i i - s s s P P 0 t t t t t i i i i i i i - if (!cmp.gtu(Rt,Ns.new)) jump:nt #r9:2
0 0 1 0 0 0 1 0 0 1 i i - s s s P P 1 t t t t t i i i i i i i - if (!cmp.gtu(Rt,Ns.new)) jump:t #r9:2
ICLASS s3 Parse
0 0 1 0 0 1 0 0 0 0 i i - s s s P P 0 I I I I I i i i i i i i - if (cmp.eq(Ns.new,#U5)) jump:nt #r9:2
0 0 1 0 0 1 0 0 0 0 i i - s s s P P 1 I I I I I i i i i i i i - if (cmp.eq(Ns.new,#U5)) jump:t #r9:2
0 0 1 0 0 1 0 0 0 1 i i - s s s P P 0 I I I I I i i i i i i i - if (!cmp.eq(Ns.new,#U5)) jump:nt #r9:2
0 0 1 0 0 1 0 0 0 1 i i - s s s P P 1 I I I I I i i i i i i i - if (!cmp.eq(Ns.new,#U5)) jump:t #r9:2
0 0 1 0 0 1 0 0 1 0 i i - s s s P P 0 I I I I I i i i i i i i - if (cmp.gt(Ns.new,#U5)) jump:nt #r9:2
0 0 1 0 0 1 0 0 1 0 i i - s s s P P 1 I I I I I i i i i i i i - if (cmp.gt(Ns.new,#U5)) jump:t #r9:2
0 0 1 0 0 1 0 0 1 1 i i - s s s P P 0 I I I I I i i i i i i i - if (!cmp.gt(Ns.new,#U5)) jump:nt #r9:2
0 0 1 0 0 1 0 0 1 1 i i - s s s P P 1 I I I I I i i i i i i i - if (!cmp.gt(Ns.new,#U5)) jump:t #r9:2
0 0 1 0 0 1 0 1 0 0 i i - s s s P P 0 I I I I I i i i i i i i - if (cmp.gtu(Ns.new,#U5)) jump:nt #r9:2
0 0 1 0 0 1 0 1 0 0 i i - s s s P P 1 I I I I I i i i i i i i - if (cmp.gtu(Ns.new,#U5)) jump:t #r9:2
0 0 1 0 0 1 0 1 0 1 i i - s s s P P 0 I I I I I i i i i i i i - if (!cmp.gtu(Ns.new,#U5)) jump:nt #r9:2
0 0 1 0 0 1 0 1 0 1 i i - s s s P P 1 I I I I I i i i i i i i - if (!cmp.gtu(Ns.new,#U5)) jump:t #r9:2
0 0 1 0 0 1 0 1 1 0 i i - s s s P P 0 - - - - - i i i i i i i - if (tstbit(Ns.new,#0)) jump:nt #r9:2
0 0 1 0 0 1 0 1 1 0 i i - s s s P P 1 - - - - - i i i i i i i - if (tstbit(Ns.new,#0)) jump:t #r9:2
0 0 1 0 0 1 0 1 1 1 i i - s s s P P 0 - - - - - i i i i i i i - if (!tstbit(Ns.new,#0)) jump:nt #r9:2
0 0 1 0 0 1 0 1 1 1 i i - s s s P P 1 - - - - - i i i i i i i - if (!tstbit(Ns.new,#0)) jump:t #r9:2
0 0 1 0 0 1 1 0 0 0 i i - s s s P P 0 - - - - - i i i i i i i - if (cmp.eq(Ns.new,#-1)) jump:nt #r9:2
0 0 1 0 0 1 1 0 0 0 i i - s s s P P 1 - - - - - i i i i i i i - if (cmp.eq(Ns.new,#-1)) jump:t #r9:2
0 0 1 0 0 1 1 0 0 1 i i - s s s P P 0 - - - - - i i i i i i i - if (!cmp.eq(Ns.new,#-1)) jump:nt #r9:2
0 0 1 0 0 1 1 0 0 1 i i - s s s P P 1 - - - - - i i i i i i i - if (!cmp.eq(Ns.new,#-1)) jump:t #r9:2
0 0 1 0 0 1 1 0 1 0 i i - s s s P P 0 - - - - - i i i i i i i - if (cmp.gt(Ns.new,#-1)) jump:nt #r9:2
0 0 1 0 0 1 1 0 1 0 i i - s s s P P 1 - - - - - i i i i i i i - if (cmp.gt(Ns.new,#-1)) jump:t #r9:2
0 0 1 0 0 1 1 0 1 1 i i - s s s P P 0 - - - - - i i i i i i i - if (!cmp.gt(Ns.new,#-1)) jump:nt #r9:2
0 0 1 0 0 1 1 0 1 1 i i - s s s P P 1 - - - - - i i i i i i i - if (!cmp.gt(Ns.new,#-1)) jump:t #r9:2
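As an illustration of how the encoding rows above map onto instruction fields, the following C sketch extracts the fields of the `if ([!]cmp.eq(Ns.new,Rt)) jump:<hint> #r9:2` form from a 32-bit word. The struct and function names are hypothetical. The sketch assumes that the two `i` bits at positions 21:20 are the most-significant bits of the 9-bit immediate and that the seven `i` bits at positions 7:1 are the least-significant bits; the table itself does not state this ordering. The s3 field is returned raw, without interpreting how it identifies the producer of Ns.new.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned s3;        /* raw s3 field, bits 18:16 */
    unsigned t5;        /* Rt register number, bits 12:8 */
    int negate;         /* bit 22: 1 selects the !cmp.eq form */
    int hint_taken;     /* bit 13: 0 = jump:nt, 1 = jump:t */
    int32_t offset;     /* sign-extended #r9, scaled by 4 (the :2 suffix) */
} NvjCmpEqRt;

/* Returns 1 and fills *out if `w` matches the cmp.eq(Ns.new,Rt) rows. */
static int decode_nvj_cmp_eq_rt(uint32_t w, NvjCmpEqRt *out)
{
    assert(out != NULL);
    /* Bits 31:23 must be 0010 0000 0 (ICLASS 0010, then five zero bits). */
    if ((w >> 23) != 0x040u)
        return 0;
    out->negate     = (int)((w >> 22) & 1u);
    out->s3         = (w >> 16) & 0x7u;
    out->hint_taken = (int)((w >> 13) & 1u);
    out->t5         = (w >> 8) & 0x1Fu;

    unsigned hi2 = (w >> 20) & 0x3u;    /* assumed upper immediate bits */
    unsigned lo7 = (w >> 1) & 0x7Fu;    /* assumed lower immediate bits */
    int32_t imm9 = (int32_t)((hi2 << 7) | lo7);
    if (imm9 & 0x100)
        imm9 -= 0x200;                  /* sign-extend from 9 bits */
    out->offset = imm9 * 4;
    return 1;
}
```

The parse bits (15:14) and the don't-care bits (19 and 0) are ignored here; a complete decoder would also have to handle constant extenders.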
11.7.2 NV ST
The NV ST instruction subclass includes store instructions which take the register source operand
from another instruction in the same packet.
Syntax Behavior
memb(Re=#U6)=Nt.new apply_extension(#U);
EA=#U;
*EA = Nt.new.b[0];
Re=#U;
memb(Rs+#s11:0)=Nt.new apply_extension(#s);
EA=Rs+#s;
*EA = Nt.new.b[0];
memb(Rs+Ru<<#u2)=Nt.new EA=Rs+(Ru<<#u);
*EA = Nt.new.b[0];
memb(Ru<<#u2+#U6)=Nt.new apply_extension(#U);
EA=#U+(Ru<<#u);
*EA = Nt.new.b[0];
memb(Rx++#s4:0)=Nt.new EA=Rx;
Rx=Rx+#s;
*EA = Nt.new.b[0];
memb(Rx++#s4:0:circ(Mu))=Nt.new EA=Rx;
Rx=Rx=circ_add(Rx,#s,MuV);
*EA = Nt.new.b[0];
memb(Rx++I:circ(Mu))=Nt.new EA=Rx;
Rx=Rx=circ_add(Rx,I<<0,MuV);
*EA = Nt.new.b[0];
memb(Rx++Mu)=Nt.new EA=Rx;
Rx=Rx+MuV;
*EA = Nt.new.b[0];
memb(Rx++Mu:brev)=Nt.new EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
*EA = Nt.new.b[0];
memb(gp+#u16:0)=Nt.new apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
*EA = Nt.new.b[0];
Class: NV (slots 0)
Notes
■ Forms of this instruction that use a new-value operand produced in the packet must execute
on slot 0.
■ This instruction can execute only in slot 0, even though it is an ST instruction.
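The bit-reversed form `memb(Rx++Mu:brev)=Nt.new` computes its effective address as `Rx.h[1] | brev(Rx.h[0])`. A minimal C sketch of that addressing mode follows. It assumes that the upper halfword stays in the upper 16 bits of the address and that MuV can be treated as a plain increment; the function names and the flat `mem` array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the bit order of a 16-bit value. */
static uint16_t brev16(uint16_t v)
{
    uint16_t r = 0;
    for (int i = 0; i < 16; i++) {
        r = (uint16_t)((r << 1) | (v & 1u));
        v >>= 1;
    }
    return r;
}

/* EA=Rx.h[1] | brev(Rx.h[0]); upper halfword assumed to keep its position. */
static uint32_t brev_ea(uint32_t rx)
{
    return (rx & UINT32_C(0xFFFF0000)) | brev16((uint16_t)(rx & 0xFFFFu));
}

/* memb(Rx++Mu:brev)=Nt.new against a flat byte array. */
static void nv_store_byte_brev(uint8_t *mem, size_t mem_size,
                               uint32_t *rx, uint32_t mu_v, uint32_t nt_new)
{
    uint32_t ea = brev_ea(*rx);
    assert(ea < mem_size);              /* keep the sketch memory-safe */
    *rx = *rx + mu_v;                   /* Rx=Rx+MuV;          */
    mem[ea] = (uint8_t)(nt_new & 0xFFu);/* *EA = Nt.new.b[0];  */
}
```

With Rx holding 0x4000, the low halfword reverses to 0x0002, so the byte lands at address 2 while Rx itself advances by MuV.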
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t3
0 0 1 1 1 0 1 1 1 0 1 s s s s s P P i u u u u u i - - 0 0 t t t memb(Rs+Ru<<#u2)=Nt.new
ICLASS Type Parse t3
0 1 0 0 1 i i 0 1 0 1 i i i i i P P i 0 0 t t t i i i i i i i i memb(gp+#u16:0)=Nt.new
ICLASS Amode Type UN s5 Parse t3
1 0 1 0 0 i i 1 1 0 1 s s s s s P P i 0 0 t t t i i i i i i i i memb(Rs+#s11:0)=Nt.new
ICLASS Amode Type UN x5 Parse u1 t3
1 0 1 0 1 0 0 1 1 0 1 x x x x x P P u 0 0 t t t 0 - - - - - 1 - memb(Rx++I:circ(Mu))=Nt.new
1 0 1 0 1 0 0 1 1 0 1 x x x x x P P u 0 0 t t t 0 i i i i - 0 - memb(Rx++#s4:0:circ(Mu))=Nt.new
ICLASS Amode Type UN e5 Parse t3
1 0 1 0 1 0 1 1 1 0 1 e e e e e P P 0 0 0 t t t 1 - I I I I I I memb(Re=#U6)=Nt.new
ICLASS Amode Type UN x5 Parse t3
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 0 0 0 t t t 0 i i i i - 0 - memb(Rx++#s4:0)=Nt.new
ICLASS Amode Type UN u5 Parse t3
1 0 1 0 1 1 0 1 1 0 1 u u u u u P P i 0 0 t t t 1 i I I I I I I memb(Ru<<#u2+#U6)=Nt.new
Syntax Behavior
if ([!]Pv[.new]) memb(#u6)=Nt.new apply_extension(#u);
EA=#u;
if ([!]Pv[.new][0]) {
*EA = Nt[.new].b[0];
} else {
NOP;
}
if ([!]Pv[.new]) memb(Rs+#u6:0)=Nt.new apply_extension(#u);
EA=Rs+#u;
if ([!]Pv[.new][0]) {
*EA = Nt[.new].b[0];
} else {
NOP;
}
if ([!]Pv[.new]) memb(Rs+Ru<<#u2)=Nt.new EA=Rs+(Ru<<#u);
if ([!]Pv[.new][0]) {
*EA = Nt[.new].b[0];
} else {
NOP;
}
if ([!]Pv[.new]) memb(Rx++#s4:0)=Nt.new EA=Rx;
if ([!]Pv[.new][0]){
Rx=Rx+#s;
*EA = Nt[.new].b[0];
} else {
NOP;
}
Class: NV (slots 0)
Notes
■ Forms of this instruction which use a new-value operand produced in the packet must
execute on slot 0.
■ This instruction can execute only in slot 0, even though it is an ST instruction.
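The predicated post-increment form, `if ([!]Pv[.new]) memb(Rx++#s4:0)=Nt.new`, only updates Rx and memory when the predicate condition holds. The following C sketch models that behavior against a flat byte array. The function name and parameters are hypothetical, and the new-value forwarding within the packet is reduced to passing the value in as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the store executed, 0 if it behaved as a NOP.
 * `pv` is the predicate register value; only bit 0 is tested, as in
 * the Behavior column. `negate` models the optional [!]. */
static int pred_store_byte_postinc(uint8_t *mem, size_t size, uint32_t *rx,
                                   int32_t s, uint8_t pv, int negate,
                                   uint32_t nt_new)
{
    uint32_t ea = *rx;                      /* EA=Rx; */
    int cond = pv & 1;
    if (negate)
        cond = !cond;
    if (!cond)
        return 0;                           /* NOP: Rx and memory untouched */
    assert(ea < size);                      /* keep the sketch memory-safe */
    *rx = ea + (uint32_t)s;                 /* Rx=Rx+#s;            */
    mem[ea] = (uint8_t)(nt_new & 0xFFu);    /* *EA = Nt.new.b[0];   */
    return 1;
}
```

Note that when the predicate is false the address register is not advanced, which matters for loops that step through a buffer conditionally.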
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t3
0 0 1 1 0 1 0 0 1 0 1 s s s s s P P i u u u u u i v v 0 0 t t t if (Pv) memb(Rs+Ru<<#u2)=Nt.new
0 0 1 1 0 1 0 1 1 0 1 s s s s s P P i u u u u u i v v 0 0 t t t if (!Pv) memb(Rs+Ru<<#u2)=Nt.new
0 0 1 1 0 1 1 0 1 0 1 s s s s s P P i u u u u u i v v 0 0 t t t if (Pv.new) memb(Rs+Ru<<#u2)=Nt.new
0 0 1 1 0 1 1 1 1 0 1 s s s s s P P i u u u u u i v v 0 0 t t t if (!Pv.new) memb(Rs+Ru<<#u2)=Nt.new
ICLASS Sense PredNew Type s5 Parse t3
0 1 0 0 0 0 0 0 1 0 1 s s s s s P P i 0 0 t t t i i i i i 0 v v if (Pv) memb(Rs+#u6:0)=Nt.new
0 1 0 0 0 0 1 0 1 0 1 s s s s s P P i 0 0 t t t i i i i i 0 v v if (Pv.new) memb(Rs+#u6:0)=Nt.new
0 1 0 0 0 1 0 0 1 0 1 s s s s s P P i 0 0 t t t i i i i i 0 v v if (!Pv) memb(Rs+#u6:0)=Nt.new
0 1 0 0 0 1 1 0 1 0 1 s s s s s P P i 0 0 t t t i i i i i 0 v v if (!Pv.new) memb(Rs+#u6:0)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 0 0 t t t 0 i i i i 0 v v if (Pv) memb(Rx++#s4:0)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 0 0 t t t 0 i i i i 1 v v if (!Pv) memb(Rx++#s4:0)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 0 0 t t t 1 i i i i 0 v v if (Pv.new) memb(Rx++#s4:0)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 0 0 t t t 1 i i i i 1 v v if (!Pv.new) memb(Rx++#s4:0)=Nt.new
ICLASS Amode Type UN Parse t3
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 0 0 0 t t t 1 i i i i 0 v v if (Pv) memb(#u6)=Nt.new
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 0 0 0 t t t 1 i i i i 1 v v if (!Pv) memb(#u6)=Nt.new
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 1 0 0 t t t 1 i i i i 0 v v if (Pv.new) memb(#u6)=Nt.new
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 1 0 0 t t t 1 i i i i 1 v v if (!Pv.new) memb(#u6)=Nt.new
Syntax Behavior
memh(Re=#U6)=Nt.new apply_extension(#U);
EA=#U;
*EA = Nt.new.h[0];
Re=#U;
memh(Rs+#s11:1)=Nt.new apply_extension(#s);
EA=Rs+#s;
*EA = Nt.new.h[0];
memh(Rs+Ru<<#u2)=Nt.new EA=Rs+(Ru<<#u);
*EA = Nt.new.h[0];
memh(Ru<<#u2+#U6)=Nt.new apply_extension(#U);
EA=#U+(Ru<<#u);
*EA = Nt.new.h[0];
memh(Rx++#s4:1)=Nt.new EA=Rx;
Rx=Rx+#s;
*EA = Nt.new.h[0];
memh(Rx++#s4:1:circ(Mu))=Nt.new EA=Rx;
Rx=Rx=circ_add(Rx,#s,MuV);
*EA = Nt.new.h[0];
memh(Rx++I:circ(Mu))=Nt.new EA=Rx;
Rx=Rx=circ_add(Rx,I<<1,MuV);
*EA = Nt.new.h[0];
memh(Rx++Mu)=Nt.new EA=Rx;
Rx=Rx+MuV;
*EA = Nt.new.h[0];
memh(Rx++Mu:brev)=Nt.new EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
*EA = Nt.new.h[0];
memh(gp+#u16:1)=Nt.new apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
*EA = Nt.new.h[0];
Class: NV (slots 0)
Notes
■ Forms of this instruction that use a new-value operand produced in the packet must execute
on slot 0.
■ This instruction can execute only in slot 0, even though it is an ST instruction.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t3
0 0 1 1 1 0 1 1 1 0 1 s s s s s P P i u u u u u i - - 0 1 t t t memh(Rs+Ru<<#u2)=Nt.new
ICLASS Type Parse t3
0 1 0 0 1 i i 0 1 0 1 i i i i i P P i 0 1 t t t i i i i i i i i memh(gp+#u16:1)=Nt.new
ICLASS Amode Type UN s5 Parse t3
1 0 1 0 0 i i 1 1 0 1 s s s s s P P i 0 1 t t t i i i i i i i i memh(Rs+#s11:1)=Nt.new
ICLASS Amode Type UN x5 Parse u1 t3
1 0 1 0 1 0 0 1 1 0 1 x x x x x P P u 0 1 t t t 0 - - - - - 1 - memh(Rx++I:circ(Mu))=Nt.new
1 0 1 0 1 0 0 1 1 0 1 x x x x x P P u 0 1 t t t 0 i i i i - 0 - memh(Rx++#s4:1:circ(Mu))=Nt.new
ICLASS Amode Type UN e5 Parse t3
1 0 1 0 1 0 1 1 1 0 1 e e e e e P P 0 0 1 t t t 1 - I I I I I I memh(Re=#U6)=Nt.new
ICLASS Amode Type UN x5 Parse t3
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 0 0 1 t t t 0 i i i i - 0 - memh(Rx++#s4:1)=Nt.new
ICLASS Amode Type UN u5 Parse t3
1 0 1 0 1 1 0 1 1 0 1 u u u u u P P i 0 1 t t t 1 i I I I I I I memh(Ru<<#u2+#U6)=Nt.new
Syntax Behavior
if ([!]Pv[.new]) memh(#u6)=Nt.new apply_extension(#u);
EA=#u;
if ([!]Pv[.new][0]) {
*EA = Nt[.new].h[0];
} else {
NOP;
}
if ([!]Pv[.new]) memh(Rs+#u6:1)=Nt.new apply_extension(#u);
EA=Rs+#u;
if ([!]Pv[.new][0]) {
*EA = Nt[.new].h[0];
} else {
NOP;
}
if ([!]Pv[.new]) memh(Rs+Ru<<#u2)=Nt.new EA=Rs+(Ru<<#u);
if ([!]Pv[.new][0]) {
*EA = Nt[.new].h[0];
} else {
NOP;
}
if ([!]Pv[.new]) memh(Rx++#s4:1)=Nt.new EA=Rx;
if ([!]Pv[.new][0]){
Rx=Rx+#s;
*EA = Nt[.new].h[0];
} else {
NOP;
}
Class: NV (slots 0)
Notes
■ Forms of this instruction that use a new-value operand produced in the packet must execute
on slot 0.
■ This instruction can execute only in slot 0, even though it is an ST instruction.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t3
0 0 1 1 0 1 0 0 1 0 1 s s s s s P P i u u u u u i v v 0 1 t t t if (Pv) memh(Rs+Ru<<#u2)=Nt.new
0 0 1 1 0 1 0 1 1 0 1 s s s s s P P i u u u u u i v v 0 1 t t t if (!Pv) memh(Rs+Ru<<#u2)=Nt.new
0 0 1 1 0 1 1 0 1 0 1 s s s s s P P i u u u u u i v v 0 1 t t t if (Pv.new) memh(Rs+Ru<<#u2)=Nt.new
0 0 1 1 0 1 1 1 1 0 1 s s s s s P P i u u u u u i v v 0 1 t t t if (!Pv.new) memh(Rs+Ru<<#u2)=Nt.new
ICLASS Sense PredNew Type s5 Parse t3
0 1 0 0 0 0 0 0 1 0 1 s s s s s P P i 0 1 t t t i i i i i 0 v v if (Pv) memh(Rs+#u6:1)=Nt.new
0 1 0 0 0 0 1 0 1 0 1 s s s s s P P i 0 1 t t t i i i i i 0 v v if (Pv.new) memh(Rs+#u6:1)=Nt.new
0 1 0 0 0 1 0 0 1 0 1 s s s s s P P i 0 1 t t t i i i i i 0 v v if (!Pv) memh(Rs+#u6:1)=Nt.new
0 1 0 0 0 1 1 0 1 0 1 s s s s s P P i 0 1 t t t i i i i i 0 v v if (!Pv.new) memh(Rs+#u6:1)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 0 1 t t t 0 i i i i 0 v v if (Pv) memh(Rx++#s4:1)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 0 1 t t t 0 i i i i 1 v v if (!Pv) memh(Rx++#s4:1)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 0 1 t t t 1 i i i i 0 v v if (Pv.new) memh(Rx++#s4:1)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 0 1 t t t 1 i i i i 1 v v if (!Pv.new) memh(Rx++#s4:1)=Nt.new
ICLASS Amode Type UN Parse t3
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 0 0 1 t t t 1 i i i i 0 v v if (Pv) memh(#u6)=Nt.new
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 0 0 1 t t t 1 i i i i 1 v v if (!Pv) memh(#u6)=Nt.new
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 1 0 1 t t t 1 i i i i 0 v v if (Pv.new) memh(#u6)=Nt.new
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 1 0 1 t t t 1 i i i i 1 v v if (!Pv.new) memh(#u6)=Nt.new
Class: NV (slots 0)
Notes
■ Forms of this instruction that use a new-value operand produced in the packet must execute
on slot 0.
■ This instruction can execute only in slot 0, even though it is an ST instruction.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t3
0 0 1 1 1 0 1 1 1 0 1 s s s s s P P i u u u u u i - - 1 0 t t t memw(Rs+Ru<<#u2)=Nt.new
ICLASS Type Parse t3
0 1 0 0 1 i i 0 1 0 1 i i i i i P P i 1 0 t t t i i i i i i i i memw(gp+#u16:2)=Nt.new
ICLASS Amode Type UN s5 Parse t3
1 0 1 0 0 i i 1 1 0 1 s s s s s P P i 1 0 t t t i i i i i i i i memw(Rs+#s11:2)=Nt.new
ICLASS Amode Type UN x5 Parse u1 t3
1 0 1 0 1 0 0 1 1 0 1 x x x x x P P u 1 0 t t t 0 - - - - - 1 - memw(Rx++I:circ(Mu))=Nt.new
1 0 1 0 1 0 0 1 1 0 1 x x x x x P P u 1 0 t t t 0 i i i i - 0 - memw(Rx++#s4:2:circ(Mu))=Nt.new
ICLASS Amode Type UN e5 Parse t3
1 0 1 0 1 0 1 1 1 0 1 e e e e e P P 0 1 0 t t t 1 - I I I I I I memw(Re=#U6)=Nt.new
ICLASS Amode Type UN x5 Parse t3
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 0 1 0 t t t 0 i i i i - 0 - memw(Rx++#s4:2)=Nt.new
ICLASS Amode Type UN u5 Parse t3
1 0 1 0 1 1 0 1 1 0 1 u u u u u P P i 1 0 t t t 1 i I I I I I I memw(Ru<<#u2+#U6)=Nt.new
Class: NV (slots 0)
Notes
■ Forms of this instruction that use a new-value operand produced in the packet must execute
on slot 0.
■ This instruction can execute only in slot 0, even though it is an ST instruction.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t3
0 0 1 1 0 1 0 0 1 0 1 s s s s s P P i u u u u u i v v 1 0 t t t if (Pv) memw(Rs+Ru<<#u2)=Nt.new
0 0 1 1 0 1 0 1 1 0 1 s s s s s P P i u u u u u i v v 1 0 t t t if (!Pv) memw(Rs+Ru<<#u2)=Nt.new
0 0 1 1 0 1 1 0 1 0 1 s s s s s P P i u u u u u i v v 1 0 t t t if (Pv.new) memw(Rs+Ru<<#u2)=Nt.new
0 0 1 1 0 1 1 1 1 0 1 s s s s s P P i u u u u u i v v 1 0 t t t if (!Pv.new) memw(Rs+Ru<<#u2)=Nt.new
ICLASS Sense PredNew Type s5 Parse t3
0 1 0 0 0 0 0 0 1 0 1 s s s s s P P i 1 0 t t t i i i i i 0 v v if (Pv) memw(Rs+#u6:2)=Nt.new
0 1 0 0 0 0 1 0 1 0 1 s s s s s P P i 1 0 t t t i i i i i 0 v v if (Pv.new) memw(Rs+#u6:2)=Nt.new
0 1 0 0 0 1 0 0 1 0 1 s s s s s P P i 1 0 t t t i i i i i 0 v v if (!Pv) memw(Rs+#u6:2)=Nt.new
0 1 0 0 0 1 1 0 1 0 1 s s s s s P P i 1 0 t t t i i i i i 0 v v if (!Pv.new) memw(Rs+#u6:2)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 1 0 t t t 0 i i i i 0 v v if (Pv) memw(Rx++#s4:2)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 1 0 t t t 0 i i i i 1 v v if (!Pv) memw(Rx++#s4:2)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 1 0 t t t 1 i i i i 0 v v if (Pv.new) memw(Rx++#s4:2)=Nt.new
1 0 1 0 1 0 1 1 1 0 1 x x x x x P P 1 1 0 t t t 1 i i i i 1 v v if (!Pv.new) memw(Rx++#s4:2)=Nt.new
ICLASS Amode Type UN Parse t3
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 0 1 0 t t t 1 i i i i 0 v v if (Pv) memw(#u6)=Nt.new
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 0 1 0 t t t 1 i i i i 1 v v if (!Pv) memw(#u6)=Nt.new
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 1 1 0 t t t 1 i i i i 0 v v if (Pv.new) memw(#u6)=Nt.new
1 0 1 0 1 1 1 1 1 0 1 - - - i i P P 1 1 0 t t t 1 i i i i 1 v v if (!Pv.new) memw(#u6)=Nt.new
11.8 ST
The ST instruction class includes store instructions, used to store values in memory.
ST instructions are executable on slot 0 and slot 1.
Store doubleword
Store a 64-bit register pair in memory at the effective address.
Syntax Behavior
memd(Re=#U6)=Rtt apply_extension(#U);
EA=#U;
*EA = Rtt;
Re=#U;
memd(Rs+#s11:3)=Rtt apply_extension(#s);
EA=Rs+#s;
*EA = Rtt;
memd(Rs+Ru<<#u2)=Rtt EA=Rs+(Ru<<#u);
*EA = Rtt;
memd(Ru<<#u2+#U6)=Rtt apply_extension(#U);
EA=#U+(Ru<<#u);
*EA = Rtt;
memd(Rx++#s4:3)=Rtt EA=Rx;
Rx=Rx+#s;
*EA = Rtt;
memd(Rx++#s4:3:circ(Mu))=Rtt EA=Rx;
Rx=Rx=circ_add(Rx,#s,MuV);
*EA = Rtt;
memd(Rx++I:circ(Mu))=Rtt EA=Rx;
Rx=Rx=circ_add(Rx,I<<3,MuV);
*EA = Rtt;
memd(Rx++Mu)=Rtt EA=Rx;
Rx=Rx+MuV;
*EA = Rtt;
memd(Rx++Mu:brev)=Rtt EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
*EA = Rtt;
memd(gp+#u16:3)=Rtt apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
*EA = Rtt;
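The immediate notation used in the syntax above, such as `#s11:3` or `#s4:3`, denotes a signed field of the given width whose value is scaled by the number of low-order zero bits after the colon. The following C sketch checks whether a byte offset fits such a field without a constant extender. The function name is hypothetical and the sketch covers signed fields only.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if `value` can be encoded as a signed `bits`-wide field
 * scaled by 2^shift (for example bits=4, shift=3 for #s4:3). */
static int fits_scaled_signed(int32_t value, unsigned bits, unsigned shift)
{
    assert(bits >= 1 && bits + shift <= 31);
    int32_t unit = (int32_t)1 << shift;
    if (value % unit != 0)
        return 0;                           /* not a multiple of the scale */
    int32_t field = value / unit;
    int32_t lo = -((int32_t)1 << (bits - 1));
    int32_t hi = ((int32_t)1 << (bits - 1)) - 1;
    return field >= lo && field <= hi;
}
```

Under these assumptions the post-increment of `memd(Rx++#s4:3)=Rtt` spans -64 to 56 bytes in steps of 8, and the offset of `memd(Rs+#s11:3)=Rtt` spans -8192 to 8184 bytes.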
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t5
0 0 1 1 1 0 1 1 1 1 0 s s s s s P P i u u u u u i - - t t t t t memd(Rs+Ru<<#u2)=Rtt
ICLASS Type Parse t5
0 1 0 0 1 i i 0 1 1 0 i i i i i P P i t t t t t i i i i i i i i memd(gp+#u16:3)=Rtt
1 0 1 0 1 0 0 1 1 1 0 x x x x x P P u t t t t t 0 - - - - - 1 - memd(Rx++I:circ(Mu))=Rtt
1 0 1 0 1 0 0 1 1 1 0 x x x x x P P u t t t t t 0 i i i i - 0 - memd(Rx++#s4:3:circ(Mu))=Rtt
ICLASS Amode Type UN e5 Parse t5
1 0 1 0 1 0 1 1 1 1 0 e e e e e P P 0 t t t t t 1 - I I I I I I memd(Re=#U6)=Rtt
Store-release doubleword
Store a 64-bit register pair in memory at the effective address. The store-release memory
operation is observed after all preceding memory operations have been observed at the local
point of serialization. A different order may be observed at the global point of serialization (see
Ordering and Synchronization).
When the :st (same domain) option is specified, the preceding memory operations are those that
were committed on any thread with the same consistency domain before this instruction was
committed.
When the :at (all threads) option is specified, the preceding memory operations are those that
were committed on any thread before this instruction was committed.
The store-release address is limited to certain memory regions. The following regions are
excluded: AHB memory space; AXI M2 memory space; the Hexagon memory cut-out, except for
addressable TCM and VTCM memory; and memory with CCCC type 2, 3, or 4.
The :st option does not apply to cache operations by index or to global cache operations. For
vector operations, the :st option does not apply a consistency domain; it uses a per-hardware-thread
ordering scope instead.
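As a rough analogy only, the ordering guarantee of a store-release resembles a C11 release store: writes made before the release become visible to a reader that observes the released value with an acquire load. The sketch below uses standard C11 atomics and hypothetical variable and function names. It does not model the :at and :st scopes, the points of serialization, or the excluded memory regions described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t payload;              /* ordinary data written before the release */
static _Atomic uint64_t ready_flag;   /* location written by the release store */

/* Writer: the payload write is ordered before the release store. */
static void publish(uint64_t value)
{
    payload = value;
    atomic_store_explicit(&ready_flag, 1, memory_order_release);
}

/* Reader: returns 1 and copies the payload once the flag is observed. */
static int try_consume(uint64_t *out)
{
    assert(out != NULL);
    if (atomic_load_explicit(&ready_flag, memory_order_acquire) == 0)
        return 0;
    *out = payload;
    return 1;
}
```

In a multithreaded program, one thread would call `publish` and another would poll `try_consume`; a reader that sees the flag set is guaranteed to see the payload as well.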
Syntax Behavior
memd_rl(Rs):at=Rtt EA=Rs;
*EA = Rtt;
memd_rl(Rs):st=Rtt EA=Rs;
*EA = Rtt;
Class: ST (slots 0)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse t5 d2
1 0 1 0 0 0 0 0 1 1 1 s s s s s P P 0 t t t t t - - 0 0 1 0 d d memd_rl(Rs):at=Rtt
1 0 1 0 0 0 0 0 1 1 1 s s s s s P P 0 t t t t t - - 1 0 1 0 d d memd_rl(Rs):st=Rtt
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t5
0 0 1 1 0 1 0 0 1 1 0 s s s s s P P i u u u u u i v v t t t t t if (Pv) memd(Rs+Ru<<#u2)=Rtt
0 0 1 1 0 1 0 1 1 1 0 s s s s s P P i u u u u u i v v t t t t t if (!Pv) memd(Rs+Ru<<#u2)=Rtt
0 0 1 1 0 1 1 0 1 1 0 s s s s s P P i u u u u u i v v t t t t t if (Pv.new) memd(Rs+Ru<<#u2)=Rtt
0 0 1 1 0 1 1 1 1 1 0 s s s s s P P i u u u u u i v v t t t t t if (!Pv.new) memd(Rs+Ru<<#u2)=Rtt
ICLASS Sense PredNew Type s5 Parse t5
0 1 0 0 0 0 0 0 1 1 0 s s s s s P P i t t t t t i i i i i 0 v v if (Pv) memd(Rs+#u6:3)=Rtt
0 1 0 0 0 0 1 0 1 1 0 s s s s s P P i t t t t t i i i i i 0 v v if (Pv.new) memd(Rs+#u6:3)=Rtt
0 1 0 0 0 1 0 0 1 1 0 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv) memd(Rs+#u6:3)=Rtt
0 1 0 0 0 1 1 0 1 1 0 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv.new) memd(Rs+#u6:3)=Rtt
ICLASS Amode Type UN x5 Parse t5
1 0 1 0 1 0 1 1 1 1 0 x x x x x P P 1 t t t t t 0 i i i i 0 v v if (Pv) memd(Rx++#s4:3)=Rtt
1 0 1 0 1 0 1 1 1 1 0 x x x x x P P 1 t t t t t 0 i i i i 1 v v if (!Pv) memd(Rx++#s4:3)=Rtt
1 0 1 0 1 0 1 1 1 1 0 x x x x x P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memd(Rx++#s4:3)=Rtt
1 0 1 0 1 0 1 1 1 1 0 x x x x x P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memd(Rx++#s4:3)=Rtt
ICLASS Amode Type UN Parse t5
1 0 1 0 1 1 1 1 1 1 0 - - - i i P P 0 t t t t t 1 i i i i 0 v v if (Pv) memd(#u6)=Rtt
1 0 1 0 1 1 1 1 1 1 0 - - - i i P P 0 t t t t t 1 i i i i 1 v v if (!Pv) memd(#u6)=Rtt
1 0 1 0 1 1 1 1 1 1 0 - - - i i P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memd(#u6)=Rtt
1 0 1 0 1 1 1 1 1 1 0 - - - i i P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memd(#u6)=Rtt
Store byte
Store the least-significant byte in a source register at the effective address.
Syntax Behavior
memb(Re=#U6)=Rt apply_extension(#U);
EA=#U;
*EA = Rt.b[0];
Re=#U;
memb(Rs+#s11:0)=Rt apply_extension(#s);
EA=Rs+#s;
*EA = Rt.b[0];
memb(Rs+#u6:0)=#S8 EA=Rs+#u;
apply_extension(#S);
*EA = #S;
memb(Rs+Ru<<#u2)=Rt EA=Rs+(Ru<<#u);
*EA = Rt.b[0];
memb(Ru<<#u2+#U6)=Rt apply_extension(#U);
EA=#U+(Ru<<#u);
*EA = Rt.b[0];
memb(Rx++#s4:0)=Rt EA=Rx;
Rx=Rx+#s;
*EA = Rt.b[0];
memb(Rx++#s4:0:circ(Mu))=Rt EA=Rx;
Rx=Rx=circ_add(Rx,#s,MuV);
*EA = Rt.b[0];
memb(Rx++I:circ(Mu))=Rt EA=Rx;
Rx=Rx=circ_add(Rx,I<<0,MuV);
*EA = Rt.b[0];
memb(Rx++Mu)=Rt EA=Rx;
Rx=Rx+MuV;
*EA = Rt.b[0];
memb(Rx++Mu:brev)=Rt EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
*EA = Rt.b[0];
memb(gp+#u16:0)=Rt apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
*EA = Rt.b[0];
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t5
0 0 1 1 1 0 1 1 0 0 0 s s s s s P P i u u u u u i - - t t t t t memb(Rs+Ru<<#u2)=Rt
ICLASS s5 Parse
0 0 1 1 1 1 0 - - 0 0 s s s s s P P I i i i i i i I I I I I I I memb(Rs+#u6:0)=#S8
ICLASS Type Parse t5
0 1 0 0 1 i i 0 0 0 0 i i i i i P P i t t t t t i i i i i i i i memb(gp+#u16:0)=Rt
ICLASS Amode Type UN s5 Parse t5
1 0 1 0 0 i i 1 0 0 0 s s s s s P P i t t t t t i i i i i i i i memb(Rs+#s11:0)=Rt
ICLASS Amode Type UN x5 Parse u1 t5
1 0 1 0 1 0 0 1 0 0 0 x x x x x P P u t t t t t 0 - - - - - 1 - memb(Rx++I:circ(Mu))=Rt
1 0 1 0 1 0 0 1 0 0 0 x x x x x P P u t t t t t 0 i i i i - 0 - memb(Rx++#s4:0:circ(Mu))=Rt
ICLASS Amode Type UN e5 Parse t5
1 0 1 0 1 0 1 1 0 0 0 e e e e e P P 0 t t t t t 1 - I I I I I I memb(Re=#U6)=Rt
ICLASS Amode Type UN x5 Parse t5
1 0 1 0 1 0 1 1 0 0 0 x x x x x P P 0 t t t t t 0 i i i i - 0 - memb(Rx++#s4:0)=Rt
ICLASS Amode Type UN u5 Parse t5
1 0 1 0 1 1 0 1 0 0 0 u u u u u P P i t t t t t 1 i I I I I I I memb(Ru<<#u2+#U6)=Rt
ICLASS Amode Type UN x5 Parse u1 t5
1 0 1 0 1 1 0 1 0 0 0 x x x x x P P u t t t t t 0 - - - - - - - memb(Rx++Mu)=Rt
1 0 1 0 1 1 1 1 0 0 0 x x x x x P P u t t t t t 0 - - - - - - - memb(Rx++Mu:brev)=Rt
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t5
0 0 1 1 0 1 0 0 0 0 0 s s s s s P P i u u u u u i v v t t t t t if (Pv) memb(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 0 1 0 0 0 s s s s s P P i u u u u u i v v t t t t t if (!Pv) memb(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 1 0 0 0 0 s s s s s P P i u u u u u i v v t t t t t if (Pv.new) memb(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 1 1 0 0 0 s s s s s P P i u u u u u i v v t t t t t if (!Pv.new) memb(Rs+Ru<<#u2)=Rt
ICLASS s5 Parse
0 0 1 1 1 0 0 0 0 0 0 s s s s s P P I i i i i i i v v I I I I I if (Pv) memb(Rs+#u6:0)=#S6
0 0 1 1 1 0 0 0 1 0 0 s s s s s P P I i i i i i i v v I I I I I if (!Pv) memb(Rs+#u6:0)=#S6
0 0 1 1 1 0 0 1 0 0 0 s s s s s P P I i i i i i i v v I I I I I if (Pv.new) memb(Rs+#u6:0)=#S6
0 0 1 1 1 0 0 1 1 0 0 s s s s s P P I i i i i i i v v I I I I I if (!Pv.new) memb(Rs+#u6:0)=#S6
ICLASS Sense PredNew Type s5 Parse t5
0 1 0 0 0 0 0 0 0 0 0 s s s s s P P i t t t t t i i i i i 0 v v if (Pv) memb(Rs+#u6:0)=Rt
0 1 0 0 0 0 1 0 0 0 0 s s s s s P P i t t t t t i i i i i 0 v v if (Pv.new) memb(Rs+#u6:0)=Rt
0 1 0 0 0 1 0 0 0 0 0 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv) memb(Rs+#u6:0)=Rt
0 1 0 0 0 1 1 0 0 0 0 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv.new) memb(Rs+#u6:0)=Rt
ICLASS Amode Type UN x5 Parse t5
1 0 1 0 1 0 1 1 0 0 0 x x x x x P P 1 t t t t t 0 i i i i 0 v v if (Pv) memb(Rx++#s4:0)=Rt
1 0 1 0 1 0 1 1 0 0 0 x x x x x P P 1 t t t t t 0 i i i i 1 v v if (!Pv) memb(Rx++#s4:0)=Rt
1 0 1 0 1 0 1 1 0 0 0 x x x x x P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memb(Rx++#s4:0)=Rt
1 0 1 0 1 0 1 1 0 0 0 x x x x x P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memb(Rx++#s4:0)=Rt
ICLASS Amode Type UN Parse t5
1 0 1 0 1 1 1 1 0 0 0 - - - i i P P 0 t t t t t 1 i i i i 0 v v if (Pv) memb(#u6)=Rt
1 0 1 0 1 1 1 1 0 0 0 - - - i i P P 0 t t t t t 1 i i i i 1 v v if (!Pv) memb(#u6)=Rt
1 0 1 0 1 1 1 1 0 0 0 - - - i i P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memb(#u6)=Rt
1 0 1 0 1 1 1 1 0 0 0 - - - i i P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memb(#u6)=Rt
Store halfword
Store the upper or lower 16 bits of a source register at the effective address.
Syntax Behavior
memh(Re=#U6)=Rt.H apply_extension(#U);
EA=#U;
*EA = Rt.h[1];
Re=#U;
memh(Re=#U6)=Rt apply_extension(#U);
EA=#U;
*EA = Rt.h[0];
Re=#U;
memh(Rs+#s11:1)=Rt.H apply_extension(#s);
EA=Rs+#s;
*EA = Rt.h[1];
memh(Rs+#s11:1)=Rt apply_extension(#s);
EA=Rs+#s;
*EA = Rt.h[0];
memh(Rs+#u6:1)=#S8 EA=Rs+#u;
apply_extension(#S);
*EA = #S;
memh(Rs+Ru<<#u2)=Rt.H EA=Rs+(Ru<<#u);
*EA = Rt.h[1];
memh(Rs+Ru<<#u2)=Rt EA=Rs+(Ru<<#u);
*EA = Rt.h[0];
memh(Ru<<#u2+#U6)=Rt.H apply_extension(#U);
EA=#U+(Ru<<#u);
*EA = Rt.h[1];
memh(Ru<<#u2+#U6)=Rt apply_extension(#U);
EA=#U+(Ru<<#u);
*EA = Rt.h[0];
memh(Rx++#s4:1)=Rt.H EA=Rx;
Rx=Rx+#s;
*EA = Rt.h[1];
memh(Rx++#s4:1)=Rt EA=Rx;
Rx=Rx+#s;
*EA = Rt.h[0];
memh(Rx++#s4:1:circ(Mu))=Rt.H EA=Rx;
Rx=Rx=circ_add(Rx,#s,MuV);
*EA = Rt.h[1];
memh(Rx++#s4:1:circ(Mu))=Rt EA=Rx;
Rx=Rx=circ_add(Rx,#s,MuV);
*EA = Rt.h[0];
memh(Rx++I:circ(Mu))=Rt.H EA=Rx;
Rx=Rx=circ_add(Rx,I<<1,MuV);
*EA = Rt.h[1];
memh(Rx++I:circ(Mu))=Rt EA=Rx;
Rx=Rx=circ_add(Rx,I<<1,MuV);
*EA = Rt.h[0];
memh(Rx++Mu)=Rt.H EA=Rx;
Rx=Rx+MuV;
*EA = Rt.h[1];
memh(Rx++Mu)=Rt EA=Rx;
Rx=Rx+MuV;
*EA = Rt.h[0];
memh(Rx++Mu:brev)=Rt.H EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
*EA = Rt.h[1];
memh(Rx++Mu:brev)=Rt EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
*EA = Rt.h[0];
memh(gp+#u16:1)=Rt.H apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
*EA = Rt.h[1];
memh(gp+#u16:1)=Rt apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
*EA = Rt.h[0];
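The difference between the `Rt` and `Rt.H` source forms is only which halfword of the register is written: `Rt.h[0]` is the lower 16 bits and `Rt.h[1]` the upper 16 bits. A minimal C sketch, assuming little-endian byte order in memory and using a hypothetical function name and a flat byte array:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store Rt.h[0] (high == 0) or Rt.h[1] (high != 0) at byte address `ea`.
 * Little-endian layout is an assumption of this sketch. */
static void store_half(uint8_t *mem, size_t size, uint32_t ea,
                       uint32_t rt, int high)
{
    assert((size_t)ea + 2 <= size);     /* keep the sketch memory-safe */
    uint16_t h = high ? (uint16_t)(rt >> 16) : (uint16_t)(rt & 0xFFFFu);
    mem[ea]     = (uint8_t)(h & 0xFFu); /* low byte first  */
    mem[ea + 1] = (uint8_t)(h >> 8);    /* then high byte  */
}
```

For Rt holding 0xAABBCCDD, the `Rt` form writes the bytes DD CC and the `Rt.H` form writes BB AA. Address computation and post-increment are the same for both forms and are left out here.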
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t5
0 0 1 1 1 0 1 1 0 1 0 s s s s s P P i u u u u u i - - t t t t t memh(Rs+Ru<<#u2)=Rt
0 0 1 1 1 0 1 1 0 1 1 s s s s s P P i u u u u u i - - t t t t t memh(Rs+Ru<<#u2)=Rt.H
ICLASS s5 Parse
0 0 1 1 1 1 0 - - 0 1 s s s s s P P I i i i i i i I I I I I I I memh(Rs+#u6:1)=#S8
ICLASS Type Parse t5
0 1 0 0 1 i i 0 0 1 0 i i i i i P P i t t t t t i i i i i i i i memh(gp+#u16:1)=Rt
0 1 0 0 1 i i 0 0 1 1 i i i i i P P i t t t t t i i i i i i i i memh(gp+#u16:1)=Rt.H
ICLASS Amode Type UN s5 Parse t5
1 0 1 0 0 i i 1 0 1 0 s s s s s P P i t t t t t i i i i i i i i memh(Rs+#s11:1)=Rt
1 0 1 0 0 i i 1 0 1 1 s s s s s P P i t t t t t i i i i i i i i memh(Rs+#s11:1)=Rt.H
ICLASS Amode Type UN x5 Parse u1 t5
1 0 1 0 1 0 0 1 0 1 0 x x x x x P P u t t t t t 0 - - - - - 1 - memh(Rx++I:circ(Mu))=Rt
1 0 1 0 1 0 0 1 0 1 0 x x x x x P P u t t t t t 0 i i i i - 0 - memh(Rx++#s4:1:circ(Mu))=Rt
1 0 1 0 1 0 0 1 0 1 1 x x x x x P P u t t t t t 0 - - - - - 1 - memh(Rx++I:circ(Mu))=Rt.H
1 0 1 0 1 0 0 1 0 1 1 x x x x x P P u t t t t t 0 i i i i - 0 - memh(Rx++#s4:1:circ(Mu))=Rt.H
ICLASS Amode Type UN e5 Parse t5
1 0 1 0 1 0 1 1 0 1 0 e e e e e P P 0 t t t t t 1 - I I I I I I memh(Re=#U6)=Rt
ICLASS Amode Type UN x5 Parse t5
1 0 1 0 1 0 1 1 0 1 0 x x x x x P P 0 t t t t t 0 i i i i - 0 - memh(Rx++#s4:1)=Rt
ICLASS Amode Type UN e5 Parse t5
1 0 1 0 1 0 1 1 0 1 1 e e e e e P P 0 t t t t t 1 - I I I I I I memh(Re=#U6)=Rt.H
ICLASS Amode Type UN x5 Parse t5
1 0 1 0 1 0 1 1 0 1 1 x x x x x P P 0 t t t t t 0 i i i i - 0 - memh(Rx++#s4:1)=Rt.H
ICLASS Amode Type UN u5 Parse t5
1 0 1 0 1 1 0 1 0 1 0 u u u u u P P i t t t t t 1 i I I I I I I memh(Ru<<#u2+#U6)=Rt
ICLASS Amode Type UN x5 Parse u1 t5
1 0 1 0 1 1 0 1 0 1 0 x x x x x P P u t t t t t 0 - - - - - - - memh(Rx++Mu)=Rt
ICLASS Amode Type UN u5 Parse t5
1 0 1 0 1 1 0 1 0 1 1 u u u u u P P i t t t t t 1 i I I I I I I memh(Ru<<#u2+#U6)=Rt.H
ICLASS Amode Type UN x5 Parse u1 t5
1 0 1 0 1 1 0 1 0 1 1 x x x x x P P u t t t t t 0 - - - - - - - memh(Rx++Mu)=Rt.H
1 0 1 0 1 1 1 1 0 1 0 x x x x x P P u t t t t t 0 - - - - - - - memh(Rx++Mu:brev)=Rt
1 0 1 0 1 1 1 1 0 1 1 x x x x x P P u t t t t t 0 - - - - - - - memh(Rx++Mu:brev)=Rt.H
Syntax Behavior
if ([!]Pv[.new]) memh(Rs+Ru<<#u2)=Rt EA=Rs+(Ru<<#u);
if ([!]Pv[.new][0]) {
*EA = Rt.h[0];
} else {
NOP;
}
if ([!]Pv[.new]) memh(Rx++#s4:1)=Rt.H EA=Rx;
if ([!]Pv[.new][0]){
Rx=Rx+#s;
*EA = Rt.h[1];
} else {
NOP;
}
if ([!]Pv[.new]) memh(Rx++#s4:1)=Rt EA=Rx;
if ([!]Pv[.new][0]){
Rx=Rx+#s;
*EA = Rt.h[0];
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t5
0 0 1 1 0 1 0 0 0 1 0 s s s s s P P i u u u u u i v v t t t t t if (Pv) memh(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 0 0 0 1 1 s s s s s P P i u u u u u i v v t t t t t if (Pv) memh(Rs+Ru<<#u2)=Rt.H
0 0 1 1 0 1 0 1 0 1 0 s s s s s P P i u u u u u i v v t t t t t if (!Pv) memh(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 0 1 0 1 1 s s s s s P P i u u u u u i v v t t t t t if (!Pv) memh(Rs+Ru<<#u2)=Rt.H
0 0 1 1 0 1 1 0 0 1 0 s s s s s P P i u u u u u i v v t t t t t if (Pv.new) memh(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 1 0 0 1 1 s s s s s P P i u u u u u i v v t t t t t if (Pv.new) memh(Rs+Ru<<#u2)=Rt.H
0 0 1 1 0 1 1 1 0 1 0 s s s s s P P i u u u u u i v v t t t t t if (!Pv.new) memh(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 1 1 0 1 1 s s s s s P P i u u u u u i v v t t t t t if (!Pv.new) memh(Rs+Ru<<#u2)=Rt.H
ICLASS s5 Parse
0 0 1 1 1 0 0 0 0 0 1 s s s s s P P I i i i i i i v v I I I I I if (Pv) memh(Rs+#u6:1)=#S6
0 0 1 1 1 0 0 0 1 0 1 s s s s s P P I i i i i i i v v I I I I I if (!Pv) memh(Rs+#u6:1)=#S6
0 0 1 1 1 0 0 1 0 0 1 s s s s s P P I i i i i i i v v I I I I I if (Pv.new) memh(Rs+#u6:1)=#S6
0 0 1 1 1 0 0 1 1 0 1 s s s s s P P I i i i i i i v v I I I I I if (!Pv.new) memh(Rs+#u6:1)=#S6
ICLASS Sense PredNew Type s5 Parse t5
0 1 0 0 0 0 0 0 0 1 0 s s s s s P P i t t t t t i i i i i 0 v v if (Pv) memh(Rs+#u6:1)=Rt
0 1 0 0 0 0 0 0 0 1 1 s s s s s P P i t t t t t i i i i i 0 v v if (Pv) memh(Rs+#u6:1)=Rt.H
0 1 0 0 0 0 1 0 0 1 0 s s s s s P P i t t t t t i i i i i 0 v v if (Pv.new) memh(Rs+#u6:1)=Rt
0 1 0 0 0 0 1 0 0 1 1 s s s s s P P i t t t t t i i i i i 0 v v if (Pv.new) memh(Rs+#u6:1)=Rt.H
0 1 0 0 0 1 0 0 0 1 0 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv) memh(Rs+#u6:1)=Rt
0 1 0 0 0 1 0 0 0 1 1 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv) memh(Rs+#u6:1)=Rt.H
0 1 0 0 0 1 1 0 0 1 0 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv.new) memh(Rs+#u6:1)=Rt
0 1 0 0 0 1 1 0 0 1 1 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv.new) memh(Rs+#u6:1)=Rt.H
1 0 1 0 1 0 1 1 0 1 0 x x x x x P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memh(Rx++#s4:1)=Rt
1 0 1 0 1 0 1 1 0 1 0 x x x x x P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memh(Rx++#s4:1)=Rt
1 0 1 0 1 0 1 1 0 1 1 x x x x x P P 1 t t t t t 0 i i i i 0 v v if (Pv) memh(Rx++#s4:1)=Rt.H
1 0 1 0 1 0 1 1 0 1 1 x x x x x P P 1 t t t t t 0 i i i i 1 v v if (!Pv) memh(Rx++#s4:1)=Rt.H
1 0 1 0 1 0 1 1 0 1 1 x x x x x P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memh(Rx++#s4:1)=Rt.H
1 0 1 0 1 0 1 1 0 1 1 x x x x x P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memh(Rx++#s4:1)=Rt.H
ICLASS Amode Type UN Parse t5
1 0 1 0 1 1 1 1 0 1 0 - - - i i P P 0 t t t t t 1 i i i i 0 v v if (Pv) memh(#u6)=Rt
1 0 1 0 1 1 1 1 0 1 0 - - - i i P P 0 t t t t t 1 i i i i 1 v v if (!Pv) memh(#u6)=Rt
1 0 1 0 1 1 1 1 0 1 0 - - - i i P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memh(#u6)=Rt
1 0 1 0 1 1 1 1 0 1 0 - - - i i P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memh(#u6)=Rt
1 0 1 0 1 1 1 1 0 1 1 - - - i i P P 0 t t t t t 1 i i i i 0 v v if (Pv) memh(#u6)=Rt.H
1 0 1 0 1 1 1 1 0 1 1 - - - i i P P 0 t t t t t 1 i i i i 1 v v if (!Pv) memh(#u6)=Rt.H
1 0 1 0 1 1 1 1 0 1 1 - - - i i P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memh(#u6)=Rt.H
1 0 1 0 1 1 1 1 0 1 1 - - - i i P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memh(#u6)=Rt.H
Release
The release memory operation is observed after all preceding memory operations have been
observed at the local point of serialization. A different order can be observed at the global point
of serialization (see Ordering and Synchronization). No data is modified by this instruction.
When the :st (same domain) option is specified, the preceding memory operations are those that
were committed on any thread with the same consistency domain before this instruction was
committed.
When the :at (all threads) option is specified, the preceding memory operations are those that
were committed on any thread before this instruction was committed.
The store release address is limited to certain memory regions. The following memory regions are
excluded:
■ AHB memory space
■ AXI M2 memory space
■ Hexagon memory cut-out, except addressable TCM and VTCM memory
■ Memory with CCCC types 2, 3, or 4
The :st option does not apply to cache operations by index or to global cache operations. For
vector operations, the :st option does not apply a consistency domain; it uses a
per-hardware-thread ordering scope instead.
Syntax Behavior
release(Rs):at EA=Rs;
*EA = Rs
release(Rs):st EA=Rs;
*EA = Rs
Class: ST (slot 0)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Store word
Store a 32-bit register in memory at the effective address.
Syntax Behavior
memw(Re=#U6)=Rt apply_extension(#U);
EA=#U;
*EA = Rt;
Re=#U;
memw(Rs+#s11:2)=Rt apply_extension(#s);
EA=Rs+#s;
*EA = Rt;
memw(Rs+#u6:2)=#S8 EA=Rs+#u;
apply_extension(#S);
*EA = #S;
memw(Rs+Ru<<#u2)=Rt EA=Rs+(Ru<<#u);
*EA = Rt;
memw(Ru<<#u2+#U6)=Rt apply_extension(#U);
EA=#U+(Ru<<#u);
*EA = Rt;
memw(Rx++#s4:2)=Rt EA=Rx;
Rx=Rx+#s;
*EA = Rt;
memw(Rx++#s4:2:circ(Mu))=Rt EA=Rx;
Rx=circ_add(Rx,#s,MuV);
*EA = Rt;
memw(Rx++I:circ(Mu))=Rt EA=Rx;
Rx=circ_add(Rx,I<<2,MuV);
*EA = Rt;
memw(Rx++Mu)=Rt EA=Rx;
Rx=Rx+MuV;
*EA = Rt;
memw(Rx++Mu:brev)=Rt EA=Rx.h[1] | brev(Rx.h[0]);
Rx=Rx+MuV;
*EA = Rt;
memw(gp+#u16:2)=Rt apply_extension(#u);
EA=(Constant_extended ? (0) : GP)+#u;
*EA = Rt;
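The bit-reversed addressing form memw(Rx++Mu:brev)=Rt computes its effective address as Rx.h[1] | brev(Rx.h[0]). The Python sketch below reads this as keeping the upper halfword of Rx in place and reversing the bit order of the lower halfword. That reading, and the helper names brev16 and brev_ea, are assumptions made for illustration.

```python
def brev16(x):
    """Reverse the bit order of a 16-bit value."""
    result = 0
    for i in range(16):
        result |= ((x >> i) & 1) << (15 - i)
    return result


def brev_ea(rx):
    """Effective address for the :brev form (assumed interpretation):
    upper halfword of Rx unchanged, lower halfword bit-reversed."""
    return (rx & 0xFFFF0000) | brev16(rx & 0xFFFF)


def store_word_brev(mem, rx, rt, mu):
    """Model memw(Rx++Mu:brev)=Rt; mem maps word addresses to values.
    Returns the updated Rx (Rx + MuV, wrapped to 32 bits)."""
    mem[brev_ea(rx)] = rt & 0xFFFFFFFF
    return (rx + mu) & 0xFFFFFFFF
```

Note that the pointer register itself is incremented linearly by the modifier value; only the address sent to memory is bit-reversed.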
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t5
0 0 1 1 1 0 1 1 1 0 0 s s s s s P P i u u u u u i - - t t t t t memw(Rs+Ru<<#u2)=Rt
ICLASS s5 Parse
0 0 1 1 1 1 0 - - 1 0 s s s s s P P I i i i i i i I I I I I I I memw(Rs+#u6:2)=#S8
ICLASS Type Parse t5
0 1 0 0 1 i i 0 1 0 0 i i i i i P P i t t t t t i i i i i i i i memw(gp+#u16:2)=Rt
ICLASS Amode Type UN s5 Parse t5
1 0 1 0 0 i i 1 1 0 0 s s s s s P P i t t t t t i i i i i i i i memw(Rs+#s11:2)=Rt
ICLASS Amode Type UN x5 Parse u1 t5
1 0 1 0 1 0 0 1 1 0 0 x x x x x P P u t t t t t 0 - - - - - 1 - memw(Rx++I:circ(Mu))=Rt
1 0 1 0 1 0 0 1 1 0 0 x x x x x P P u t t t t t 0 i i i i - 0 - memw(Rx++#s4:2:circ(Mu))=Rt
ICLASS Amode Type UN e5 Parse t5
1 0 1 0 1 0 1 1 1 0 0 e e e e e P P 0 t t t t t 1 - I I I I I I memw(Re=#U6)=Rt
ICLASS Amode Type UN x5 Parse t5
1 0 1 0 1 0 1 1 1 0 0 x x x x x P P 0 t t t t t 0 i i i i - 0 - memw(Rx++#s4:2)=Rt
ICLASS Amode Type UN u5 Parse t5
1 0 1 0 1 1 0 1 1 0 0 u u u u u P P i t t t t t 1 i I I I I I I memw(Ru<<#u2+#U6)=Rt
ICLASS Amode Type UN x5 Parse u1 t5
1 0 1 0 1 1 0 1 1 0 0 x x x x x P P u t t t t t 0 - - - - - - - memw(Rx++Mu)=Rt
1 0 1 0 1 1 1 1 1 0 0 x x x x x P P u t t t t t 0 - - - - - - - memw(Rx++Mu:brev)=Rt
Store-release word
Store a 32-bit register in memory at the effective address. The store-release memory operation is
observed after all preceding memory operations have been observed at the local point of
serialization. A different order can be observed at the global point of serialization (see Ordering
and Synchronization).
When the :st (same domain) option is specified, the preceding memory operations are those that
were committed on any thread with the same consistency domain before this instruction was
committed.
When the :at (all threads) option is specified, the preceding memory operations are those that
were committed on any thread before this instruction was committed.
The store release address is limited to certain memory regions. The following memory regions are
excluded:
■ AHB memory space
■ AXI M2 memory space
■ Hexagon memory cut-out, except addressable TCM and VTCM memory
■ Memory with CCCC types 2, 3, or 4
The :st option does not apply to cache operations by index or to global cache operations. For
vector operations, the :st option does not apply a consistency domain; it uses a
per-hardware-thread ordering scope instead.
Syntax Behavior
memw_rl(Rs):at=Rt EA=Rs;
*EA = Rt
memw_rl(Rs):st=Rt EA=Rs;
*EA = Rt
Class: ST (slot 0)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse t5 d2
1 0 1 0 0 0 0 0 1 0 1 s s s s s P P - t t t t t - - 0 0 1 0 d d memw_rl(Rs):at=Rt
1 0 1 0 0 0 0 0 1 0 1 s s s s s P P - t t t t t - - 1 0 1 0 d d memw_rl(Rs):st=Rt
Syntax Behavior
if ([!]Pv[.new]) memw(#u6)=Rt apply_extension(#u);
EA=#u;
if ([!]Pv[.new][0]) {
*EA = Rt;
} else {
NOP;
}
if ([!]Pv[.new]) memw(Rs+#u6:2)=#S6 EA=Rs+#u;
if ([!]Pv[.new][0]){
apply_extension(#S);
*EA = #S;
} else {
NOP;
}
if ([!]Pv[.new]) memw(Rs+#u6:2)=Rt apply_extension(#u);
EA=Rs+#u;
if ([!]Pv[.new][0]) {
*EA = Rt;
} else {
NOP;
}
if ([!]Pv[.new]) memw(Rs+Ru<<#u2)=Rt EA=Rs+(Ru<<#u);
if ([!]Pv[.new][0]) {
*EA = Rt;
} else {
NOP;
}
if ([!]Pv[.new]) memw(Rx++#s4:2)=Rt EA=Rx;
if ([!]Pv[.new][0]){
Rx=Rx+#s;
*EA = Rt;
} else {
NOP;
}
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse u5 t5
0 0 1 1 0 1 0 0 1 0 0 s s s s s P P i u u u u u i v v t t t t t if (Pv) memw(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 0 1 1 0 0 s s s s s P P i u u u u u i v v t t t t t if (!Pv) memw(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 1 0 1 0 0 s s s s s P P i u u u u u i v v t t t t t if (Pv.new) memw(Rs+Ru<<#u2)=Rt
0 0 1 1 0 1 1 1 1 0 0 s s s s s P P i u u u u u i v v t t t t t if (!Pv.new) memw(Rs+Ru<<#u2)=Rt
ICLASS s5 Parse
0 0 1 1 1 0 0 0 0 1 0 s s s s s P P I i i i i i i v v I I I I I if (Pv) memw(Rs+#u6:2)=#S6
0 0 1 1 1 0 0 0 1 1 0 s s s s s P P I i i i i i i v v I I I I I if (!Pv) memw(Rs+#u6:2)=#S6
0 0 1 1 1 0 0 1 0 1 0 s s s s s P P I i i i i i i v v I I I I I if (Pv.new) memw(Rs+#u6:2)=#S6
0 0 1 1 1 0 0 1 1 1 0 s s s s s P P I i i i i i i v v I I I I I if (!Pv.new) memw(Rs+#u6:2)=#S6
ICLASS Sense PredNew Type s5 Parse t5
0 1 0 0 0 0 0 0 1 0 0 s s s s s P P i t t t t t i i i i i 0 v v if (Pv) memw(Rs+#u6:2)=Rt
0 1 0 0 0 0 1 0 1 0 0 s s s s s P P i t t t t t i i i i i 0 v v if (Pv.new) memw(Rs+#u6:2)=Rt
0 1 0 0 0 1 0 0 1 0 0 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv) memw(Rs+#u6:2)=Rt
0 1 0 0 0 1 1 0 1 0 0 s s s s s P P i t t t t t i i i i i 0 v v if (!Pv.new) memw(Rs+#u6:2)=Rt
1 0 1 0 1 0 1 1 1 0 0 x x x x x P P 1 t t t t t 0 i i i i 0 v v if (Pv) memw(Rx++#s4:2)=Rt
1 0 1 0 1 0 1 1 1 0 0 x x x x x P P 1 t t t t t 0 i i i i 1 v v if (!Pv) memw(Rx++#s4:2)=Rt
1 0 1 0 1 0 1 1 1 0 0 x x x x x P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memw(Rx++#s4:2)=Rt
1 0 1 0 1 0 1 1 1 0 0 x x x x x P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memw(Rx++#s4:2)=Rt
ICLASS Amode Type UN Parse t5
1 0 1 0 1 1 1 1 1 0 0 - - - i i P P 0 t t t t t 1 i i i i 0 v v if (Pv) memw(#u6)=Rt
1 0 1 0 1 1 1 1 1 0 0 - - - i i P P 0 t t t t t 1 i i i i 1 v v if (!Pv) memw(#u6)=Rt
1 0 1 0 1 1 1 1 1 0 0 - - - i i P P 1 t t t t t 1 i i i i 0 v v if (Pv.new) memw(#u6)=Rt
1 0 1 0 1 1 1 1 1 0 0 - - - i i P P 1 t t t t t 1 i i i i 1 v v if (!Pv.new) memw(#u6)=Rt
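[Figure: stack layout from higher address to lower address. The caller's frame (saved LR, saved FP, procedure local data on stack) is followed by the current stack frame: saved LR, saved FP (the FP register points here), and procedure local data on stack (the SP register points to its low end). Unallocated stack lies below SP.]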
Syntax Behavior
allocframe(#u11:3) Assembler mapped to:
"allocframe(r29,#u11:3):raw"
allocframe(Rx,#u11:3):raw EA=Rx+-8;
*EA = frame_scramble((LR << 32) | FP);
FP=EA;
frame_check_limit(EA-#u);
Rx = EA-#u;
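The behavior above can be traced with a small Python model. This is a sketch under stated assumptions: registers are held in a dictionary with the hypothetical keys "sp", "fp", and "lr"; memory maps an address to one 64-bit value; frame_scramble is assumed to XOR the saved LR with a key (a key of 0 leaves it unchanged); and frame_check_limit is assumed to fault when the new stack pointer drops below a limit. None of these details are stated in this section.

```python
def allocframe(regs, mem, u, framekey=0, framelimit=0):
    """Model allocframe(Rx,#u11:3):raw with Rx taken to be SP."""
    if u % 8 or not 0 <= u <= 0x7FF * 8:
        raise ValueError("#u11:3 is an unsigned 11-bit value scaled by 8")
    ea = (regs["sp"] - 8) & 0xFFFFFFFF            # EA = Rx + -8
    # ASSUMPTION: frame_scramble XORs LR with a key before it is saved.
    mem[ea] = ((regs["lr"] ^ framekey) << 32) | regs["fp"]
    regs["fp"] = ea                               # FP = EA
    new_sp = ea - u
    # ASSUMPTION: frame_check_limit faults below a configured stack limit.
    if new_sp < framelimit:
        raise MemoryError("stack frame limit exceeded")
    regs["sp"] = new_sp                           # Rx = EA - #u
```

After the call, FP points at the saved LR:FP pair and SP sits #u bytes below it, which matches the stack frame layout described for this instruction.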
Class: ST (slot 0)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN x5 Parse
1 0 1 0 0 0 0 0 1 0 0 x x x x x P P 0 0 0 i i i i i i i i i i i allocframe(Rx,#u11:3):raw
11.9 SYSTEM
The SYSTEM instruction class includes instructions for managing system resources.
Load locked
This memory lock instruction performs a word or double-word locked load.
This instruction returns the contents of the memory at address Rs and places a lock
reservation on that address. For more information, see Atomic operations.
Syntax Behavior
Rd=memw_locked(Rs) EA=Rs;
Rd = *EA;
Rdd=memd_locked(Rs) EA=Rs;
Rdd = *EA;
Notes
■ This instruction can only be grouped with ALU32 or non-floating-point XTYPE instructions.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse d5
1 0 0 1 0 0 1 0 0 0 0 s s s s s P P 0 0 0 - - - 0 0 0 d d d d d Rd=memw_locked(Rs)
1 0 0 1 0 0 1 0 0 0 0 s s s s s P P 0 1 0 - - - 0 0 0 d d d d d Rdd=memd_locked(Rs)
Store conditional
This memory lock instruction performs a word or double-word conditional store operation.
If the address reservation is held by this thread and there have been no intervening accesses to
the memory location, the store is performed and the predicate is set to true. Otherwise, the store
is not performed and the predicate returns false. For more information, see Atomic operations.
Syntax Behavior
memd_locked(Rs,Pd)=Rtt EA=Rs;
if (lock_valid) {
*EA = Rtt;
Pd = 0xff;
lock_valid = 0;
} else {
Pd = 0;
}
memw_locked(Rs,Pd)=Rt EA=Rs;
if (lock_valid) {
*EA = Rt;
Pd = 0xff;
lock_valid = 0;
} else {
Pd = 0;
}
Notes
■ This instruction may only be grouped with ALU32 or non-floating-point XTYPE instructions.
■ The predicate generated by this instruction cannot be used as a .new predicate, nor can it be
automatically ANDed with another predicate.
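The load-locked and store-conditional pair is typically used in a retry loop. The Python sketch below models that pattern with a single reservation slot. The class LockedMemory, its method names, and the rule that an ordinary store to the reserved address clears the reservation are simplifying assumptions for illustration; the actual reservation rules are those given in Atomic operations.

```python
class LockedMemory:
    """Toy memory with one lock reservation (assumed simplification)."""

    def __init__(self):
        self.words = {}
        self.reservation = None           # (thread id, address) or None

    def load_locked(self, tid, addr):
        """Model Rd=memw_locked(Rs): read the word and place a reservation."""
        self.reservation = (tid, addr)
        return self.words.get(addr, 0)

    def store(self, addr, value):
        """Ordinary store; an intervening access drops the reservation."""
        if self.reservation and self.reservation[1] == addr:
            self.reservation = None
        self.words[addr] = value

    def store_conditional(self, tid, addr, value):
        """Model memw_locked(Rs,Pd)=Rt: returns Pd (0xff or 0)."""
        if self.reservation == (tid, addr):   # lock_valid
            self.words[addr] = value
            self.reservation = None           # lock_valid = 0
            return 0xFF
        return 0x00


def atomic_increment(mem, tid, addr):
    """Retry until the conditional store succeeds; returns the new value."""
    while True:
        old = mem.load_locked(tid, addr)
        if mem.store_conditional(tid, addr, old + 1) == 0xFF:
            return old + 1
```

The loop structure is the point of the example: the store is attempted only with a value derived from the locked load, and the whole sequence is repeated whenever the predicate comes back false.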
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Syntax Behavior
dczeroa(Rs) EA=Rs;
dcache_zero_addr(EA);
Notes
■ A packet containing this instruction must have slot 1 either empty or executing an ALU32
instruction.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse
1 0 1 0 0 0 0 0 1 1 0 s s s s s P P 0 - - - - - - - - - - - - - dczeroa(Rs)
Memory barrier
The barrier instruction establishes a memory barrier to ensure proper ordering between
load/store accesses within a consistency domain before the barrier instruction and accesses after
the barrier instruction.
All scalar loads, stores, and cache operations by address within a consistency domain before the
barrier are globally observable before any access after the barrier can be observed.
The use of this instruction is system-dependent.
Syntax Behavior
barrier memory_barrier;
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN Parse
1 0 1 0 1 0 0 0 0 0 0 - - - - - P P - - - - - - 0 0 0 - - - - - barrier
Breakpoint
The brkpt instruction causes the program to enter Debug mode if enabled by ISDB.
Execution control is handed to ISDB and the program does not proceed until directed by the
debugger.
If ISDB is disabled, this instruction is treated as a NOP.
Syntax Behavior
brkpt Enter Debug mode;
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm Parse
0 1 1 0 1 1 0 0 0 0 1 - - - - - P P - - - - - - 0 0 0 - - - - - brkpt
Syntax Behavior
dcfetch(Rs) Assembler mapped to: "dcfetch(Rs+#0)"
dcfetch(Rs+#u11:3) EA=Rs+#u;
dcache_fetch(EA);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN s5 Parse
1 0 0 1 0 1 0 0 0 0 0 s s s s s P P 0 - - i i i i i i i i i i i dcfetch(Rs+#u11:3)
Syntax Behavior
dccleana(Rs) EA=Rs;
dcache_clean_addr(EA);
dccleaninva(Rs) EA=Rs;
dcache_cleaninv_addr(EA);
dcinva(Rs) EA=Rs;
dcache_cleaninv_addr(EA);
Notes
■ A packet containing this instruction must have slot 1 either empty or executing an ALU32
instruction.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Syntax Behavior
diag(Rs)
diag0(Rss,Rtt)
diag1(Rss,Rtt)
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm s5 Parse
0 1 1 0 0 0 1 0 0 1 0 s s s s s P P - - - - - - 0 0 1 - - - - - diag(Rs)
ICLASS sm s5 Parse t5
0 1 1 0 0 0 1 0 0 1 0 s s s s s P P - t t t t t 0 1 0 - - - - - diag0(Rss,Rtt)
0 1 1 0 0 0 1 0 0 1 0 s s s s s P P - t t t t t 0 1 1 - - - - - diag1(Rss,Rtt)
Syntax Behavior
icinva(Rs) EA=Rs;
icache_inv_addr(EA);
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS s5 Parse
0 1 0 1 0 1 1 0 1 1 0 s s s s s P P 0 0 0 - - - - - - - - - - - icinva(Rs)
Instruction synchronization
The isync instruction ensures that all previous instructions have committed before continuing to
the next instruction.
This instruction should execute after the following events (when subsequent instructions must
observe the results of the event):
■ After modifying the TLB with a TLBW instruction
■ After modifying the SSR register
■ After modifying the SYSCFG register
■ After any instruction cache maintenance operation
■ After modifying the TID register
Syntax Behavior
isync instruction_sync;
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Parse
0 1 0 1 0 1 1 1 1 1 0 0 0 0 0 0 P P 0 - - - 0 0 0 0 0 0 0 0 1 0 isync
L2 cache prefetch
The l2fetch instruction initiates background prefetching into the L2 cache.
Rs specifies the 32-bit virtual start address. There are two forms of this instruction.
In the first form, the dimensions of the area to prefetch are encoded in source register Rt as
follows:
Rt[15:8] = Width of a fetch block in bytes.
Rt[7:0] = Height: the number of Width-sized blocks to fetch.
Rt[31:16] = Stride: an unsigned byte offset which is used to increment the pointer after each
Width-sized block is fetched.
In the second form, the operands are encoded in register pair Rtt as follows:
Rtt[31:16] = Width of a fetch block in bytes.
Rtt[15:0] = Height: the number of Width-sized blocks to fetch.
Rtt[47:32] = Stride: an unsigned byte offset that is used to increment the pointer after each
Width-sized block is fetched.
Rtt[48] = Direction. If clear, perform the prefetches in row major form, meaning fetch cache
lines in a row before proceeding to the next row. If the bit is set, prefetch in column major
form, meaning fetch all cache lines in a column before proceeding to the next column.
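The two operand layouts described above can be checked with a short Python sketch that packs the fields into Rt and Rtt. The function names and the range checks are assumptions added for illustration; only the bit positions come from the field descriptions.

```python
def l2fetch_rt(width, height, stride):
    """Pack the 32-bit Rt operand: Stride[31:16], Width[15:8], Height[7:0]."""
    if not (0 <= width <= 0xFF and 0 <= height <= 0xFF and 0 <= stride <= 0xFFFF):
        raise ValueError("field does not fit the first-form layout")
    return (stride << 16) | (width << 8) | height


def l2fetch_rtt(width, height, stride, column_major=False):
    """Pack the 64-bit Rtt operand:
    Direction[48], Stride[47:32], Width[31:16], Height[15:0]."""
    for field in (width, height, stride):
        if not 0 <= field <= 0xFFFF:
            raise ValueError("field does not fit the second-form layout")
    return (int(column_major) << 48) | (stride << 32) | (width << 16) | height
```

For example, l2fetch_rt(64, 8, 256) describes eight 64-byte blocks spaced 256 bytes apart. The second form is needed whenever the width or height exceeds 255.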
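The following figure shows two examples of using the l2fetch instruction.
[Figure: two panels, "L2FETCH for box prefetch" and "L2FETCH for large linear prefetch". Each shows the Rt operand with Stride in bits 31:16, Width in bits 15:8, and Height in bits 7:0. The box panel marks a prefetch area of the given Width and Height inside a frame of the given Stride; the linear panel marks an area of 128 * Lines bytes.]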
In the box prefetch, a 2D range of memory is defined within a larger frame. The second example
shows prefetch for a large linear area of memory, which has size Lines * 128.
The l2fetch instruction is nonblocking. After the instruction is initiated, the program continues
on to the next instruction while the prefetching is performed in the background. l2fetch can bring
either code or data into the L2 cache. If the lines of interest are already in the L2 cache, no
action is performed. If the lines are missing from the L2 cache, the hardware attempts to fetch
them from system memory.
The hardware prefetch engine continues to request all lines in the programmed memory range.
The prefetching hardware makes a best effort to prefetch the requested data, and attempts to
perform prefetching at a lower priority than demand fetches. This prevents prefetch from adding
traffic while the system is under heavy load.
If a program initiates a new l2fetch while an older l2fetch operation is still pending, the new
request is queued, up to three deep. If three l2fetch operations are already pending, the oldest
request is dropped. While an L2 prefetch is active for a thread, the USR:PFA status bit is set to
indicate that prefetches are in progress. The programmer can use this bit to decide whether to
start a new l2fetch before the previous one completes.
Executing an l2fetch with any subfield programmed as zero cancels all pending prefetches by the
calling thread.
The implementation is free to drop prefetches when needed.
Syntax Behavior
l2fetch(Rs,Rt) l2fetch(Rs,INFO);
l2fetch(Rs,Rtt) l2fetch(Rs,INFO);
Notes
■ This instruction can only be grouped with ALU32 or non-floating-point XTYPE instructions.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Pause
The pause instruction pauses execution for a specified period of time.
During the pause duration, the program enters a low-power state and does not fetch and execute
instructions. The instruction provides a short immediate that indicates the pause duration. The
program will pause for at most the number of cycles specified in the immediate plus 8. The
minimum pause is 0 cycles, and the maximum pause is implementation-defined.
An interrupt to the program exits the paused state.
System events, such as hardware or DMA completion, can trigger exits from Pause mode.
An implementation is free to pause for durations shorter than (immediate+8), but not longer.
This instruction is useful for implementing user-level low-power synchronization operations, such
as spin locks or wait-for-event signaling.
Syntax Behavior
pause(#u10) Pause for #u cycles;
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Parse
0 1 0 1 0 1 0 0 0 1 - - - - i i P P - i i i i i - - - i i i - - pause(#u10)
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
■ This is a monitor-level feature. If performed in User or Guest mode, a privilege error
exception occurs.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Amode Type UN Parse d5
1 0 1 0 1 0 0 0 0 0 0 - - - - - P P - - - - - 0 1 1 1 d d d d d Rd=dmsyncht
ICLASS Amode Type UN Parse
1 0 1 0 1 0 0 0 0 1 0 - - - - - P P - - - - - - - - - - - - - - syncht
Syntax Behavior
trace(Rs) Send value to ETM trace;
Notes
■ This instruction may only be grouped with ALU32 or non-floating-point XTYPE instructions.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS sm s5 Parse
0 1 1 0 0 0 1 0 0 1 0 s s s s s P P - - - - - - 0 0 0 - - - - - trace(Rs)
Trap
The trap instruction causes a precise exception.
Executing a trap instruction sets the EX bit in SSR to 1, which disables interrupts and enables
Supervisor mode. The program then jumps to the vector location (either TRAP0 or TRAP1). The
instruction specifies an 8-bit immediate field. This field is copied into the cause field of the
system status register (SSR.CAUSE).
Upon returning from the service routine with an RTE, execution resumes at the packet after the
TRAP instruction.
These instructions are generally intended for user code to request services from the operating
system. Two TRAP instructions are provided so the OS can optimize for fast service routines and
slower service routines.
Syntax Behavior
trap0(#u8) SSR.CAUSE = #u;
TRAP "0";
trap1(#u8) Assembler mapped to: "trap1(R0,#u8)"
trap1(Rx,#u8) if (!can_handle_trap1_virtinsn(#u)) {
SSR.CAUSE = #u;
TRAP "1";
} else if (#u == 1) {
VMRTE;
} else if (#u == 3) {
VMSETIE;
} else if (#u == 4) {
VMGETIE;
} else if (#u == 6) {
VMSPSWAP;
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Parse
0 1 0 1 0 1 0 0 0 0 - - - - - - P P - i i i i i - - - i i i - - trap0(#u8)
ICLASS x5 Parse
0 1 0 1 0 1 0 0 1 0 - x x x x x P P - i i i i i - - - i i i - - trap1(Rx,#u8)
Unpause
The unpause instruction resumes threads whose execution has stalled with a pause instruction.
Syntax Behavior
unpause Unpause threads currently in pause state;
Notes
■ This is a solo instruction. It must not be grouped with other instructions in a packet.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS Parse
0 1 0 1 0 1 1 1 1 1 1 - - - - - P P 0 1 - - - - 0 0 0 - - - - - unpause
11.10 XTYPE
The XTYPE instruction class includes instructions that perform most of the data processing done
by the Hexagon processor.
XTYPE instructions are executable on slot 2 or slot 3.
Syntax Behavior
Rdd=abs(Rss) Rdd = ABS(Rss);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 0 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rdd=abs(Rss)
Syntax Behavior
Rd=abs(Rs)[:sat] Rd = [sat32](ABS(sxt32->64(Rs)));
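The only input for which the saturating and non-saturating forms differ is 0x8000_0000, whose absolute value does not fit in a signed 32-bit register. The Python sketch below follows the behavior expression above; the helper names sxt32 and abs32 are assumptions for this example, and registers are passed and returned as unsigned 32-bit bit patterns.

```python
def sxt32(x):
    """Interpret a 32-bit pattern as a signed value (sxt32->64)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def abs32(rs, sat=False):
    """Model Rd=abs(Rs)[:sat]; returns the 32-bit result pattern."""
    value = abs(sxt32(rs))          # computed at wider precision
    if sat:
        value = min(value, 0x7FFFFFFF)   # sat32 clamps the one overflow case
    return value & 0xFFFFFFFF       # truncate to the 32-bit destination
```

Without :sat, abs32(0x80000000) returns 0x80000000 again, which reads as a negative number; with :sat the result is clamped to 0x7FFFFFFF.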
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 1 0 0 1 0 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rd=abs(Rs)
1 0 0 0 1 1 0 0 1 0 0 s s s s s P P - - - - - - 1 0 1 d d d d d Rd=abs(Rs):sat
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse d5 u5
1 1 0 1 1 0 1 1 0 i i s s s s s P P i d d d d d i i i u u u u u Rd=add(Rs,add(Ru,#s6))
1 1 0 1 1 0 1 1 1 i i s s s s s P P i d d d d d i i i u u u u u Rd=add(Rs,sub(#s6,Ru))
ICLASS RegType MajOp s5 Parse MinOp x5
1 1 1 0 0 0 1 0 0 - - s s s s s P P 0 i i i i i i i i x x x x x Rx+=add(Rs,#s8)
1 1 1 0 0 0 1 0 1 - - s s s s s P P 0 i i i i i i i i x x x x x Rx-=add(Rs,#s8)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 1 1 1 0 0 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rx+=add(Rs,Rt)
1 1 1 0 1 1 1 1 1 0 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rx-=add(Rs,Rt)
Add doublewords
The first form of this instruction adds two 32-bit registers. If the result overflows 32 bits, the
result is saturated to 0x7FFF_FFFF for a positive result, or 0x8000_0000 for a negative result. A
32-bit nonsaturating register add is an ALU32-class instruction and can execute on any slot.
The second instruction form sign-extends a 32-bit register Rs to 64 bits and performs a 64-bit add
with Rtt. The result is stored in Rdd.
The third instruction form adds 64-bit registers Rss and Rtt and places the result in Rdd.
The final instruction form adds two 64-bit registers Rss and Rtt. If the result overflows 64 bits, it is
saturated to 0x7fff_ffff_ffff_ffff for a positive result, or 0x8000_0000_0000_0000 for a negative
result.
Syntax Behavior
Rd=add(Rs,Rt):sat:deprecated Rd=sat32(Rs+Rt);
Rdd=add(Rs,Rtt) if ("Rs & 1") {
Assembler mapped to: "Rdd=add(Rss,Rtt):raw:hi";
} else {
Assembler mapped to: "Rdd=add(Rss,Rtt):raw:lo";
}
Rdd=add(Rss,Rtt) Rdd=Rss+Rtt;
Rdd=add(Rss,Rtt):raw:hi Rdd=Rtt+sxt32->64(Rss.w[1]);
Rdd=add(Rss,Rtt):raw:lo Rdd=Rtt+sxt32->64(Rss.w[0]);
Rdd=add(Rss,Rtt):sat Rdd=sat64(Rss+Rtt);
Notes
■ If saturation occurs during execution of this instruction (a result is clamped to either
maximum or minimum values), the OVF bit in the Status Register is set. OVF remains set until
explicitly cleared by a transfer to the status register.
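The difference between the plain and saturating 64-bit forms, and the condition under which OVF is set, can be sketched in Python as follows. The function names are assumptions for illustration; operands and results are ordinary signed Python integers, and the second return value of add64_sat stands in for the OVF bit.

```python
INT64_MAX = (1 << 63) - 1
INT64_MIN = -(1 << 63)


def add64_sat(rss, rtt):
    """Model Rdd=add(Rss,Rtt):sat; returns (Rdd, ovf)."""
    total = rss + rtt
    clamped = max(INT64_MIN, min(INT64_MAX, total))   # sat64
    return clamped, clamped != total                  # ovf when clamped


def add64_wrap(rss, rtt):
    """Model Rdd=add(Rss,Rtt): the sum wraps modulo 2**64."""
    total = (rss + rtt) & ((1 << 64) - 1)
    return total - (1 << 64) if total >> 63 else total
```

In hardware the OVF bit is sticky, so a caller that wants per-operation overflow information has to clear it before the add; the sketch returns the flag directly instead.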
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 0 0 s s s s s P P - t t t t t 1 1 1 d d d d d Rdd=add(Rss,Rtt)
1 1 0 1 0 0 1 1 0 1 1 s s s s s P P - t t t t t 1 0 1 d d d d d Rdd=add(Rss,Rtt):sat
1 1 0 1 0 0 1 1 0 1 1 s s s s s P P - t t t t t 1 1 0 d d d d d Rdd=add(Rss,Rtt):raw:lo
1 1 0 1 0 0 1 1 0 1 1 s s s s s P P - t t t t t 1 1 1 d d d d d Rdd=add(Rss,Rtt):raw:hi
1 1 0 1 0 1 0 1 1 0 0 s s s s s P P - t t t t t 0 - - d d d d d Rd=add(Rs,Rt):sat:deprecated
Add halfword
Perform a 16-bit add with optional saturation, and place the result in either the upper or lower
half of a register. If the result goes in the upper half, the sources are any high or low halfword of
Rs and Rt. The lower 16 bits of the result are zeroed.
If the result is placed in the lower 16 bits of Rd, the Rs source can be either high or low, but the
other source must be the low halfword of Rt. In this case, the upper halfword of Rd is the sign-
extension of the low halfword.
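[Figure: Rd=add(Rs.[hl],Rt.[hl])[:sat]. Two muxes select the source halfwords, which feed a 16-bit add; the sum is optionally saturated to 0x7FFF or 0x8000 and then sign-extended into the result Rd.]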
Syntax Behavior
Rd=add(Rt.L,Rs.[HL])[:sat] Rd=[sat16](Rt.h[0]+Rs.h[01]);
Rd=add(Rt.[HL],Rs.[HL])[:sat]:<<16 Rd=([sat16](Rt.h[01]+Rs.h[01]))<<16;
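The two result placements can be compared with a small Python model. It follows the prose description: in the low-half form the upper halfword of Rd is the sign extension of the 16-bit result, and in the :<<16 form the lower 16 bits are zero. The function name, its keyword arguments, and the use of unsigned 32-bit patterns for registers are assumptions for this sketch.

```python
def sxt16(x):
    """Interpret the low 16 bits of x as a signed halfword."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def add_half(rt, rs, rt_high=False, rs_high=False, sat=False, shift16=False):
    """Model Rd=add(Rt.[HL],Rs.[HL])[:sat][:<<16]; returns a 32-bit pattern."""
    if rt_high and not shift16:
        raise ValueError("the low-half form only accepts Rt.L")
    a = sxt16(rt >> 16 if rt_high else rt)
    b = sxt16(rs >> 16 if rs_high else rs)
    total = a + b
    if sat:
        total = max(-0x8000, min(0x7FFF, total))   # sat16
    if shift16:
        return (total & 0xFFFF) << 16              # result in upper half
    return sxt16(total) & 0xFFFFFFFF               # sign-extended low half
```

Adding 0x7FFF and 0x0001 shows all three cases: the plain low-half form gives 0xFFFF8000, the saturating form gives 0x00007FFF, and the :<<16 form without saturation gives 0x80000000.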
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 1 0 1 0 0 0 s s s s s P P - t t t t t 0 0 - d d d d d Rd=add(Rt.L,Rs.L)
1 1 0 1 0 1 0 1 0 0 0 s s s s s P P - t t t t t 0 1 - d d d d d Rd=add(Rt.L,Rs.H)
1 1 0 1 0 1 0 1 0 0 0 s s s s s P P - t t t t t 1 0 - d d d d d Rd=add(Rt.L,Rs.L):sat
1 1 0 1 0 1 0 1 0 0 0 s s s s s P P - t t t t t 1 1 - d d d d d Rd=add(Rt.L,Rs.H):sat
1 1 0 1 0 1 0 1 0 1 0 s s s s s P P - t t t t t 0 0 0 d d d d d Rd=add(Rt.L,Rs.L):<<16
1 1 0 1 0 1 0 1 0 1 0 s s s s s P P - t t t t t 0 0 1 d d d d d Rd=add(Rt.L,Rs.H):<<16
1 1 0 1 0 1 0 1 0 1 0 s s s s s P P - t t t t t 0 1 0 d d d d d Rd=add(Rt.H,Rs.L):<<16
1 1 0 1 0 1 0 1 0 1 0 s s s s s P P - t t t t t 0 1 1 d d d d d Rd=add(Rt.H,Rs.H):<<16
1 1 0 1 0 1 0 1 0 1 0 s s s s s P P - t t t t t 1 0 0 d d d d d Rd=add(Rt.L,Rs.L):sat:<<16
1 1 0 1 0 1 0 1 0 1 0 s s s s s P P - t t t t t 1 0 1 d d d d d Rd=add(Rt.L,Rs.H):sat:<<16
1 1 0 1 0 1 0 1 0 1 0 s s s s s P P - t t t t t 1 1 0 d d d d d Rd=add(Rt.H,Rs.L):sat:<<16
1 1 0 1 0 1 0 1 0 1 0 s s s s s P P - t t t t t 1 1 1 d d d d d Rd=add(Rt.H,Rs.H):sat:<<16
Syntax Behavior
Rdd=add(Rss,Rtt,Px):carry PREDUSE_TIMING;
Rdd = Rss + Rtt + Px[0];
Px = carry_from_add(Rss,Rtt,Px[0]) ? 0xff : 0x00;
Rdd=sub(Rss,Rtt,Px):carry PREDUSE_TIMING;
Rdd = Rss + ~Rtt + Px[0];
Px = carry_from_add(Rss,~Rtt,Px[0]) ? 0xff : 0x00;
Notes
■ The predicate generated by this instruction cannot be used as a .new predicate, nor can it be
automatically ANDed with another predicate.
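A typical use of the carry forms is multi-precision arithmetic, where the carry out of one 64-bit add feeds the next. The Python sketch below models the two behaviors above and chains them into a 128-bit add. The function names, and the choice to pass registers as unsigned 64-bit patterns, are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def add_carry(rss, rtt, px):
    """Model Rdd=add(Rss,Rtt,Px):carry; returns (Rdd, new Px)."""
    total = (rss & MASK64) + (rtt & MASK64) + (px & 1)   # Px[0] is carry in
    return total & MASK64, 0xFF if total >> 64 else 0x00


def sub_carry(rss, rtt, px):
    """Model Rdd=sub(Rss,Rtt,Px):carry as Rss + ~Rtt + Px[0]."""
    return add_carry(rss, ~rtt & MASK64, px)


def add128(a, b):
    """Add two 128-bit values with two chained 64-bit carry adds."""
    lo, p = add_carry(a & MASK64, b & MASK64, 0x00)
    hi, p = add_carry(a >> 64, b >> 64, p)
    return (hi << 64) | lo, p
```

For subtraction, the predicate must be set (carry in of 1) before the least significant step so that Rss + ~Rtt + 1 yields the two's-complement difference.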
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 x2 d5
1 1 0 0 0 0 1 0 1 1 0 s s s s s P P - t t t t t - x x d d d d d Rdd=add(Rss,Rtt,Px):carry
1 1 0 0 0 0 1 0 1 1 1 s s s s s P P - t t t t t - x x d d d d d Rdd=sub(Rss,Rtt,Px):carry
Clip to unsigned
Clip input to unsigned integer.
Syntax Behavior
Rd=clip(Rs,#u5) Rd=MIN((1<<#u)-1,MAX(Rs,-(1<<#u)));
Notes
■ This instruction can only execute on a core with the Hexagon audio extensions.
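The behavior expression for clip can be evaluated directly. The Python sketch below is a literal transcription of that expression, with Rs treated as a signed integer; the function name and the range check on #u5 are assumptions for this example. As written, the expression clamps Rs to the range -(1<<#u) through (1<<#u)-1.

```python
def clip(rs, u):
    """Model Rd=clip(Rs,#u5) as MIN((1<<u)-1, MAX(Rs, -(1<<u)))."""
    if not 0 <= u <= 31:
        raise ValueError("#u5 is an unsigned 5-bit immediate")
    return min((1 << u) - 1, max(rs, -(1 << u)))
```

For example, with #u5 = 8 every input is limited to the interval from -256 to 255, and values already inside that interval pass through unchanged.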
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 1 1 0 s s s s s P P 0 i i i i i 1 0 1 d d d d d Rd=clip(Rs,#u5)
Logical doublewords
Perform bitwise logical AND, OR, XOR, and NOT operations.
The source and destination registers are 64-bit.
For 32-bit logical operations, see the ALU32 logical instructions.
Syntax Behavior
Rdd=and(Rss,Rtt) Rdd=Rss&Rtt;
Rdd=and(Rtt,~Rss) Rdd = (Rtt & ~Rss);
Rdd=not(Rss) Rdd=~Rss;
Rdd=or(Rss,Rtt) Rdd=Rss|Rtt;
Rdd=or(Rtt,~Rss) Rdd = (Rtt | ~Rss);
Rdd=xor(Rss,Rtt) Rdd=Rss^Rtt;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 0 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rdd=not(Rss)
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 1 1 s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=and(Rss,Rtt)
1 1 0 1 0 0 1 1 1 1 1 s s s s s P P - t t t t t 0 0 1 d d d d d Rdd=and(Rtt,~Rss)
1 1 0 1 0 0 1 1 1 1 1 s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=or(Rss,Rtt)
1 1 0 1 0 0 1 1 1 1 1 s s s s s P P - t t t t t 0 1 1 d d d d d Rdd=or(Rtt,~Rss)
1 1 0 1 0 0 1 1 1 1 1 s s s s s P P - t t t t t 1 0 0 d d d d d Rdd=xor(Rss,Rtt)
Logical-logical doublewords
Perform a logical operation of the two source operands, then perform a second logical operation
of the result with the destination register Rxx.
The source and destination registers are 64-bit.
Syntax Behavior
Rxx^=xor(Rss,Rtt) Rxx^=Rss^Rtt;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min x5
1 1 0 0 1 0 1 0 1 0 - s s s s s P P 0 t t t t t 0 0 0 x x x x x Rxx^=xor(Rss,Rtt)
Logical-logical words
Perform a logical operation of the two source operands, then perform a second logical operation
of the result with the destination register Rx.
The source and destination registers are 32-bit.
Syntax Behavior
Rx=or(Ru,and(Rx,#s10)) Rx = Ru | (Rx & apply_extension(#s));
Rx[&|^]=and(Rs,Rt) Rx [&|^]= (Rs & Rt);
Rx[&|^]=and(Rs,~Rt) Rx [&|^]= (Rs & ~Rt);
Rx[&|^]=or(Rs,Rt) Rx [&|^]= (Rs | Rt);
Rx[&|^]=xor(Rs,Rt) Rx [&|^]= (Rs ^ Rt);
Rx|=and(Rs,#s10) Rx = Rx | (Rs & apply_extension(#s));
Rx|=or(Rs,#s10) Rx = Rx | (Rs | apply_extension(#s));
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse x5
1 1 0 1 1 0 1 0 0 0 i s s s s s P P i i i i i i i i i x x x x x Rx|=and(Rs,#s10)
ICLASS RegType x5 Parse u5
1 1 0 1 1 0 1 0 0 1 i x x x x x P P i i i i i i i i i u u u u u Rx=or(Ru,and(Rx,#s10))
ICLASS RegType s5 Parse x5
1 1 0 1 1 0 1 0 1 0 i s s s s s P P i i i i i i i i i x x x x x Rx|=or(Rs,#s10)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 1 1 1 0 0 1 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rx|=and(Rs,~Rt)
1 1 1 0 1 1 1 1 0 0 1 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rx&=and(Rs,~Rt)
1 1 1 0 1 1 1 1 0 0 1 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rx^=and(Rs,~Rt)
1 1 1 0 1 1 1 1 0 1 0 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rx&=and(Rs,Rt)
1 1 1 0 1 1 1 1 0 1 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rx&=or(Rs,Rt)
1 1 1 0 1 1 1 1 0 1 0 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rx&=xor(Rs,Rt)
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
1 1 1 0 1 1 1 1 0 1 0 s s s s s P P 0 t t t t t 0 1 1 x x x x x Rx|=and(Rs,Rt)
1 1 1 0 1 1 1 1 1 0 0 s s s s s P P 0 t t t t t 0 1 1 x x x x x Rx^=xor(Rs,Rt)
1 1 1 0 1 1 1 1 1 1 0 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rx|=or(Rs,Rt)
1 1 1 0 1 1 1 1 1 1 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rx|=xor(Rs,Rt)
1 1 1 0 1 1 1 1 1 1 0 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rx^=and(Rs,Rt)
1 1 1 0 1 1 1 1 1 1 0 s s s s s P P 0 t t t t t 0 1 1 x x x x x Rx^=or(Rs,Rt)
Maximum words
Select either the signed or unsigned maximum of two 32-bit source registers and place it in
destination register Rd.
Syntax Behavior
Rd=max(Rs,Rt) Rd = max(Rs,Rt);
Rd=maxu(Rs,Rt) Rd = max(Rs.uw[0],Rt.uw[0]);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 1 0 1 1 1 0 s s s s s P P - t t t t t 0 - - d d d d d Rd=max(Rs,Rt)
1 1 0 1 0 1 0 1 1 1 0 s s s s s P P - t t t t t 1 - - d d d d d Rd=maxu(Rs,Rt)
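The difference between the signed and unsigned forms shows up only when the sign bit of an operand is set. The following is a minimal C sketch of the two behaviors listed above; the function names `hex_max` and `hex_maxu` are hypothetical and are not Hexagon intrinsics.

```c
#include <stdint.h>

/* Models Rd=max(Rs,Rt): operands compared as signed 32-bit words. */
int32_t hex_max(int32_t rs, int32_t rt) {
    return rs > rt ? rs : rt;
}

/* Models Rd=maxu(Rs,Rt): the same bits compared as unsigned words. */
uint32_t hex_maxu(uint32_t rs, uint32_t rt) {
    return rs > rt ? rs : rt;
}
```

With the bit pattern 0xFFFFFFFF in Rs and 5 in Rt, the signed form treats Rs as -1 and selects 5, while the unsigned form selects 0xFFFFFFFF.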
Maximum doublewords
Select either the signed or unsigned maximum of two 64-bit source registers and place in a
destination register.
Syntax Behavior
Rdd=max(Rss,Rtt) Rdd = max(Rss,Rtt);
Rdd=maxu(Rss,Rtt) Rdd = max(Rss.u64,Rtt.u64);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 1 0 s s s s s P P - t t t t t 1 0 0 d d d d d Rdd=max(Rss,Rtt)
1 1 0 1 0 0 1 1 1 1 0 s s s s s P P - t t t t t 1 0 1 d d d d d Rdd=maxu(Rss,Rtt)
Minimum words
Select either the signed or unsigned minimum of two source registers and place in destination
register Rd.
Syntax Behavior
Rd=min(Rt,Rs) Rd = min(Rt,Rs);
Rd=minu(Rt,Rs) Rd = min(Rt.uw[0],Rs.uw[0]);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 1 0 1 1 0 1 s s s s s P P - t t t t t 0 - - d d d d d Rd=min(Rt,Rs)
1 1 0 1 0 1 0 1 1 0 1 s s s s s P P - t t t t t 1 - - d d d d d Rd=minu(Rt,Rs)
Minimum doublewords
Select either the signed or unsigned minimum of two 64-bit source registers and place in the
destination register Rdd.
Syntax Behavior
Rdd=min(Rtt,Rss) Rdd = min(Rtt,Rss);
Rdd=minu(Rtt,Rss) Rdd = min(Rtt.u64,Rss.u64);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 0 1 s s s s s P P - t t t t t 1 1 0 d d d d d Rdd=min(Rtt,Rss)
1 1 0 1 0 0 1 1 1 0 1 s s s s s P P - t t t t t 1 1 1 d d d d d Rdd=minu(Rtt,Rss)
Modulo wrap
Wrap the Rs value into the modulo range from 0 up to, but not including, Rt.
If Rs is greater than or equal to Rt, wrap it to the bottom of the range by subtracting Rt.
If Rs is less than zero, wrap it to the top of the range by adding Rt.
Otherwise, when Rs fits within the range, no adjustment is necessary. The result is returned in
register Rd.
Syntax Behavior
Rd=modwrap(Rs,Rt) if (Rs < 0) {
Rd = Rs + Rt.uw[0];
} else if (Rs.uw[0] >= Rt.uw[0]) {
Rd = Rs - Rt.uw[0];
} else {
Rd = Rs;
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 1 1 s s s s s P P - t t t t t 1 1 1 d d d d d Rd=modwrap(Rs,Rt)
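The behavior above applies at most one adjustment, so it is a single wrap step rather than a full modulo operation. A minimal C sketch of that behavior follows; the name `hex_modwrap` is hypothetical, and the conversion of the unsigned result back to `int32_t` assumes the usual two's-complement wraparound.

```c
#include <stdint.h>

/* Models Rd=modwrap(Rs,Rt): Rs is signed, Rt is treated as unsigned. */
int32_t hex_modwrap(int32_t rs, uint32_t rt) {
    if (rs < 0) {
        return (int32_t)((uint32_t)rs + rt);   /* wrap up from below zero */
    } else if ((uint32_t)rs >= rt) {
        return (int32_t)((uint32_t)rs - rt);   /* wrap down from above range */
    }
    return rs;                                 /* already in range */
}
```

For example, with Rt = 10, an Rs of 12 yields 2 and an Rs of -3 yields 7, but an Rs of 25 yields 15, because only one subtraction of Rt is performed.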
Negate
The first form of this instruction performs a negate on a 32-bit register with saturation. If the
input is 0x80000000, the result is saturated to 0x7fffffff. The nonsaturating 32-bit register negate
is an ALU32-class instruction and can execute on any slot.
The second form of this instruction negates a 64-bit source register and places the result in
destination Rdd.
Syntax Behavior
Rd=neg(Rs):sat Rd = sat32(-Rs.s64);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 0 0 s s s s s P P - - - - - - 1 0 1 d d d d d Rdd=neg(Rss)
1 0 0 0 1 1 0 0 1 0 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rd=neg(Rs):sat
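The saturating form matters for exactly one input, 0x80000000, whose two's-complement negation does not fit in 32 bits. A minimal C sketch of Rd=neg(Rs):sat is shown below; the name `hex_neg_sat` is hypothetical, and the OVF status bit is not modeled.

```c
#include <stdint.h>

/* Models Rd=neg(Rs):sat by negating in 64 bits and clamping to 32 bits. */
int32_t hex_neg_sat(int32_t rs) {
    int64_t wide = -(int64_t)rs;
    if (wide > INT32_MAX) {
        return INT32_MAX;   /* only reachable when rs == INT32_MIN */
    }
    return (int32_t)wide;
}
```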
Round
Perform either arithmetic (.5 is rounded up) or convergent (.5 is rounded towards even) rounding
to any bit location.
Arithmetic rounding has optional saturation. In this version, the result is saturated to a 32-bit
number after adding the rounding constant. After the rounding and saturation have been
performed, the final result is right shifted using a sign-extending shift.
Syntax Behavior
Rd=cround(Rs,#u5) Rd = (#u==0)?Rs:convround(Rs,2**(#u-1))>>#u;
Rd=cround(Rs,Rt) Rd = (zxt5->32(Rt)==0)?Rs:convround(Rs,2**(zxt5->32(Rt)-1))>>zxt5->32(Rt);
Rd=round(Rs,#u5)[:sat] Rd = ([sat32]((#u==0)?(Rs):round(Rs,2**(#u-1))))>>#u;
Rd=round(Rs,Rt)[:sat] Rd = ([sat32]((zxt5->32(Rt)==0)?(Rs):round(Rs,2**(zxt5->32(Rt)-1))))>>zxt5->32(Rt);
Rd=round(Rss):sat tmp=sat64(Rss+0x080000000ULL);
Rd = tmp.w[1];
Rdd=cround(Rss,#u6) if (#u == 0) {
Rdd = Rss;
} else if ((Rss & (size8s_t)((1LL << (#u - 1)) - 1LL)) == 0) {
src_128 = sxt64->128(Rss);
rndbit_128 = sxt64->128(1LL);
rndbit_128 = (rndbit_128 << #u);
rndbit_128 = (rndbit_128 & src_128);
rndbit_128 = (size8s_t) (rndbit_128 >> 1);
tmp128 = src_128+rndbit_128;
tmp128 = (size8s_t) (tmp128 >> #u);
Rdd = sxt128->64(tmp128);
} else {
size16s_t rndbit_128 = sxt64->128((1LL << (#u - 1)));
size16s_t src_128 = sxt64->128(Rss);
size16s_t tmp128 = src_128+rndbit_128;
tmp128 = (size8s_t) (tmp128 >> #u);
Rdd = sxt128->64(tmp128);
}
Rdd=cround(Rss,Rt) if (zxt6->32(Rt) == 0) {
Rdd = Rss;
} else if ((Rss & (size8s_t)((1LL << (zxt6->32(Rt) - 1)) - 1LL)) == 0) {
src_128 = sxt64->128(Rss);
rndbit_128 = sxt64->128(1LL);
rndbit_128 = (rndbit_128 << zxt6->32(Rt));
rndbit_128 = (rndbit_128 & src_128);
rndbit_128 = (size8s_t) (rndbit_128 >> 1);
tmp128 = src_128+rndbit_128;
tmp128 = (size8s_t) (tmp128 >> zxt6->32(Rt));
Rdd = sxt128->64(tmp128);
} else {
size16s_t rndbit_128 = sxt64->128((1LL << (zxt6->32(Rt) - 1)));
size16s_t src_128 = sxt64->128(Rss);
size16s_t tmp128 = src_128+rndbit_128;
tmp128 = (size8s_t) (tmp128 >> zxt6->32(Rt));
Rdd = sxt128->64(tmp128);
}
Notes
■ This instruction can only execute on a core with the Hexagon audio extensions
■ If saturation occurs during execution of this instruction (a result is clamped to either
maximum or minimum values), the OVF bit in the status register is set. OVF remains set until
explicitly cleared by a transfer to the status register.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 1 1 0 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=round(Rss):sat
1 0 0 0 1 1 0 0 1 1 1 s s s s s P P 0 i i i i i 0 0 - d d d d d Rd=cround(Rs,#u5)
1 0 0 0 1 1 0 0 1 1 1 s s s s s P P 0 i i i i i 1 0 - d d d d d Rd=round(Rs,#u5)
1 0 0 0 1 1 0 0 1 1 1 s s s s s P P 0 i i i i i 1 1 - d d d d d Rd=round(Rs,#u5):sat
1 0 0 0 1 1 0 0 1 1 1 s s s s s P P i i i i i i 0 1 - d d d d d Rdd=cround(Rss,#u6)
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 1 1 0 1 1 - s s s s s P P - t t t t t 0 0 - d d d d d Rd=cround(Rs,Rt)
1 1 0 0 0 1 1 0 1 1 - s s s s s P P - t t t t t 0 1 - d d d d d Rdd=cround(Rss,Rt)
1 1 0 0 0 1 1 0 1 1 - s s s s s P P - t t t t t 1 0 - d d d d d Rd=round(Rs,Rt)
1 1 0 0 0 1 1 0 1 1 - s s s s s P P - t t t t t 1 1 - d d d d d Rd=round(Rs,Rt):sat
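The two rounding modes differ only for exact ties. The sketch below models the nonsaturating immediate forms Rd=round(Rs,#u5) and Rd=cround(Rs,#u5) in C. It assumes that the 32-bit convround helper follows the same steps as the 64-bit cround behavior listed above; the names `hex_round_u5`, `hex_cround_u5`, and `asr64` are hypothetical.

```c
#include <stdint.h>

/* Sign-extending right shift that does not depend on how the compiler
   implements >> for negative signed values. */
static int64_t asr64(int64_t x, unsigned n) {
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* Arithmetic rounding: add half of the output LSB, then shift. */
int32_t hex_round_u5(int32_t rs, unsigned u) {
    if (u == 0) return rs;
    return (int32_t)asr64((int64_t)rs + (INT64_C(1) << (u - 1)), u);
}

/* Convergent rounding: an exact tie rounds to the even result. */
int32_t hex_cround_u5(int32_t rs, unsigned u) {
    if (u == 0) return rs;
    int64_t src = rs;
    int64_t below_half = (INT64_C(1) << (u - 1)) - 1;
    int64_t rnd;
    if ((src & below_half) == 0) {
        /* Exact value or exact tie: add half only when bit u, the LSB
           of the shifted result, is set. */
        rnd = (src & (INT64_C(1) << u)) >> 1;
    } else {
        rnd = INT64_C(1) << (u - 1);
    }
    return (int32_t)asr64(src + rnd, u);
}
```

With u = 1, the input 5 represents 2.5: arithmetic rounding gives 3, convergent rounding gives 2. The input 7 represents 3.5 and both modes give 4.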
Subtract doublewords
Subtract the 64-bit register Rss from register Rtt.
Syntax Behavior
Rd=sub(Rt,Rs):sat:deprecated Rd=sat32(Rt - Rs);
Rdd=sub(Rtt,Rss) Rdd=Rtt-Rss;
Notes
■ If saturation occurs during execution of this instruction (a result is clamped to either
maximum or minimum values), the OVF bit in the status register is set. OVF remains set until
explicitly cleared by a transfer to the status register.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 0 1 s s s s s P P - t t t t t 1 1 1 d d d d d Rdd=sub(Rtt,Rss)
1 1 0 1 0 1 0 1 1 0 0 s s s s s P P - t t t t t 1 - - d d d d d Rd=sub(Rt,Rs):sat:deprecated
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 1 1 1 0 0 0 s s s s s P P 0 t t t t t 0 1 1 x x x x x Rx+=sub(Rt,Rs)
Subtract halfword
Perform a 16-bit subtract with optional saturation and place the result in either the upper or
lower half of a register. If the result goes in the upper half, the sources can be any high or low
halfword of Rs and Rt. The lower 16 bits of the result are zeroed.
If the result is placed in the lower 16 bits of Rd, the Rs source can be either high or low, but the
other source must be the low halfword of Rt. In this case, the upper halfword of Rd is the sign-
extension of the low halfword.
[Figure: halfword subtract data paths for Rd=sub(Rt.[hl],Rs.l)[:sat] and Rd=sub(Rt.[hl],Rs.[hl])[:sat]:<<16, showing the source halfword muxes and the optional saturate stage]
Syntax Behavior
Rd=sub(Rt.L,Rs.[HL])[:sat] Rd=[sat16](Rt.h[0]-Rs.h[01]);
Rd=sub(Rt.[HL],Rs.[HL])[:sat]:<<16 Rd=([sat16](Rt.h[01]-Rs.h[01]))<<16;
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 1 0 1 0 0 1 s s s s s P P - t t t t t 0 0 - d d d d d Rd=sub(Rt.L,Rs.L)
1 1 0 1 0 1 0 1 0 0 1 s s s s s P P - t t t t t 0 1 - d d d d d Rd=sub(Rt.L,Rs.H)
1 1 0 1 0 1 0 1 0 0 1 s s s s s P P - t t t t t 1 0 - d d d d d Rd=sub(Rt.L,Rs.L):sat
1 1 0 1 0 1 0 1 0 0 1 s s s s s P P - t t t t t 1 1 - d d d d d Rd=sub(Rt.L,Rs.H):sat
1 1 0 1 0 1 0 1 0 1 1 s s s s s P P - t t t t t 0 0 0 d d d d d Rd=sub(Rt.L,Rs.L):<<16
1 1 0 1 0 1 0 1 0 1 1 s s s s s P P - t t t t t 0 0 1 d d d d d Rd=sub(Rt.L,Rs.H):<<16
1 1 0 1 0 1 0 1 0 1 1 s s s s s P P - t t t t t 0 1 0 d d d d d Rd=sub(Rt.H,Rs.L):<<16
1 1 0 1 0 1 0 1 0 1 1 s s s s s P P - t t t t t 0 1 1 d d d d d Rd=sub(Rt.H,Rs.H):<<16
1 1 0 1 0 1 0 1 0 1 1 s s s s s P P - t t t t t 1 0 0 d d d d d Rd=sub(Rt.L,Rs.L):sat:<<16
1 1 0 1 0 1 0 1 0 1 1 s s s s s P P - t t t t t 1 0 1 d d d d d Rd=sub(Rt.L,Rs.H):sat:<<16
1 1 0 1 0 1 0 1 0 1 1 s s s s s P P - t t t t t 1 1 0 d d d d d Rd=sub(Rt.H,Rs.L):sat:<<16
1 1 0 1 0 1 0 1 0 1 1 s s s s s P P - t t t t t 1 1 1 d d d d d Rd=sub(Rt.H,Rs.H):sat:<<16
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 1 0 0 0 1 - s s s s s P P - - - - - - 0 0 - d d d d d Rdd=sxtw(Rs)
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 0 1 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rdd=vabsh(Rss)
1 0 0 0 0 0 0 0 0 1 0 s s s s s P P - - - - - - 1 0 1 d d d d d Rdd=vabsh(Rss):sat
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 0 1 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rdd=vabsw(Rss)
1 0 0 0 0 0 0 0 0 1 0 s s s s s P P - - - - - - 1 1 1 d d d d d Rdd=vabsw(Rss):sat
Syntax Behavior
Rdd=vabsdiffb(Rtt,Rss) for (i=0;i<8;i++) {
Rdd.b[i]=ABS(Rtt.b[i] - Rss.b[i]);
}
Rdd=vabsdiffub(Rtt,Rss) for (i=0;i<8;i++) {
Rdd.b[i]=ABS(Rtt.ub[i] - Rss.ub[i]);
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 1 0 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rdd=vabsdiffub(Rtt,Rss)
1 1 1 0 1 0 0 0 1 1 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rdd=vabsdiffb(Rtt,Rss)
Syntax Behavior
Rdd=vabsdiffh(Rtt,Rss) for (i=0;i<4;i++) {
Rdd.h[i]=ABS(Rtt.h[i] - Rss.h[i]);
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 0 1 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rdd=vabsdiffh(Rtt,Rss)
Syntax Behavior
Rdd=vabsdiffw(Rtt,Rss) for (i=0;i<2;i++) {
Rdd.w[i]=ABS(Rtt.w[i] - Rss.w[i]);
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 0 0 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rdd=vabsdiffw(Rtt,Rss)
[Figure: Rxx,Pd=vacsh(Rss,Rtt) data path. Each 16-bit lane performs an add and a subtract, applies sat16, compares the two values (>), and uses a 2:1 select to produce the 16-bit result and a 1-bit predicate output]
Syntax Behavior
Rxx,Pe=vacsh(Rss,Rtt) for (i = 0; i < 4; i++) {
xv = (int) Rxx.h[i];
sv = (int) Rss.h[i];
tv = (int) Rtt.h[i];
xv = xv + tv;
sv = sv - tv;
Pe.i*2 = (xv > sv);
Pe.i*2+1 = (xv > sv);
Rxx.h[i]=sat16(max(xv,sv));
}
Notes
■ The predicate generated by this instruction cannot be used as a .new predicate, nor can it be
automatically ANDed with another predicate.
■ If saturation occurs during execution of this instruction (a result is clamped to either
maximum or minimum values), the OVF bit in the status register is set. OVF remains set until
explicitly cleared by a transfer to the status register.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 e2 x5
1 1 1 0 1 0 1 0 1 0 1 s s s s s P P 0 t t t t t 0 e e x x x x x Rxx,Pe=vacsh(Rss,Rtt)
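The add-compare-select behavior above can be sketched in C as follows. Registers are modeled as arrays of four halfwords and the predicate as an 8-bit mask. The names `hex_vacsh` and `sat16` are hypothetical, and the OVF status bit is not modeled.

```c
#include <stdint.h>

/* Clamp a 32-bit value to the signed 16-bit range. */
static int16_t sat16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Models Rxx,Pe=vacsh(Rss,Rtt). rxx is updated in place; the return
   value is the predicate, with two identical bits per halfword lane. */
uint8_t hex_vacsh(int16_t rxx[4], const int16_t rss[4], const int16_t rtt[4]) {
    uint8_t pe = 0;
    for (int i = 0; i < 4; i++) {
        int32_t xv = (int32_t)rxx[i] + rtt[i];   /* add path */
        int32_t sv = (int32_t)rss[i] - rtt[i];   /* subtract path */
        if (xv > sv) {
            pe |= (uint8_t)(3u << (2 * i));      /* set bits 2i and 2i+1 */
        }
        rxx[i] = sat16(xv > sv ? xv : sv);       /* keep the larger value */
    }
    return pe;
}
```

Each lane sets a pair of predicate bits, so the 8-bit predicate lines up with the byte positions of the four halfword results.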
Notes
■ If saturation occurs during execution of this instruction (a result is clamped to either
maximum or minimum values), the OVF bit in the status register is set. OVF remains set until
explicitly cleared by a transfer to the status register.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 0 0 s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=vaddh(Rss,Rtt)
1 1 0 1 0 0 1 1 0 0 0 s s s s s P P - t t t t t 0 1 1 d d d d d Rdd=vaddh(Rss,Rtt):sat
1 1 0 1 0 0 1 1 0 0 0 s s s s s P P - t t t t t 1 0 0 d d d d d Rdd=vadduh(Rss,Rtt):sat
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 1 0 1 - s s s s s P P - t t t t t 0 0 1 d d d d d Rd=vaddhub(Rss,Rtt):sat
[Figure: vraddub. Corresponding bytes of Rss and Rtt are added and the sums are reduced into Rdd]
Syntax Behavior
Rdd=vraddub(Rss,Rtt) Rdd = 0;
for (i=0;i<4;i++) {
Rdd.w[0]=(Rdd.w[0] + (Rss.ub[i]+Rtt.ub[i]));
}
for (i=4;i<8;i++) {
Rdd.w[1]=(Rdd.w[1] + (Rss.ub[i]+Rtt.ub[i]));
}
Rxx+=vraddub(Rss,Rtt) for (i = 0; i < 4; i++) {
Rxx.w[0]=(Rxx.w[0] + (Rss.ub[i]+Rtt.ub[i]));
}
for (i = 4; i < 8; i++) {
Rxx.w[1]=(Rxx.w[1] + (Rss.ub[i]+Rtt.ub[i]));
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 0 1 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rdd=vraddub(Rss,Rtt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 0 1 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rxx+=vraddub(Rss,Rtt)
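The reduction above produces two independent word sums, one for the low four byte lanes and one for the high four. A minimal C sketch, with the 64-bit register pairs modeled as `uint64_t` values, is shown below; the name `hex_vraddub` is hypothetical.

```c
#include <stdint.h>

/* Models Rdd=vraddub(Rss,Rtt): bytes 0..3 accumulate into the low word,
   bytes 4..7 into the high word. */
uint64_t hex_vraddub(uint64_t rss, uint64_t rtt) {
    uint32_t lo = 0, hi = 0;
    for (int i = 0; i < 8; i++) {
        uint32_t sum = (uint32_t)((rss >> (8 * i)) & 0xFF)
                     + (uint32_t)((rtt >> (8 * i)) & 0xFF);
        if (i < 4) {
            lo += sum;
        } else {
            hi += sum;
        }
    }
    return ((uint64_t)hi << 32) | lo;
}
```

The accumulating form Rxx+=vraddub(Rss,Rtt) differs only in that the two word sums start from the existing words of Rxx instead of zero.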
[Figure: vraddh. Corresponding halfwords of Rss and Rtt are added and the sums are reduced into Rd]
Syntax Behavior
Rd=vraddh(Rss,Rtt) Rd = 0;
for (i=0;i<4;i++) {
Rd += (Rss.h[i]+Rtt.h[i]);
}
Rd=vradduh(Rss,Rtt) Rd = 0;
for (i=0;i<4;i++) {
Rd += (Rss.uh[i]+Rtt.uh[i]);
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 1 0 - - s s s s s P P 0 t t t t t - 0 1 d d d d d Rd=vradduh(Rss,Rtt)
1 1 1 0 1 0 0 1 0 - 1 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rd=vraddh(Rss,Rtt)
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 0 0 s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=vaddub(Rss,Rtt)
1 1 0 1 0 0 1 1 0 0 0 s s s s s P P - t t t t t 0 0 1 d d d d d Rdd=vaddub(Rss,Rtt):sat
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 0 0 s s s s s P P - t t t t t 1 0 1 d d d d d Rdd=vaddw(Rss,Rtt)
1 1 0 1 0 0 1 1 0 0 0 s s s s s P P - t t t t t 1 1 0 d d d d d Rdd=vaddw(Rss,Rtt):sat
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 1 0 s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=vavgh(Rss,Rtt)
1 1 0 1 0 0 1 1 0 1 0 s s s s s P P - t t t t t 0 1 1 d d d d d Rdd=vavgh(Rss,Rtt):rnd
1 1 0 1 0 0 1 1 0 1 0 s s s s s P P - t t t t t 1 0 0 d d d d d Rdd=vavgh(Rss,Rtt):crnd
1 1 0 1 0 0 1 1 0 1 0 s s s s s P P - t t t t t 1 0 1 d d d d d Rdd=vavguh(Rss,Rtt)
1 1 0 1 0 0 1 1 0 1 0 s s s s s P P - t t t t t 1 1 - d d d d d Rdd=vavguh(Rss,Rtt):rnd
1 1 0 1 0 0 1 1 1 0 0 s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=vnavgh(Rtt,Rss)
1 1 0 1 0 0 1 1 1 0 0 s s s s s P P - t t t t t 0 0 1 d d d d d Rdd=vnavgh(Rtt,Rss):rnd:sat
1 1 0 1 0 0 1 1 1 0 0 s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=vnavgh(Rtt,Rss):crnd:sat
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 1 0 s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=vavgub(Rss,Rtt)
1 1 0 1 0 0 1 1 0 1 0 s s s s s P P - t t t t t 0 0 1 d d d d d Rdd=vavgub(Rss,Rtt):rnd
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 1 1 s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=vavgw(Rss,Rtt)
1 1 0 1 0 0 1 1 0 1 1 s s s s s P P - t t t t t 0 0 1 d d d d d Rdd=vavgw(Rss,Rtt):rnd
1 1 0 1 0 0 1 1 0 1 1 s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=vavgw(Rss,Rtt):crnd
1 1 0 1 0 0 1 1 0 1 1 s s s s s P P - t t t t t 0 1 1 d d d d d Rdd=vavguw(Rss,Rtt)
1 1 0 1 0 0 1 1 0 1 1 s s s s s P P - t t t t t 1 0 0 d d d d d Rdd=vavguw(Rss,Rtt):rnd
1 1 0 1 0 0 1 1 1 0 0 s s s s s P P - t t t t t 0 1 1 d d d d d Rdd=vnavgw(Rtt,Rss)
1 1 0 1 0 0 1 1 1 0 0 s s s s s P P - t t t t t 1 0 - d d d d d Rdd=vnavgw(Rtt,Rss):rnd:sat
1 1 0 1 0 0 1 1 1 0 0 s s s s s P P - t t t t t 1 1 - d d d d d Rdd=vnavgw(Rtt,Rss):crnd:sat
Notes
■ This instruction can only execute on a core with the Hexagon audio extensions
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 1 1 0 s s s s s P P 0 i i i i i 1 1 0 d d d d d Rdd=vclip(Rss,#u5)
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 1 1 1 1 - s s s s s P P - t t t t t 0 1 - d d d d d Rdd=vcnegh(Rss,Rt)
ICLASS RegType Maj s5 Parse t5 Min x5
1 1 0 0 1 0 1 1 0 0 1 s s s s s P P 1 t t t t t 1 1 1 x x x x x Rxx+=vrcnegh(Rss,Rt)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 1 0 s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=vmaxub(Rtt,Rss)
1 1 0 1 0 0 1 1 1 1 0 s s s s s P P - t t t t t 1 1 0 d d d d d Rdd=vmaxb(Rtt,Rss)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 1 0 s s s s s P P - t t t t t 0 0 1 d d d d d Rdd=vmaxh(Rtt,Rss)
1 1 0 1 0 0 1 1 1 1 0 s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=vmaxuh(Rtt,Rss)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse x5 Min u5
1 1 0 0 1 0 1 1 0 0 1 s s s s s P P 0 x x x x x 0 0 1 u u u u u Rxx=vrmaxh(Rss,Ru)
1 1 0 0 1 0 1 1 0 0 1 s s s s s P P 1 x x x x x 0 0 1 u u u u u Rxx=vrmaxuh(Rss,Ru)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse x5 Min u5
1 1 0 0 1 0 1 1 0 0 1 s s s s s P P 0 x x x x x 0 1 0 u u u u u Rxx=vrmaxw(Rss,Ru)
1 1 0 0 1 0 1 1 0 0 1 s s s s s P P 1 x x x x x 0 1 0 u u u u u Rxx=vrmaxuw(Rss,Ru)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 0 1 s s s s s P P - t t t t t 1 0 1 d d d d d Rdd=vmaxuw(Rtt,Rss)
1 1 0 1 0 0 1 1 1 1 0 s s s s s P P - t t t t t 0 1 1 d d d d d Rdd=vmaxw(Rtt,Rss)
Notes
■ The predicate generated by this instruction cannot be used as a .new predicate, nor can it be
automatically ANDed with another predicate.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 0 1 s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=vminub(Rtt,Rss)
1 1 0 1 0 0 1 1 1 1 0 s s s s s P P - t t t t t 1 1 1 d d d d d Rdd=vminb(Rtt,Rss)
ICLASS RegType MajOp s5 Parse t5 e2 d5
1 1 1 0 1 0 1 0 1 1 1 s s s s s P P 0 t t t t t 0 e e d d d d d Rdd,Pe=vminub(Rtt,Rss)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 0 1 s s s s s P P - t t t t t 0 0 1 d d d d d Rdd=vminh(Rtt,Rss)
1 1 0 1 0 0 1 1 1 0 1 s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=vminuh(Rtt,Rss)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse x5 Min u5
1 1 0 0 1 0 1 1 0 0 1 s s s s s P P 0 x x x x x 1 0 1 u u u u u Rxx=vrminh(Rss,Ru)
1 1 0 0 1 0 1 1 0 0 1 s s s s s P P 1 x x x x x 1 0 1 u u u u u Rxx=vrminuh(Rss,Ru)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse x5 Min u5
1 1 0 0 1 0 1 1 0 0 1 s s s s s P P 0 x x x x x 1 1 0 u u u u u Rxx=vrminw(Rss,Ru)
1 1 0 0 1 0 1 1 0 0 1 s s s s s P P 1 x x x x x 1 1 0 u u u u u Rxx=vrminuw(Rss,Ru)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 1 0 1 s s s s s P P - t t t t t 0 1 1 d d d d d Rdd=vminw(Rtt,Rss)
1 1 0 1 0 0 1 1 1 0 1 s s s s s P P - t t t t t 1 0 0 d d d d d Rdd=vminuw(Rtt,Rss)
Syntax Behavior
Rdd=vrsadub(Rss,Rtt) Rdd = 0;
for (i = 0; i < 4; i++) {
Rdd.w[0]=(Rdd.w[0] + ABS((Rss.ub[i] -
Rtt.ub[i])));
}
for (i = 4; i < 8; i++) {
Rdd.w[1]=(Rdd.w[1] + ABS((Rss.ub[i] -
Rtt.ub[i])));
}
Rxx+=vrsadub(Rss,Rtt) for (i = 0; i < 4; i++) {
Rxx.w[0]=(Rxx.w[0] + ABS((Rss.ub[i] - Rtt.ub[i])));
}
for (i = 4; i < 8; i++) {
Rxx.w[1]=(Rxx.w[1] + ABS((Rss.ub[i] -
Rtt.ub[i])));
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 0 1 0 s s s s s P P 0 t t t t t 0 1 0 d d d d d Rdd=vrsadub(Rss,Rtt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 0 1 0 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rxx+=vrsadub(Rss,Rtt)
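The sum-of-absolute-differences behavior above can be sketched in C with one function that covers both forms. Passing 0 as the accumulator models Rdd=vrsadub(Rss,Rtt), and passing the current value of Rxx models Rxx+=vrsadub(Rss,Rtt). The name `hex_vrsadub` and the accumulator parameter are hypothetical.

```c
#include <stdint.h>

/* Sum of absolute differences of unsigned bytes, reduced separately
   over bytes 0..3 (low word) and bytes 4..7 (high word). */
uint64_t hex_vrsadub(uint64_t rss, uint64_t rtt, uint64_t acc) {
    uint32_t lo = (uint32_t)acc;
    uint32_t hi = (uint32_t)(acc >> 32);
    for (int i = 0; i < 8; i++) {
        int s = (int)((rss >> (8 * i)) & 0xFF);
        int t = (int)((rtt >> (8 * i)) & 0xFF);
        uint32_t d = (uint32_t)(s > t ? s - t : t - s);
        if (i < 4) {
            lo += d;
        } else {
            hi += d;
        }
    }
    return ((uint64_t)hi << 32) | lo;
}
```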
Syntax Behavior
Rdd=vsubh(Rtt,Rss)[:sat] for (i=0;i<4;i++) {
Rdd.h[i]=[sat16](Rtt.h[i]-Rss.h[i]);
}
Rdd=vsubuh(Rtt,Rss):sat for (i=0;i<4;i++) {
Rdd.h[i]=usat16(Rtt.uh[i]-Rss.uh[i]);
}
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 0 1 s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=vsubh(Rtt,Rss)
1 1 0 1 0 0 1 1 0 0 1 s s s s s P P - t t t t t 0 1 1 d d d d d Rdd=vsubh(Rtt,Rss):sat
1 1 0 1 0 0 1 1 0 0 1 s s s s s P P - t t t t t 1 0 0 d d d d d Rdd=vsubuh(Rtt,Rss):sat
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 0 1 s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=vsubub(Rtt,Rss)
1 1 0 1 0 0 1 1 0 0 1 s s s s s P P - t t t t t 0 0 1 d d d d d Rdd=vsubub(Rtt,Rss):sat
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 0 1 1 0 0 1 s s s s s P P - t t t t t 1 0 1 d d d d d Rdd=vsubw(Rtt,Rss)
1 1 0 1 0 0 1 1 0 0 1 s s s s s P P - t t t t t 1 1 0 d d d d d Rdd=vsubw(Rtt,Rss):sat
Count leading
Count leading zeros (cl0) counts the number of consecutive zeros starting with the most
significant bit.
Count leading ones (cl1) counts the number of consecutive ones starting with the most significant
bit.
Count leading bits (clb) counts both leading ones and leading zeros and then selects the
maximum.
The normamt instruction returns the number of leading bits minus one.
For a two's-complement number, the number of leading zeros is zero for negative numbers. The
number of leading ones is zero for non-negative numbers.
The number of leading bits can be used to judge the magnitude of the value.
Syntax Behavior
Rd=add(clb(Rs),#s6) Rd = (max(count_leading_ones(Rs),count_leading_ones(~Rs)))+#s;
Rd=add(clb(Rss),#s6) Rd = (max(count_leading_ones(Rss),count_leading_ones(~Rss)))+#s;
Rd=cl0(Rs) Rd = count_leading_ones(~Rs);
Rd=cl0(Rss) Rd = count_leading_ones(~Rss);
Rd=cl1(Rs) Rd = count_leading_ones(Rs);
Rd=cl1(Rss) Rd = count_leading_ones(Rss);
Rd=clb(Rs) Rd = max(count_leading_ones(Rs),count_leading_ones(~Rs));
Rd=clb(Rss) Rd = max(count_leading_ones(Rss),count_leading_ones(~Rss));
Rd=normamt(Rs) if (Rs == 0) {
Rd = 0;
} else {
Rd = (max(count_leading_ones(Rs),count_leading_ones(~Rs)))-1;
}
Rd=normamt(Rss) if (Rss == 0) {
Rd = 0;
} else {
Rd = (max(count_leading_ones(Rss),count_leading_ones(~Rss)))-1;
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 0 1 0 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=clb(Rss)
1 0 0 0 1 0 0 0 0 1 0 s s s s s P P - - - - - - 0 1 0 d d d d d Rd=cl0(Rss)
1 0 0 0 1 0 0 0 0 1 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rd=cl1(Rss)
1 0 0 0 1 0 0 0 0 1 1 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=normamt(Rss)
1 0 0 0 1 0 0 0 0 1 1 s s s s s P P i i i i i i 0 1 0 d d d d d Rd=add(clb(Rss),#s6)
1 0 0 0 1 1 0 0 0 0 1 s s s s s P P i i i i i i 0 0 0 d d d d d Rd=add(clb(Rs),#s6)
1 0 0 0 1 1 0 0 0 0 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rd=clb(Rs)
1 0 0 0 1 1 0 0 0 0 0 s s s s s P P - - - - - - 1 0 1 d d d d d Rd=cl0(Rs)
1 0 0 0 1 1 0 0 0 0 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rd=cl1(Rs)
1 0 0 0 1 1 0 0 0 0 0 s s s s s P P - - - - - - 1 1 1 d d d d d Rd=normamt(Rs)
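All four 32-bit operations above are defined in terms of a single count_leading_ones primitive. The following C sketch mirrors that structure for the 32-bit forms; the `hex_` function names are hypothetical.

```c
#include <stdint.h>

/* Number of consecutive 1 bits starting at bit 31. */
unsigned hex_count_leading_ones32(uint32_t x) {
    unsigned n = 0;
    while (n < 32 && (x & (UINT32_C(1) << (31 - n)))) {
        n++;
    }
    return n;
}

unsigned hex_cl1(uint32_t rs) { return hex_count_leading_ones32(rs); }
unsigned hex_cl0(uint32_t rs) { return hex_count_leading_ones32(~rs); }

/* clb: the larger of the leading-zero and leading-one counts. */
unsigned hex_clb(uint32_t rs) {
    unsigned zeros = hex_cl0(rs);
    unsigned ones = hex_cl1(rs);
    return zeros > ones ? zeros : ones;
}

/* normamt: leading bits minus one, with 0 defined to return 0. */
unsigned hex_normamt(uint32_t rs) {
    return rs == 0 ? 0 : hex_clb(rs) - 1;
}
```

For a nonzero signed value, normamt gives the left shift that moves the most significant magnitude bit next to the sign bit without changing the sign. For example, the input 1 yields 30.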
Count population
The population count (popcount) instruction counts the number of set bits in Rss.
Syntax Behavior
Rd=popcount(Rss) Rd = count_ones(Rss);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 0 1 1 s s s s s P P - - - - - - 0 1 1 d d d d d Rd=popcount(Rss)
Count trailing
Count trailing zeros (ct0) counts the number of consecutive zeros starting with the least
significant bit.
Count trailing ones (ct1) counts the number of consecutive ones starting with the least significant
bit.
Syntax Behavior
Rd=ct0(Rs) Rd = count_leading_ones(~reverse_bits(Rs));
Rd=ct0(Rss) Rd = count_leading_ones(~reverse_bits(Rss));
Rd=ct1(Rs) Rd = count_leading_ones(reverse_bits(Rs));
Rd=ct1(Rss) Rd = count_leading_ones(reverse_bits(Rss));
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 1 1 1 s s s s s P P - - - - - - 0 1 0 d d d d d Rd=ct0(Rss)
1 0 0 0 1 0 0 0 1 1 1 s s s s s P P - - - - - - 1 0 0 d d d d d Rd=ct1(Rss)
1 0 0 0 1 1 0 0 0 1 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rd=ct0(Rs)
1 0 0 0 1 1 0 0 0 1 0 s s s s s P P - - - - - - 1 0 1 d d d d d Rd=ct1(Rs)
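The behavior column expresses the trailing counts as a bit reversal followed by a leading-ones count. Counting directly from bit 0 gives the same result, as in this C sketch of the 32-bit forms; the names `hex_ct0` and `hex_ct1` are hypothetical.

```c
#include <stdint.h>

/* Models Rd=ct0(Rs): consecutive 0 bits starting at bit 0. */
unsigned hex_ct0(uint32_t rs) {
    unsigned n = 0;
    while (n < 32 && ((rs >> n) & 1u) == 0) {
        n++;
    }
    return n;
}

/* Models Rd=ct1(Rs): trailing ones are the trailing zeros of ~Rs. */
unsigned hex_ct1(uint32_t rs) {
    return hex_ct0(~rs);
}
```

An all-zero input to ct0, or an all-one input to ct1, returns 32 in this model.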
[Figure: bit-field extract from Rs into Rd, showing the Width and Offset of the field and the zero extension of the result]
Syntax Behavior
Rd=extract(Rs,#u5,#U5) width=#u;
offset=#U;
Rd = sxtwidth->32((Rs >> offset));
Rd=extract(Rs,Rtt) width=zxt6->32((Rtt.w[1]));
offset=sxt7->32((Rtt.w[0]));
Rd = sxtwidth->64((offset>0)?(zxt32->64(zxt32->64(Rs))>>>offset):(zxt32->64(zxt32->64(Rs))<<offset));
Rd=extractu(Rs,#u5,#U5) width=#u;
offset=#U;
Rd = zxtwidth->32((Rs >> offset));
Rd=extractu(Rs,Rtt) width=zxt6->32((Rtt.w[1]));
offset=sxt7->32((Rtt.w[0]));
Rd = zxtwidth->64((offset>0)?(zxt32->64(zxt32->64(Rs))>>>offset):(zxt32->64(zxt32->64(Rs))<<offset));
Rdd=extract(Rss,#u6,#U6) width=#u;
offset=#U;
Rdd = sxtwidth->64((Rss >> offset));
Rdd=extract(Rss,Rtt) width=zxt6->32((Rtt.w[1]));
offset=sxt7->32((Rtt.w[0]));
Rdd = sxtwidth->64((offset>0)?(Rss>>>offset):(Rss<<offset));
Rdd=extractu(Rss,#u6,#U6) width=#u;
offset=#U;
Rdd = zxtwidth->64((Rss >> offset));
Rdd=extractu(Rss,Rtt) width=zxt6->32((Rtt.w[1]));
offset=sxt7->32((Rtt.w[0]));
Rdd = zxtwidth->64((offset>0)?(Rss>>>offset):(Rss<<offset));
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 1 I I I s s s s s P P i i i i i i I I I d d d d d Rdd=extractu(Rss,#u6,#U6)
1 0 0 0 1 0 1 0 I I I s s s s s P P i i i i i i I I I d d d d d Rdd=extract(Rss,#u6,#U6)
1 0 0 0 1 1 0 1 0 I I s s s s s P P 0 i i i i i I I I d d d d d Rd=extractu(Rs,#u5,#U5)
1 0 0 0 1 1 0 1 1 I I s s s s s P P 0 i i i i i I I I d d d d d Rd=extract(Rs,#u5,#U5)
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 1 0 0 - s s s s s P P - t t t t t 0 0 - d d d d d Rdd=extractu(Rss,Rtt)
1 1 0 0 0 0 0 1 1 1 - s s s s s P P - t t t t t 1 0 - d d d d d Rdd=extract(Rss,Rtt)
1 1 0 0 1 0 0 1 0 0 - s s s s s P P - t t t t t 0 0 - d d d d d Rd=extractu(Rs,Rtt)
1 1 0 0 1 0 0 1 0 0 - s s s s s P P - t t t t t 0 1 - d d d d d Rd=extract(Rs,Rtt)
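The immediate 32-bit forms differ only in how the extracted field is extended. The C sketch below models Rd=extractu(Rs,#u5,#U5) and Rd=extract(Rs,#u5,#U5). The names `hex_extractu` and `hex_extract` are hypothetical, and returning 0 for a width of 0 is an assumption of this sketch rather than something stated in the table above.

```c
#include <stdint.h>

/* Zero-extended field: shift the field down to bit 0 and mask it. */
uint32_t hex_extractu(uint32_t rs, unsigned width, unsigned offset) {
    uint32_t field = rs >> offset;              /* offset is 0..31 */
    uint32_t mask = (width >= 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    return field & mask;
}

/* Sign-extended field: replicate the top bit of the field upward. */
int32_t hex_extract(uint32_t rs, unsigned width, unsigned offset) {
    uint32_t field = hex_extractu(rs, width, offset);
    if (width == 0 || width >= 32) {
        return (int32_t)field;
    }
    uint32_t sign = UINT32_C(1) << (width - 1);
    return (int32_t)((field ^ sign) - sign);    /* two's-complement extend */
}
```

For example, a 4-bit field holding 0xF at offset 8 reads as 15 with extractu and as -1 with extract.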
[Figure: bit-field insert of Width bits from Rs into Rd at Offset; the bits of Rd outside the field are unchanged]
Syntax Behavior
Rx=insert(Rs,#u5,#U5) width=#u;
offset=#U;
Rx &= ~(((1<<width)-1)<<offset);
Rx |= ((Rs & ((1<<width)-1)) << offset);
Rx=insert(Rs,Rtt) width=zxt6->32((Rtt.w[1]));
offset=sxt7->32((Rtt.w[0]));
mask = ((1<<width)-1);
if (offset < 0) {
Rx = 0;
} else {
Rx &= ~(mask<<offset);
Rx |= ((Rs & mask) << offset);
}
Rxx=insert(Rss,#u6,#U6) width=#u;
offset=#U;
Rxx &= ~(((1<<width)-1)<<offset);
Rxx |= ((Rss & ((1<<width)-1)) << offset);
Rxx=insert(Rss,Rtt) width=zxt6->32((Rtt.w[1]));
offset=sxt7->32((Rtt.w[0]));
mask = ((1<<width)-1);
if (offset < 0) {
Rxx = 0;
} else {
Rxx &= ~(mask<<offset);
Rxx |= ((Rss & mask) << offset);
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp x5
1 0 0 0 0 0 1 1 I I I s s s s s P P i i i i i i I I I x x x x x Rxx=insert(Rss,#u6,#U6)
1 0 0 0 1 1 1 1 0 I I s s s s s P P 0 i i i i i I I I x x x x x Rx=insert(Rs,#u5,#U5)
ICLASS RegType s5 Parse t5 x5
1 1 0 0 1 0 0 0 - - - s s s s s P P - t t t t t - - - x x x x x Rx=insert(Rs,Rtt)
ICLASS RegType Maj s5 Parse t5 x5
1 1 0 0 1 0 1 0 0 - - s s s s s P P 0 t t t t t - - - x x x x x Rxx=insert(Rss,Rtt)
Interleave/deinterleave
For interleave, bits I+32 of Rss (which are the bits from the upper source word) are placed in the
odd bits (I*2)+1 of Rdd, while bits I of Rss (which are the bits from the lower source word) are
placed in the even bits (I*2) of Rdd.
For deinterleave, the even bits of the source register are placed in the even register of the result
pair, and the odd bits of the source register are placed in the odd register of the result pair.
"r1:0 = deinterleave(r1:0)" is the inverse of "r1:0 = interleave(r1:0)".
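The bit mapping described above can be sketched in portable C. This is a minimal model, not the hardware implementation; the function names `interleave64` and `deinterleave64` are hypothetical, and the sketch assumes the "even register" of the result pair is the low word and the "odd register" is the high word.

```c
#include <stdint.h>

/* Interleave: bit i of the low source word lands in even bit 2*i,
 * bit i of the high source word lands in odd bit 2*i+1. */
uint64_t interleave64(uint64_t rss)
{
    uint32_t lo = (uint32_t)rss;
    uint32_t hi = (uint32_t)(rss >> 32);
    uint64_t rdd = 0;
    for (int i = 0; i < 32; i++) {
        rdd |= (uint64_t)((lo >> i) & 1u) << (2 * i);
        rdd |= (uint64_t)((hi >> i) & 1u) << (2 * i + 1);
    }
    return rdd;
}

/* Deinterleave: even source bits are gathered into the low word,
 * odd source bits into the high word (assumed register mapping). */
uint64_t deinterleave64(uint64_t rss)
{
    uint32_t even = 0, odd = 0;
    for (int i = 0; i < 32; i++) {
        even |= (uint32_t)((rss >> (2 * i)) & 1u) << i;
        odd  |= (uint32_t)((rss >> (2 * i + 1)) & 1u) << i;
    }
    return ((uint64_t)odd << 32) | even;
}
```

Under these assumptions, applying `deinterleave64` to the output of `interleave64` returns the original value, which matches the inverse relationship stated above.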
Syntax Behavior
Rdd=deinterleave(Rss) Rdd = deinterleave(ODD,EVEN);
Rdd=interleave(Rss) Rdd = interleave(Rss.w[1],Rss.w[0]);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 1 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rdd=deinterleave(Rss)
1 0 0 0 0 0 0 0 1 1 0 s s s s s P P - - - - - - 1 0 1 d d d d d Rdd=interleave(Rss)
Syntax Behavior
Rdd=lfs(Rss,Rtt) Rdd = (Rss.u64 >> 1) | ((1&count_ones(Rss & Rtt)).u64<<63);
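As an illustration of the behavior above, the following C sketch models one `lfs` step: the source is shifted right by one bit and the parity of the masked source is fed back into bit 63. The names `lfs_step` and `parity_bit64` are hypothetical helpers, not part of any Hexagon toolchain.

```c
#include <stdint.h>

/* Parity (1 if an odd number of bits are set) of a 64-bit value. */
static uint64_t parity_bit64(uint64_t x)
{
    x ^= x >> 32;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;
}

/* One shift step: rss is the register state, rtt selects the feedback taps. */
uint64_t lfs_step(uint64_t rss, uint64_t rtt)
{
    uint64_t feedback = parity_bit64(rss & rtt);
    return (rss >> 1) | (feedback << 63);
}
```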
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 1 1 0 - s s s s s P P - t t t t t 1 1 0 d d d d d Rdd=lfs(Rss,Rtt)
Masked parity
Count the number of ones of the logical AND of the two source input values, and take the least
significant bit of that sum.
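A minimal C sketch of the operation described above: AND the two sources, then reduce the result to a single parity bit by XOR folding. The function names are hypothetical.

```c
#include <stdint.h>

/* 32-bit form: returns 1 if (rs & rt) has an odd number of set bits. */
uint32_t masked_parity32(uint32_t rs, uint32_t rt)
{
    uint32_t x = rs & rt;
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;
}

/* 64-bit form: fold the two halves together, then reuse the 32-bit path. */
uint32_t masked_parity64(uint64_t rss, uint64_t rtt)
{
    uint64_t x = rss & rtt;
    x ^= x >> 32;
    return masked_parity32((uint32_t)x, 0xFFFFFFFFu);
}
```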
Syntax Behavior
Rd=parity(Rs,Rt) Rd = 1&count_ones(Rs & Rt);
Rd=parity(Rss,Rtt) Rd = 1&count_ones(Rss & Rtt);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 d5
1 1 0 1 0 0 0 0 - - - s s s s s P P - t t t t t - - - d d d d d Rd=parity(Rss,Rtt)
1 1 0 1 0 1 0 1 1 1 1 s s s s s P P - t t t t t - - - d d d d d Rd=parity(Rs,Rt)
Bit reverse
Reverse the order of bits. For a 32-bit source, the most significant bit swaps with the least
significant bit, bit 30 swaps with bit 1, and so on.
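The reversal can be modeled in C with a simple shift loop. This is a sketch for checking expected values, not a description of how the hardware computes the result; `brev32` and `brev64` are hypothetical names.

```c
#include <stdint.h>

/* Reverse the bit order of a 32-bit value: bit 31 <-> bit 0, bit 30 <-> bit 1, ... */
uint32_t brev32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}

/* Same operation for a 64-bit register pair. */
uint64_t brev64(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}
```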
Syntax Behavior
Rd=brev(Rs) Rd = reverse_bits(Rs);
Rdd=brev(Rss) Rdd = reverse_bits(Rss);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 1 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rdd=brev(Rss)
1 0 0 0 1 1 0 0 0 1 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rd=brev(Rs)
Set/clear/toggle bit
Set (to 1), clear (to 0), or toggle a single bit in the source, and place the resulting value in the
destination. Indicate the bit to manipulate using an immediate or register value.
If a register is used to indicate the bit position, and the value of the least-significant 7 bits of Rt is
out of range, the destination register is unchanged.
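The register forms can be sketched in C as follows. The sketch assumes that "out of range" means the sign-extended 7-bit value of Rt lies outside 0 to 31, in which case the bit mask is empty and the result equals the source. The helper and function names are hypothetical.

```c
#include <stdint.h>

/* Build a single-bit mask from the low 7 bits of rt, treated as signed.
 * Positions outside 0..31 yield an empty mask (assumed interpretation). */
static uint32_t bit_mask_from_reg(uint32_t rt)
{
    int32_t n = (int32_t)(rt & 0x7Fu);
    if (n & 0x40)
        n -= 128;                 /* sign-extend the 7-bit field */
    return (n >= 0 && n < 32) ? (1u << n) : 0u;
}

uint32_t setbit_r(uint32_t rs, uint32_t rt)    { return rs |  bit_mask_from_reg(rt); }
uint32_t clrbit_r(uint32_t rs, uint32_t rt)    { return rs & ~bit_mask_from_reg(rt); }
uint32_t togglebit_r(uint32_t rs, uint32_t rt) { return rs ^  bit_mask_from_reg(rt); }
```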
Syntax Behavior
Rd=clrbit(Rs,#u5) Rd = (Rs & (~(1<<#u)));
Rd=clrbit(Rs,Rt) Rd = (Rs & (~((sxt7->32(Rt)>0)?(zxt32->64(1)<<sxt7->32(Rt)):(zxt32->64(1)>>>sxt7->32(Rt)))));
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 1 0 0 1 1 0 s s s s s P P 0 i i i i i 0 0 0 d d d d d Rd=setbit(Rs,#u5)
1 0 0 0 1 1 0 0 1 1 0 s s s s s P P 0 i i i i i 0 0 1 d d d d d Rd=clrbit(Rs,#u5)
1 0 0 0 1 1 0 0 1 1 0 s s s s s P P 0 i i i i i 0 1 0 d d d d d Rd=togglebit(Rs,#u5)
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 1 1 0 1 0 - s s s s s P P - t t t t t 0 0 - d d d d d Rd=setbit(Rs,Rt)
1 1 0 0 0 1 1 0 1 0 - s s s s s P P - t t t t t 0 1 - d d d d d Rd=clrbit(Rs,Rt)
1 1 0 0 0 1 1 0 1 0 - s s s s s P P - t t t t t 1 0 - d d d d d Rd=togglebit(Rs,Rt)
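[Figure: bitsplit. The low Bits bits of Rs are placed in Rdd[0] and zero-extended; the remaining upper bits of Rs are shifted down into Rdd[1] and zero-filled.]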
Syntax Behavior
Rdd=bitsplit(Rs,#u5) Rdd.w[1]=(Rs>>#u);
Rdd.w[0]=zxt#u->32(Rs);
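A small C model of the immediate form, packing the two result words into one 64-bit value. It assumes the right shift is logical (the upper result word is zero-filled); `bitsplit_imm` is a hypothetical name.

```c
#include <stdint.h>

/* Split rs at bit position u (0..31): low u bits go to the low word,
 * the remaining bits, shifted down, go to the high word. */
uint64_t bitsplit_imm(uint32_t rs, unsigned u)
{
    u &= 31u;                                  /* #u5 immediate range */
    uint32_t lo = rs & ((1u << u) - 1u);       /* zero-extended low field */
    uint32_t hi = rs >> u;                     /* logical shift (assumed) */
    return ((uint64_t)hi << 32) | lo;
}
```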
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 1 1 0 s s s s s P P 0 i i i i i 1 0 0 d d d d d Rdd=bitsplit(Rs,#u5)
ICLASS RegType s5 Parse t5 d5
1 1 0 1 0 1 0 0 - - 1 s s s s s P P - t t t t t - - - d d d d d Rdd=bitsplit(Rs,Rt)
Table index
The table index instruction supports fast lookup tables where the index into the table is stored in
a bit-field. The instruction forms the address of a table element by extracting the bit field and
inserting it into the appropriate bits of a pointer to the table element.
Tables are defined to contain entries of bytes, halfwords, words, or doublewords. The table must
be aligned to a power-of-two boundary greater than or equal to the table size. For example, a
4 KB table should be aligned to a 4 KB boundary. This instruction supports tables with a
maximum of 32 K table entries.
Register Rx contains a pointer to within the table. Register Rs contains a field to extract and use as
a table index. This instruction first extracts the field from register Rs and then inserts it into
register Rx. The insertion point is bit 0 for tables of bytes, bit 1 for tables of halfwords, bit 2 for
tables of words, and bit 3 for tables of doublewords.
In the assembly syntax, the width and offset values represent the field in Rs to extract. Use
unsigned constants to specify the width and offsets in assembly. In the encoded instruction,
however, the assembler adjusts these values as follows.
■ For tableidxb, no adjustment is necessary.
■ For tableidxh, the assembler encodes offset-1 in the signed immediate field.
■ For tableidxw, the assembler encodes offset-2 in the signed immediate field.
■ For tableidxd, the assembler encodes offset-3 in the signed immediate field.
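The extract-then-insert sequence described above can be sketched in C. The function `tableidx` and its `log2_elem` parameter are hypothetical: `log2_elem` is 0, 1, 2, or 3 for tables of bytes, halfwords, words, or doublewords, and `width` and `offset` are the assembly-level (unadjusted) values.

```c
#include <stdint.h>

/* Extract a width-bit field from rs at bit 'offset' and insert it into
 * the table pointer rx at bit 'log2_elem'. Other bits of rx are kept. */
uint32_t tableidx(uint32_t rx, uint32_t rs,
                  unsigned width, unsigned offset, unsigned log2_elem)
{
    uint32_t mask  = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    uint32_t field = (rs >> offset) & mask;

    rx &= ~(mask << log2_elem);    /* clear the index bits of the pointer */
    rx |= field << log2_elem;      /* insert the scaled index */
    return rx;
}
```

For example, with a word table at 0x1000 and a 4-bit index stored in bits 7:4 of Rs, `tableidx(0x1000, 0xA50, 4, 4, 2)` forms the address of entry 5, which is 0x1014.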
[Figure: Rx=TABLEIDXD(Rs,#width,#offset). The Width-bit field at Offset in Rs is inserted into Rx at the table-index position; the remaining bits of Rx are unchanged.]
Syntax Behavior
Rx=tableidxb(Rs,#u4,#S6):raw width=#u;
offset=#S;
field = Rs[(width+offset-1):offset];
Rx[(width-1+0):0]=field;
Rx=tableidxb(Rs,#u4,#U5) Assembler mapped to:
"Rx=tableidxb(Rs,#u4,#U5):raw"
Rx=tableidxd(Rs,#u4,#S6):raw width=#u;
offset=#S+3;
field = Rs[(width+offset-1):offset];
Rx[(width-1+3):3]=field;
Rx=tableidxd(Rs,#u4,#U5) Assembler mapped to: "Rx=tableidxd(Rs,#u4,#U5-3):raw"
Rx=tableidxh(Rs,#u4,#S6):raw width=#u;
offset=#S+1;
field = Rs[(width+offset-1):offset];
Rx[(width-1+1):1]=field;
Rx=tableidxh(Rs,#u4,#U5) Assembler mapped to: "Rx=tableidxh(Rs,#u4,#U5-1):raw"
Rx=tableidxw(Rs,#u4,#S6):raw width=#u;
offset=#S+2;
field = Rs[(width+offset-1):offset];
Rx[(width-1+2):2]=field;
Rx=tableidxw(Rs,#u4,#U5) Assembler mapped to: "Rx=tableidxw(Rs,#u4,#U5-2):raw"
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp x5
1 0 0 0 0 1 1 1 0 0 i s s s s s P P I I I I I I i i i x x x x x Rx=tableidxb(Rs,#u4,#S6):raw
1 0 0 0 0 1 1 1 0 1 i s s s s s P P I I I I I I i i i x x x x x Rx=tableidxh(Rs,#u4,#S6):raw
1 0 0 0 0 1 1 1 1 0 i s s s s s P P I I I I I I i i i x x x x x Rx=tableidxw(Rs,#u4,#S6):raw
1 0 0 0 0 1 1 1 1 1 i s s s s s P P I I I I I I i i i x x x x x Rx=tableidxd(Rs,#u4,#S6):raw
[Figure: Rdd=vxaddsubh(Rss,Rtt):sat and Rdd=vxsubaddh(Rss,Rtt):rnd:>>1:sat. The real (R) and imaginary (I) halfwords of Rss are added to or subtracted from the cross-paired halfwords of Rtt, optionally rounded and shifted right by 1, and saturated to 16 bits in Rdd.]
Syntax Behavior
Rdd=vxaddsubh(Rss,Rtt):rnd:>>1:sat Rdd.h[0]=sat16((Rss.h[0]+Rtt.h[1]+1)>>1);
Rdd.h[1]=sat16((Rss.h[1]-Rtt.h[0]+1)>>1);
Rdd.h[2]=sat16((Rss.h[2]+Rtt.h[3]+1)>>1);
Rdd.h[3]=sat16((Rss.h[3]-Rtt.h[2]+1)>>1);
Rdd=vxaddsubh(Rss,Rtt):sat Rdd.h[0]=sat16(Rss.h[0]+Rtt.h[1]);
Rdd.h[1]=sat16(Rss.h[1]-Rtt.h[0]);
Rdd.h[2]=sat16(Rss.h[2]+Rtt.h[3]);
Rdd.h[3]=sat16(Rss.h[3]-Rtt.h[2]);
Rdd=vxsubaddh(Rss,Rtt):rnd:>>1:sat Rdd.h[0]=sat16((Rss.h[0]-Rtt.h[1]+1)>>1);
Rdd.h[1]=sat16((Rss.h[1]+Rtt.h[0]+1)>>1);
Rdd.h[2]=sat16((Rss.h[2]-Rtt.h[3]+1)>>1);
Rdd.h[3]=sat16((Rss.h[3]+Rtt.h[2]+1)>>1);
Rdd=vxsubaddh(Rss,Rtt):sat Rdd.h[0]=sat16(Rss.h[0]-Rtt.h[1]);
Rdd.h[1]=sat16(Rss.h[1]+Rtt.h[0]);
Rdd.h[2]=sat16(Rss.h[2]-Rtt.h[3]);
Rdd.h[3]=sat16(Rss.h[3]+Rtt.h[2]);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 1 0 1 - s s s s s P P - t t t t t 1 0 0 d d d d d Rdd=vxaddsubh(Rss,Rtt):sat
1 1 0 0 0 0 0 1 0 1 - s s s s s P P - t t t t t 1 1 0 d d d d d Rdd=vxsubaddh(Rss,Rtt):sat
1 1 0 0 0 0 0 1 1 1 - s s s s s P P - t t t t t 0 0 - d d d d d Rdd=vxaddsubh(Rss,Rtt):rnd:>>1:sat
1 1 0 0 0 0 0 1 1 1 - s s s s s P P - t t t t t 0 1 - d d d d d Rdd=vxsubaddh(Rss,Rtt):rnd:>>1:sat
[Figure: Rdd=vxaddsubw(Rss,Rtt):sat and Rdd=vxsubaddw(Rss,Rtt):sat. The real (R) and imaginary (I) words of Rss are added to or subtracted from the cross-paired words of Rtt to form Rdd.]
Syntax Behavior
Rdd=vxaddsubw(Rss,Rtt):sat Rdd.w[0]=sat32(Rss.w[0]+Rtt.w[1]);
Rdd.w[1]=sat32(Rss.w[1]-Rtt.w[0]);
Rdd=vxsubaddw(Rss,Rtt):sat Rdd.w[0]=sat32(Rss.w[0]-Rtt.w[1]);
Rdd.w[1]=sat32(Rss.w[1]+Rtt.w[0]);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 1 0 1 - s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=vxaddsubw(Rss,Rtt):sat
1 1 0 0 0 0 0 1 0 1 - s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=vxsubaddw(Rss,Rtt):sat
Complex multiply
Multiply complex values Rs and Rt. Each input has a real 16-bit value in the low halfword and an
imaginary 16-bit value in the high halfword. Optionally, scale the result by 0-1 bits. Optionally, add
a complex accumulator. Saturate the real and imaginary portions to 32 bits. The output has a real
32-bit value in the low word and an imaginary 32-bit value in the high word. The Rt input can
optionally be conjugated. Another option is to subtract the result from the destination rather than
accumulate it.
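The non-accumulating form can be modeled in C as a rough sketch. The function `cmpy_sat`, its `scale` and `conj_rt` flags, and the helper `sat32_clamp` are hypothetical names; the two result words are packed into a 64-bit value with the imaginary part in the high word. The sketch relies on the usual two's-complement wraparound when narrowing to `int16_t`.

```c
#include <stdint.h>

/* Clamp a 64-bit intermediate to the signed 32-bit range. */
static int32_t sat32_clamp(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Complex multiply of two packed 16-bit complex values.
 * scale != 0 models the optional :<<1; conj_rt != 0 models Rt*. */
uint64_t cmpy_sat(uint32_t rs, uint32_t rt, int scale, int conj_rt)
{
    int64_t sr = (int16_t)(rs & 0xFFFFu), si = (int16_t)(rs >> 16);
    int64_t tr = (int16_t)(rt & 0xFFFFu), ti = (int16_t)(rt >> 16);
    int64_t k  = scale ? 2 : 1;      /* multiply instead of shifting negatives */

    if (conj_rt)
        ti = -ti;                    /* conjugate: negate the imaginary part */

    int32_t im = sat32_clamp(si * tr * k + sr * ti * k);
    int32_t re = sat32_clamp(sr * tr * k - si * ti * k);
    return ((uint64_t)(uint32_t)im << 32) | (uint32_t)re;
}
```

For instance, (1 + 2j) * (3 + 4j) = -5 + 10j, so `cmpy_sat(0x00020001, 0x00040003, 0, 0)` yields 10 in the high word and -5 in the low word.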
[Figure: Rxx+=cmpy(Rs,Rt):sat. The four 16-bit cross products of Rs and Rt form 32-bit results that are combined, added to the real and imaginary words of Rxx, and saturated to 32 bits.]
Syntax Behavior
Rdd=cmpy(Rs,Rt)[:<<1]:sat Rdd.w[1]=sat32((Rs.h[1] * Rt.h[0])[<<1] + (Rs.h[0] * Rt.h[1])[<<1]);
Rdd.w[0]=sat32((Rs.h[0] * Rt.h[0])[<<1] - (Rs.h[1] * Rt.h[1])[<<1]);
Rdd=cmpy(Rs,Rt*)[:<<1]:sat Rdd.w[1]=sat32((Rs.h[1] * Rt.h[0])[<<1] - (Rs.h[0] * Rt.h[1])[<<1]);
Rdd.w[0]=sat32((Rs.h[0] * Rt.h[0])[<<1] + (Rs.h[1] * Rt.h[1])[<<1]);
Rxx+=cmpy(Rs,Rt)[:<<1]:sat Rxx.w[1]=sat32(Rxx.w[1] + (Rs.h[1] * Rt.h[0])[<<1] + (Rs.h[0] * Rt.h[1])[<<1]);
Rxx.w[0]=sat32(Rxx.w[0] + (Rs.h[0] * Rt.h[0])[<<1] - (Rs.h[1] * Rt.h[1])[<<1]);
Rxx+=cmpy(Rs,Rt*)[:<<1]:sat Rxx.w[1]=sat32(Rxx.w[1] + (Rs.h[1] * Rt.h[0])[<<1] - (Rs.h[0] * Rt.h[1])[<<1]);
Rxx.w[0]=sat32(Rxx.w[0] + (Rs.h[0] * Rt.h[0])[<<1] + (Rs.h[1] * Rt.h[1])[<<1]);
Rxx-=cmpy(Rs,Rt)[:<<1]:sat Rxx.w[1]=sat32(Rxx.w[1] - ((Rs.h[1] * Rt.h[0])[<<1] + (Rs.h[0] * Rt.h[1])[<<1]));
Rxx.w[0]=sat32(Rxx.w[0] - ((Rs.h[0] * Rt.h[0])[<<1] - (Rs.h[1] * Rt.h[1])[<<1]));
Rxx-=cmpy(Rs,Rt*)[:<<1]:sat Rxx.w[1]=sat32(Rxx.w[1] - ((Rs.h[1] * Rt.h[0])[<<1] - (Rs.h[0] * Rt.h[1])[<<1]));
Rxx.w[0]=sat32(Rxx.w[0] - ((Rs.h[0] * Rt.h[0])[<<1] + (Rs.h[1] * Rt.h[1])[<<1]));
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 1 N 0 0 s s s s s P P 0 t t t t t 1 1 0 d d d d d Rdd=cmpy(Rs,Rt)[:<<N]:sat
1 1 1 0 0 1 0 1 N 1 0 s s s s s P P 0 t t t t t 1 1 0 d d d d d Rdd=cmpy(Rs,Rt*)[:<<N]:sat
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 0 1 1 1 N 0 0 s s s s s P P 0 t t t t t 1 1 0 x x x x x Rxx+=cmpy(Rs,Rt)[:<<N]:sat
1 1 1 0 0 1 1 1 N 0 0 s s s s s P P 0 t t t t t 1 1 1 x x x x x Rxx-=cmpy(Rs,Rt)[:<<N]:sat
1 1 1 0 0 1 1 1 N 1 0 s s s s s P P 0 t t t t t 1 1 0 x x x x x Rxx+=cmpy(Rs,Rt*)[:<<N]:sat
1 1 1 0 0 1 1 1 N 1 0 s s s s s P P 0 t t t t t 1 1 1 x x x x x Rxx-=cmpy(Rs,Rt*)[:<<N]:sat
[Figure: Rxx+=cmpyi(Rs,Rt), imaginary accumulation. The two cross products of the real and imaginary halfwords of Rs and Rt are added to the 64-bit accumulator Rxx.]
Syntax Behavior
Rdd=cmpyi(Rs,Rt) Rdd = (Rs.h[1] * Rt.h[0]) + (Rs.h[0] * Rt.h[1]);
Rdd=cmpyr(Rs,Rt) Rdd = (Rs.h[0] * Rt.h[0]) - (Rs.h[1] * Rt.h[1]);
Rxx+=cmpyi(Rs,Rt) Rxx = Rxx + (Rs.h[1] * Rt.h[0]) + (Rs.h[0] * Rt.h[1]);
Rxx+=cmpyr(Rs,Rt) Rxx = Rxx + (Rs.h[0] * Rt.h[0]) - (Rs.h[1] * Rt.h[1]);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 1 0 0 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rdd=cmpyi(Rs,Rt)
1 1 1 0 0 1 0 1 0 0 0 s s s s s P P 0 t t t t t 0 1 0 d d d d d Rdd=cmpyr(Rs,Rt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 0 1 1 1 0 0 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rxx+=cmpyi(Rs,Rt)
1 1 1 0 0 1 1 1 0 0 0 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rxx+=cmpyr(Rs,Rt)
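[Figure: Rd=cmpy(Rs,Rt):rnd:sat. The four 16-bit cross products of Rs and Rt are combined into 32-bit real and imaginary results, saturated to 32 bits, and the upper halfwords are packed into Rd.]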
Syntax Behavior
Rd=cmpy(Rs,Rt)[:<<1]:rnd:sat Rd.h[1]=(sat32((Rs.h[1] * Rt.h[0])[<<1] + (Rs.h[0] * Rt.h[1])[<<1] + 0x8000)).h[1];
Rd.h[0]=(sat32((Rs.h[0] * Rt.h[0])[<<1] - (Rs.h[1] * Rt.h[1])[<<1] + 0x8000)).h[1];
Rd=cmpy(Rs,Rt*)[:<<1]:rnd:sat Rd.h[1]=(sat32((Rs.h[1] * Rt.h[0])[<<1] - (Rs.h[0] * Rt.h[1])[<<1] + 0x8000)).h[1];
Rd.h[0]=(sat32((Rs.h[0] * Rt.h[0])[<<1] + (Rs.h[1] * Rt.h[1])[<<1] + 0x8000)).h[1];
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 1 0 1 N 0 1 s s s s s P P 0 t t t t t 1 1 0 d d d d d Rd=cmpy(Rs,Rt)[:<<N]:rnd:sat
1 1 1 0 1 1 0 1 N 1 1 s s s s s P P 0 t t t t t 1 1 0 d d d d d Rd=cmpy(Rs,Rt*)[:<<N]:rnd:sat
Complex multiply 32 × 16
Multiply the 32-bit complex value Rss by the 16-bit complex value Rt. Each input has its real value
in the low part of the register and its imaginary value in the upper part. The multiplier results are
scaled by 1 bit and accumulated with a rounding constant. The result is saturated to 32 bits.
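The non-conjugated real and imaginary forms can be sketched in C, following the behavior `sat32((products + 0x4000) >> 15)` given for these instructions. The function names mirror the mnemonics but are hypothetical C helpers, as are `sat32_wh` and `asr64`; the arithmetic right shift is written out explicitly so the sketch does not depend on implementation-defined shifts of negative values.

```c
#include <stdint.h>

static int32_t sat32_wh(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Arithmetic (floor) right shift of a signed 64-bit value. */
static int64_t asr64(int64_t v, unsigned n)
{
    return (v < 0) ? ~(~v >> n) : (v >> n);
}

/* Real part: re(Rss)*re(Rt) - im(Rss)*im(Rt), rounded, scaled, saturated. */
int32_t cmpyrwh(uint64_t rss, uint32_t rt)
{
    int64_t sr = (int32_t)(uint32_t)rss, si = (int32_t)(uint32_t)(rss >> 32);
    int64_t tr = (int16_t)(rt & 0xFFFFu), ti = (int16_t)(rt >> 16);
    return sat32_wh(asr64(sr * tr - si * ti + 0x4000, 15));
}

/* Imaginary part: re(Rss)*im(Rt) + im(Rss)*re(Rt), rounded, scaled, saturated. */
int32_t cmpyiwh(uint64_t rss, uint32_t rt)
{
    int64_t sr = (int32_t)(uint32_t)rss, si = (int32_t)(uint32_t)(rss >> 32);
    int64_t tr = (int16_t)(rt & 0xFFFFu), ti = (int16_t)(rt >> 16);
    return sat32_wh(asr64(sr * ti + si * tr + 0x4000, 15));
}
```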
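[Figure: Rd=cmpyrwh(Rss,Rt):<<1:rnd:sat and Rd=cmpyiwh(Rss,Rt):<<1:rnd:sat. The 32-bit real and imaginary words of Rss are multiplied by the 16-bit halfwords of Rt to give 48-bit products, which are shifted left by 1, combined with the rounding constant 0x8000, and saturated to 32 bits.]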
Syntax Behavior
Rd=cmpyiwh(Rss,Rt):<<1:rnd:sat Rd = sat32(( (Rss.w[0] * Rt.h[1]) + (Rss.w[1] * Rt.h[0]) + 0x4000)>>15);
Rd=cmpyiwh(Rss,Rt*):<<1:rnd:sat Rd = sat32(( (Rss.w[1] * Rt.h[0]) - (Rss.w[0] * Rt.h[1]) + 0x4000)>>15);
Rd=cmpyrwh(Rss,Rt):<<1:rnd:sat Rd = sat32(( (Rss.w[0] * Rt.h[0]) - (Rss.w[1] * Rt.h[1]) + 0x4000)>>15);
Rd=cmpyrwh(Rss,Rt*):<<1:rnd:sat Rd = sat32(( (Rss.w[0] * Rt.h[0]) + (Rss.w[1] * Rt.h[1]) + 0x4000)>>15);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 Min d5
1 1 0 0 0 1 0 1 - - - s s s s s P P - t t t t t 1 0 0 d d d d d Rd=cmpyiwh(Rss,Rt):<<1:rnd:sat
1 1 0 0 0 1 0 1 - - - s s s s s P P - t t t t t 1 0 1 d d d d d Rd=cmpyiwh(Rss,Rt*):<<1:rnd:sat
1 1 0 0 0 1 0 1 - - - s s s s s P P - t t t t t 1 1 0 d d d d d Rd=cmpyrwh(Rss,Rt):<<1:rnd:sat
1 1 0 0 0 1 0 1 - - - s s s s s P P - t t t t t 1 1 1 d d d d d Rd=cmpyrwh(Rss,Rt*):<<1:rnd:sat
Syntax Behavior
Rd=cmpyiw(Rss,Rtt):<<1:rnd:sat tmp128 = sxt64->128((Rss.w[0] * Rtt.w[1]));
acc128 = sxt64->128((Rss.w[1] * Rtt.w[0]));
const128 = sxt64->128(0x40000000);
acc128 = tmp128+acc128;
acc128 = acc128+const128;
acc128 = (size8s_t) (acc128 >> 31);
acc64 = sxt128->64(acc128);
Rd = sat32(acc64);
Rd=cmpyrw(Rss,Rtt*):<<1:rnd:sat tmp128 = sxt64->128((Rss.w[0] * Rtt.w[0]));
acc128 = sxt64->128((Rss.w[1] * Rtt.w[1]));
const128 = sxt64->128(0x40000000);
acc128 = tmp128+acc128;
acc128 = acc128+const128;
acc128 = (size8s_t) (acc128 >> 31);
acc64 = sxt128->64(acc128);
Rd = sat32(acc64);
Notes
■ This instruction can only execute on a core with the Hexagon audio extensions.
■ A packet with this instruction cannot have a slot 2 multiply instruction.
■ If saturation occurs during execution of this instruction (a result is clamped to either
maximum or minimum values), the OVF bit in the status register is set. OVF remains set until
explicitly cleared by a transfer to the status register.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 0 1 1 s s s s s P P 0 t t t t t 0 1 0 d d d d d Rdd=cmpyiw(Rss,Rtt)
1 1 1 0 1 0 0 0 1 0 0 s s s s s P P 0 t t t t t 0 1 0 d d d d d Rdd=cmpyrw(Rss,Rtt)
1 1 1 0 1 0 0 0 1 1 0 s s s s s P P 0 t t t t t 0 1 0 d d d d d Rdd=cmpyrw(Rss,Rtt*)
1 1 1 0 1 0 0 0 1 1 1 s s s s s P P 0 t t t t t 0 1 0 d d d d d Rdd=cmpyiw(Rss,Rtt*)
1 1 1 0 1 0 0 1 0 0 0 s s s s s P P 0 t t t t t 1 0 0 d d d d d Rd=cmpyiw(Rss,Rtt*):<<1:sat
1 1 1 0 1 0 0 1 0 0 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=cmpyiw(Rss,Rtt):<<1:sat
1 1 1 0 1 0 0 1 0 1 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=cmpyrw(Rss,Rtt):<<1:sat
1 1 1 0 1 0 0 1 0 1 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=cmpyrw(Rss,Rtt*):<<1:sat
1 1 1 0 1 0 0 1 1 0 0 s s s s s P P 0 t t t t t 1 0 0 d d d d d Rd=cmpyiw(Rss,Rtt*):<<1:rnd:sat
1 1 1 0 1 0 0 1 1 0 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=cmpyiw(Rss,Rtt):<<1:rnd:sat
1 1 1 0 1 0 0 1 1 1 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=cmpyrw(Rss,Rtt):<<1:rnd:sat
1 1 1 0 1 0 0 1 1 1 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=cmpyrw(Rss,Rtt*):<<1:rnd:sat
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 0 1 0 s s s s s P P 0 t t t t t 1 1 0 x x x x x Rxx+=cmpyiw(Rss,Rtt*)
1 1 1 0 1 0 1 0 0 1 1 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rxx+=cmpyiw(Rss,Rtt)
1 1 1 0 1 0 1 0 1 0 0 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rxx+=cmpyrw(Rss,Rtt)
1 1 1 0 1 0 1 0 1 1 0 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rxx+=cmpyrw(Rss,Rtt*)
[Figure: Rxx+=vcmpyi(Rss,Rtt):sat. For each of the two complex halfword pairs in Rss and Rtt, the cross products are added to the corresponding word of Rxx and saturated to 32 bits.]
Syntax Behavior
Rdd=vcmpyi(Rss,Rtt)[:<<1]:sat Rdd.w[0]=sat32((Rss.h[1] * Rtt.h[0]) + (Rss.h[0] * Rtt.h[1])[<<1]);
Rdd.w[1]=sat32((Rss.h[3] * Rtt.h[2]) + (Rss.h[2] * Rtt.h[3])[<<1]);
Rdd=vcmpyr(Rss,Rtt)[:<<1]:sat Rdd.w[0]=sat32((Rss.h[0] * Rtt.h[0]) - (Rss.h[1] * Rtt.h[1])[<<1]);
Rdd.w[1]=sat32((Rss.h[2] * Rtt.h[2]) - (Rss.h[3] * Rtt.h[3])[<<1]);
Rxx+=vcmpyi(Rss,Rtt):sat Rxx.w[0]=sat32(Rxx.w[0] + (Rss.h[1] * Rtt.h[0]) + (Rss.h[0] * Rtt.h[1])<<0);
Rxx.w[1]=sat32(Rxx.w[1] + (Rss.h[3] * Rtt.h[2]) + (Rss.h[2] * Rtt.h[3])<<0);
Rxx+=vcmpyr(Rss,Rtt):sat Rxx.w[0]=sat32(Rxx.w[0] + (Rss.h[0] * Rtt.h[0]) - (Rss.h[1] * Rtt.h[1])<<0);
Rxx.w[1]=sat32(Rxx.w[1] + (Rss.h[2] * Rtt.h[2]) - (Rss.h[3] * Rtt.h[3])<<0);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 N 0 1 s s s s s P P 0 t t t t t 1 1 0 d d d d d Rdd=vcmpyr(Rss,Rtt)[:<<N]:sat
1 1 1 0 1 0 0 0 N 1 0 s s s s s P P 0 t t t t t 1 1 0 d d d d d Rdd=vcmpyi(Rss,Rtt)[:<<N]:sat
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 0 0 1 s s s s s P P 0 t t t t t 1 0 0 x x x x x Rxx+=vcmpyr(Rss,Rtt):sat
1 1 1 0 1 0 1 0 0 1 0 s s s s s P P 0 t t t t t 1 0 0 x x x x x Rxx+=vcmpyi(Rss,Rtt):sat
Syntax Behavior
Rdd=vconj(Rss):sat Rdd.h[1]=sat16(-Rss.h[1]);
Rdd.h[0]=Rss.h[0];
Rdd.h[3]=sat16(-Rss.h[3]);
Rdd.h[2]=Rss.h[2];
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 0 0 s s s s s P P - - - - - - 1 1 1 d d d d d Rdd=vconj(Rss):sat
Syntax Behavior
Rdd=vcrotate(Rss,Rt) tmp = Rt[1:0];
if (tmp == 0) {
Rdd.h[0]=Rss.h[0];
Rdd.h[1]=Rss.h[1];
} else if (tmp == 1) {
Rdd.h[0]=Rss.h[1];
Rdd.h[1]=sat16(-Rss.h[0]);
} else if (tmp == 2) {
Rdd.h[0]=sat16(-Rss.h[1]);
Rdd.h[1]=Rss.h[0];
} else {
Rdd.h[0]=sat16(-Rss.h[0]);
Rdd.h[1]=sat16(-Rss.h[1]);
}
tmp = Rt[3:2];
if (tmp == 0) {
Rdd.h[2]=Rss.h[2];
Rdd.h[3]=Rss.h[3];
} else if (tmp == 1) {
Rdd.h[2]=Rss.h[3];
Rdd.h[3]=sat16(-Rss.h[2]);
} else if (tmp == 2) {
Rdd.h[2]=sat16(-Rss.h[3]);
Rdd.h[3]=Rss.h[2];
} else {
Rdd.h[2]=sat16(-Rss.h[2]);
Rdd.h[3]=sat16(-Rss.h[3]);
}
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 1 1 1 1 - s s s s s P P - t t t t t 0 0 - d d d d d Rdd=vcrotate(Rss,Rt)
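[Figure: Rdd=vrcmpys(Rss,Rt):<<1:sat. The real (R) and imaginary (I) halfwords of Rss are multiplied by the scalar halfwords a and b of Rt; the 32-bit products are added and saturated to 32 bits in Rdd.]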
Syntax Behavior
Rdd=vrcmpys(Rss,Rt):<<1:sat if ("Rt & 1") {
Assembler mapped to:
"Rdd=vrcmpys(Rss,Rtt):<<1:sat:raw:hi";
} else {
Assembler mapped to:
"Rdd=vrcmpys(Rss,Rtt):<<1:sat:raw:lo";
}
Rdd=vrcmpys(Rss,Rtt):<<1:sat:raw:hi Rdd.w[1]=sat32((Rss.h[1] * Rtt.w[1].h[0])<<1 + (Rss.h[3] * Rtt.w[1].h[1])<<1);
Rdd.w[0]=sat32((Rss.h[0] * Rtt.w[1].h[0])<<1 + (Rss.h[2] * Rtt.w[1].h[1])<<1);
Rdd=vrcmpys(Rss,Rtt):<<1:sat:raw:lo Rdd.w[1]=sat32((Rss.h[1] * Rtt.w[0].h[0])<<1 + (Rss.h[3] * Rtt.w[0].h[1])<<1);
Rdd.w[0]=sat32((Rss.h[0] * Rtt.w[0].h[0])<<1 + (Rss.h[2] * Rtt.w[0].h[1])<<1);
Rxx+=vrcmpys(Rss,Rt):<<1:sat if ("Rt & 1") {
Assembler mapped to:
"Rxx+=vrcmpys(Rss,Rtt):<<1:sat:raw:hi";
} else {
Assembler mapped to:
"Rxx+=vrcmpys(Rss,Rtt):<<1:sat:raw:lo";
}
Rxx+=vrcmpys(Rss,Rtt):<<1:sat:raw:hi Rxx.w[1]=sat32(Rxx.w[1] + (Rss.h[1] * Rtt.w[1].h[0])<<1 + (Rss.h[3] * Rtt.w[1].h[1])<<1);
Rxx.w[0]=sat32(Rxx.w[0] + (Rss.h[0] * Rtt.w[1].h[0])<<1 + (Rss.h[2] * Rtt.w[1].h[1])<<1);
Rxx+=vrcmpys(Rss,Rtt):<<1:sat:raw:lo Rxx.w[1]=sat32(Rxx.w[1] + (Rss.h[1] * Rtt.w[0].h[0])<<1 + (Rss.h[3] * Rtt.w[0].h[1])<<1);
Rxx.w[0]=sat32(Rxx.w[0] + (Rss.h[0] * Rtt.w[0].h[0])<<1 + (Rss.h[2] * Rtt.w[0].h[1])<<1);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 1 0 1 s s s s s P P 0 t t t t t 1 0 0 d d d d d Rdd=vrcmpys(Rss,Rtt):<<1:sat:raw:hi
1 1 1 0 1 0 0 0 1 1 1 s s s s s P P 0 t t t t t 1 0 0 d d d d d Rdd=vrcmpys(Rss,Rtt):<<1:sat:raw:lo
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 1 0 1 s s s s s P P 0 t t t t t 1 0 0 x x x x x Rxx+=vrcmpys(Rss,Rtt):<<1:sat:raw:hi
1 1 1 0 1 0 1 0 1 1 1 s s s s s P P 0 t t t t t 1 0 0 x x x x x Rxx+=vrcmpys(Rss,Rtt):<<1:sat:raw:lo
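[Figure: Rd=vrcmpys(Rss,Rt):<<1:rnd:sat. The real (R) and imaginary (I) halfwords of Rss are multiplied by the scalar halfwords a and b of Rt; the 32-bit products are added, saturated to 32 bits, and packed as halfwords into Rd.]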
Syntax Behavior
Rd=vrcmpys(Rss,Rt):<<1:rnd:sat if ("Rt & 1") {
Assembler mapped to:
"Rd=vrcmpys(Rss,Rtt):<<1:rnd:sat:raw:hi";
} else {
Assembler mapped to:
"Rd=vrcmpys(Rss,Rtt):<<1:rnd:sat:raw:lo";
}
Rd=vrcmpys(Rss,Rtt):<<1:rnd:sat:raw:hi Rd.h[1]=sat32((Rss.h[1] * Rtt.w[1].h[0])<<1 + (Rss.h[3] * Rtt.w[1].h[1])<<1 + 0x8000).h[1];
Rd.h[0]=sat32((Rss.h[0] * Rtt.w[1].h[0])<<1 + (Rss.h[2] * Rtt.w[1].h[1])<<1 + 0x8000).h[1];
Rd=vrcmpys(Rss,Rtt):<<1:rnd:sat:raw:lo Rd.h[1]=sat32((Rss.h[1] * Rtt.w[0].h[0])<<1 + (Rss.h[3] * Rtt.w[0].h[1])<<1 + 0x8000).h[1];
Rd.h[0]=sat32((Rss.h[0] * Rtt.w[0].h[0])<<1 + (Rss.h[2] * Rtt.w[0].h[1])<<1 + 0x8000).h[1];
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 1 1 - 1 s s s s s P P 0 t t t t t 1 1 0 d d d d d Rd=vrcmpys(Rss,Rtt):<<1:rnd:sat:raw:hi
1 1 1 0 1 0 0 1 1 - 1 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rd=vrcmpys(Rss,Rtt):<<1:rnd:sat:raw:lo
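[Figure: vrcrotate. Each of the four complex byte pairs of Rss is multiplied by 1, j, -1, or -j as selected by a 2-bit control field in Rt, and the rotated values are summed into the real (R) and imaginary (I) words of Rxx.]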
Syntax Behavior
Rdd=vrcrotate(Rss,Rt,#u2) sumr = 0;
sumi = 0;
control = Rt.ub[#u];
for (i = 0; i < 8; i += 2) {
tmpr = Rss.b[i];
tmpi = Rss.b[i+1];
switch (control & 3) {
case 0: sumr += tmpr;
sumi += tmpi;
break;
case 1: sumr += tmpi;
sumi -= tmpr;
break;
case 2: sumr -= tmpi;
sumi += tmpr;
break;
case 3: sumr -= tmpr;
sumi -= tmpi;
break;
}
control = control >> 2;
}
Rdd.w[0]=sumr;
Rdd.w[1]=sumi;
Rxx+=vrcrotate(Rss,Rt,#u2) sumr = 0;
sumi = 0;
control = Rt.ub[#u];
for (i = 0; i < 8; i += 2) {
tmpr = Rss.b[i];
tmpi = Rss.b[i+1];
switch (control & 3) {
case 0: sumr += tmpr;
sumi += tmpi;
break;
case 1: sumr += tmpi;
sumi -= tmpr;
break;
case 2: sumr -= tmpi;
sumi += tmpr;
break;
case 3: sumr -= tmpr;
sumi -= tmpi;
break;
}
control = control >> 2;
}
Rxx.w[0]=Rxx.w[0] + sumr;
Rxx.w[1]=Rxx.w[1] + sumi;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 1 1 1 1 - s s s s s P P i t t t t t 1 1 i d d d d d Rdd=vrcrotate(Rss,Rt,#u2)
ICLASS RegType Maj s5 Parse t5 x5
1 1 0 0 1 0 1 1 1 0 1 s s s s s P P i t t t t t - - i x x x x x Rxx+=vrcrotate(Rss,Rt,#u2)
11.10.4 XTYPE FP
The XTYPE FP instruction subclass includes instructions for floating point math.
Syntax Behavior
Rd=sfadd(Rs,Rt) Rd=Rs+Rt;
Rdd=dfadd(Rss,Rtt) Rdd=Rss+Rtt;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 0 0 0 s s s s s P P 0 t t t t t 0 1 1 d d d d d Rdd=dfadd(Rss,Rtt)
1 1 1 0 1 0 1 1 0 0 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=sfadd(Rs,Rt)
Syntax Behavior
Pd=dfclass(Rss,#u5) Pd = 0;
class = fpclassify(Rss);
if (#u.0 && (class == FP_ZERO)) Pd = 0xff;
if (#u.1 && (class == FP_NORMAL)) Pd = 0xff;
if (#u.2 && (class == FP_SUBNORMAL)) Pd = 0xff;
if (#u.3 && (class == FP_INFINITE)) Pd = 0xff;
if (#u.4 && (class == FP_NAN)) Pd = 0xff;
cancel_flags();
Pd=sfclass(Rs,#u5) Pd = 0;
class = fpclassify(Rs);
if (#u.0 && (class == FP_ZERO)) Pd = 0xff;
if (#u.1 && (class == FP_NORMAL)) Pd = 0xff;
if (#u.2 && (class == FP_SUBNORMAL)) Pd = 0xff;
if (#u.3 && (class == FP_INFINITE)) Pd = 0xff;
if (#u.4 && (class == FP_NAN)) Pd = 0xff;
cancel_flags();
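The classification behavior above maps closely onto the standard C `fpclassify` macro. The following sketch models the single-precision form on a host machine; `sfclass_model` is a hypothetical name, it takes a C `float` rather than a raw register value, and it does not model `cancel_flags()`.

```c
#include <math.h>
#include <stdint.h>

/* Returns 0xff if the class of rs is selected by the 5-bit mask u5, else 0.
 * Mask bits: 0 = zero, 1 = normal, 2 = subnormal, 3 = infinite, 4 = NaN. */
uint8_t sfclass_model(float rs, unsigned u5)
{
    int cls = fpclassify(rs);
    uint8_t pd = 0;

    if ((u5 & 0x01u) && cls == FP_ZERO)      pd = 0xff;
    if ((u5 & 0x02u) && cls == FP_NORMAL)    pd = 0xff;
    if ((u5 & 0x04u) && cls == FP_SUBNORMAL) pd = 0xff;
    if ((u5 & 0x08u) && cls == FP_INFINITE)  pd = 0xff;
    if ((u5 & 0x10u) && cls == FP_NAN)       pd = 0xff;
    return pd;
}
```

Because the mask bits are independent, several classes can be tested at once; for example, a mask of 0x18 selects values that are either infinite or NaN.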
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse d2
1 0 0 0 0 1 0 1 1 1 1 s s s s s P P 0 i i i i i - - - - - - d d Pd=sfclass(Rs,#u5)
ICLASS RegType s5 Parse d2
1 1 0 1 1 1 0 0 1 0 0 s s s s s P P - 0 0 0 i i i i i 1 0 - d d Pd=dfclass(Rss,#u5)
Syntax Behavior
Pd=dfcmp.eq(Rss,Rtt) Pd=Rss==Rtt ? 0xff : 0x00;
Pd=dfcmp.ge(Rss,Rtt) Pd=Rss>=Rtt ? 0xff : 0x00;
Pd=dfcmp.gt(Rss,Rtt) Pd=Rss>Rtt ? 0xff : 0x00;
Pd=dfcmp.uo(Rss,Rtt) Pd=isunordered(Rss,Rtt) ? 0xff : 0x00;
Pd=sfcmp.eq(Rs,Rt) Pd=Rs==Rt ? 0xff : 0x00;
Pd=sfcmp.ge(Rs,Rt) Pd=Rs>=Rt ? 0xff : 0x00;
Pd=sfcmp.gt(Rs,Rt) Pd=Rs>Rt ? 0xff : 0x00;
Pd=sfcmp.uo(Rs,Rt) Pd=isunordered(Rs,Rt) ? 0xff : 0x00;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d2
1 1 0 0 0 1 1 1 1 1 1 s s s s s P P - t t t t t 0 0 0 - - - d d Pd=sfcmp.ge(Rs,Rt)
1 1 0 0 0 1 1 1 1 1 1 s s s s s P P - t t t t t 0 0 1 - - - d d Pd=sfcmp.uo(Rs,Rt)
1 1 0 0 0 1 1 1 1 1 1 s s s s s P P - t t t t t 0 1 1 - - - d d Pd=sfcmp.eq(Rs,Rt)
1 1 0 0 0 1 1 1 1 1 1 s s s s s P P - t t t t t 1 0 0 - - - d d Pd=sfcmp.gt(Rs,Rt)
ICLASS RegType s5 Parse t5 MinOp d2
1 1 0 1 0 0 1 0 1 1 1 s s s s s P P - t t t t t 0 0 0 - - - d d Pd=dfcmp.eq(Rss,Rtt)
1 1 0 1 0 0 1 0 1 1 1 s s s s s P P - t t t t t 0 0 1 - - - d d Pd=dfcmp.gt(Rss,Rtt)
1 1 0 1 0 0 1 0 1 1 1 s s s s s P P - t t t t t 0 1 0 - - - d d Pd=dfcmp.ge(Rss,Rtt)
1 1 0 1 0 0 1 0 1 1 1 s s s s s P P - t t t t t 0 1 1 - - - d d Pd=dfcmp.uo(Rss,Rtt)
Syntax Behavior
Rd=convert_df2sf(Rss) Rd = conv_df_to_sf(Rss);
Rdd=convert_sf2df(Rs) Rdd = conv_sf_to_df(Rs);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 1 0 0 1 - - s s s s s P P - - - - - - 0 0 0 d d d d d Rdd=convert_sf2df(Rs)
1 0 0 0 1 0 0 0 0 0 0 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=convert_df2sf(Rss)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 1 1 s s s s s P P 0 - - - - - 0 1 0 d d d d d Rdd=convert_ud2df(Rss)
1 0 0 0 0 0 0 0 1 1 1 s s s s s P P 0 - - - - - 0 1 1 d d d d d Rdd=convert_d2df(Rss)
1 0 0 0 0 1 0 0 1 - - s s s s s P P - - - - - - 0 0 1 d d d d d Rdd=convert_uw2df(Rs)
1 0 0 0 0 1 0 0 1 - - s s s s s P P - - - - - - 0 1 0 d d d d d Rdd=convert_w2df(Rs)
1 0 0 0 1 0 0 0 0 0 1 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=convert_ud2sf(Rss)
1 0 0 0 1 0 0 0 0 1 0 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=convert_d2sf(Rss)
1 0 0 0 1 0 1 1 0 0 1 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=convert_uw2sf(Rs)
1 0 0 0 1 0 1 1 0 1 0 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=convert_w2sf(Rs)
Syntax Behavior
Rd=convert_df2uw(Rss) Rd = conv_df_to_4u(Rss).uw[0];
Rd=convert_df2uw(Rss):chop round_to_zero();
Rd = conv_df_to_4u(Rss).uw[0];
Rd=convert_df2w(Rss) Rd = conv_df_to_4s(Rss).s32;
Rd=convert_df2w(Rss):chop round_to_zero();
Rd = conv_df_to_4s(Rss).s32;
Rd=convert_sf2uw(Rs) Rd = conv_sf_to_4u(Rs).uw[0];
Rd=convert_sf2uw(Rs):chop round_to_zero();
Rd = conv_sf_to_4u(Rs).uw[0];
Rd=convert_sf2w(Rs) Rd = conv_sf_to_4s(Rs).s32;
Rd=convert_sf2w(Rs):chop round_to_zero();
Rd = conv_sf_to_4s(Rs).s32;
Rdd=convert_df2d(Rss) Rdd = conv_df_to_8s(Rss).s64;
Rdd=convert_df2d(Rss):chop round_to_zero();
Rdd = conv_df_to_8s(Rss).s64;
Rdd=convert_df2ud(Rss) Rdd = conv_df_to_8u(Rss).u64;
Rdd=convert_df2ud(Rss):chop round_to_zero();
Rdd = conv_df_to_8u(Rss).u64;
Rdd=convert_sf2d(Rs) Rdd = conv_sf_to_8s(Rs).s64;
Rdd=convert_sf2d(Rs):chop round_to_zero();
Rdd = conv_sf_to_8s(Rs).s64;
Rdd=convert_sf2ud(Rs) Rdd = conv_sf_to_8u(Rs).u64;
Rdd=convert_sf2ud(Rs):chop round_to_zero();
Rdd = conv_sf_to_8u(Rs).u64;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 1 1 s s s s s P P 0 - - - - - 0 0 0 d d d d d Rdd=convert_df2d(Rss)
1 0 0 0 0 0 0 0 1 1 1 s s s s s P P 0 - - - - - 0 0 1 d d d d d Rdd=convert_df2ud(Rss)
1 0 0 0 0 0 0 0 1 1 1 s s s s s P P 0 - - - - - 1 1 0 d d d d d Rdd=convert_df2d(Rss):chop
1 0 0 0 0 0 0 0 1 1 1 s s s s s P P 0 - - - - - 1 1 1 d d d d d Rdd=convert_df2ud(Rss):chop
1 0 0 0 0 1 0 0 1 - - s s s s s P P - - - - - - 0 1 1 d d d d d Rdd=convert_sf2ud(Rs)
1 0 0 0 0 1 0 0 1 - - s s s s s P P - - - - - - 1 0 0 d d d d d Rdd=convert_sf2d(Rs)
1 0 0 0 0 1 0 0 1 - - s s s s s P P - - - - - - 1 0 1 d d d d d Rdd=convert_sf2ud(Rs):chop
1 0 0 0 0 1 0 0 1 - - s s s s s P P - - - - - - 1 1 0 d d d d d Rdd=convert_sf2d(Rs):chop
1 0 0 0 1 0 0 0 0 1 1 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=convert_df2uw(Rss)
1 0 0 0 1 0 0 0 1 0 0 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=convert_df2w(Rss)
1 0 0 0 1 0 0 0 1 0 1 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=convert_df2uw(Rss):chop
1 0 0 0 1 0 0 0 1 1 1 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=convert_df2w(Rss):chop
1 0 0 0 1 0 1 1 0 1 1 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=convert_sf2uw(Rs)
1 0 0 0 1 0 1 1 0 1 1 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=convert_sf2uw(Rs):chop
1 0 0 0 1 0 1 1 1 0 0 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=convert_sf2w(Rs)
1 0 0 0 1 0 1 1 1 0 0 s s s s s P P - - - - - - 0 0 1 d d d d d Rd=convert_sf2w(Rs):chop
Syntax Behavior
Rd=sffixupd(Rs,Rt) (Rs,Rt,Rd,adjust)=recip_common(Rs,Rt);
Rd = Rt;
Rd=sffixupn(Rs,Rt) (Rs,Rt,Rd,adjust)=recip_common(Rs,Rt);
Rd = Rs;
Rd=sffixupr(Rs) (Rs,Rd,adjust)=invsqrt_common(Rs);
Rd = Rs;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 1 1 1 0 1 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=sffixupr(Rs)
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 1 1 1 1 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=sffixupn(Rs,Rt)
1 1 1 0 1 0 1 1 1 1 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rd=sffixupd(Rs,Rt)
Syntax Behavior
Rx+=sfmpy(Rs,Rt) Rx=fmaf(Rs,Rt,Rx);
Rx-=sfmpy(Rs,Rt) Rx=fmaf(-Rs,Rt,Rx);
Rxx+=dfmpyhh(Rss,Rtt) Rxx = Rss*Rtt with partial product Rxx;
Rxx+=dfmpylh(Rss,Rtt) Rxx += (Rss.uw[0] * (0x00100000 | zxt20->64(Rtt.uw[1]))) << 1;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 0 0 0 s s s s s P P 0 t t t t t 0 1 1 x x x x x Rxx+=dfmpylh(Rss,Rtt)
1 1 1 0 1 0 1 0 1 0 0 s s s s s P P 0 t t t t t 0 1 1 x x x x x Rxx+=dfmpyhh(Rss,Rtt)
1 1 1 0 1 1 1 1 0 0 0 s s s s s P P 0 t t t t t 1 0 0 x x x x x Rx+=sfmpy(Rs,Rt)
1 1 1 0 1 1 1 1 0 0 0 s s s s s P P 0 t t t t t 1 0 1 x x x x x Rx-=sfmpy(Rs,Rt)
Syntax Behavior
Rx+=sfmpy(Rs,Rt,Pu):scale PREDUSE_TIMING;
if (isnan(Rx) || isnan(Rs) || isnan(Rt)) Rx = NaN;
tmp=fmaf(Rs,Rt,Rx) * 2**(Pu);
if (!((Rx == 0.0) && is_true_zero(Rs*Rt))) Rx = tmp;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 u2 x5
1 1 1 0 1 1 1 1 0 1 1 s s s s s P P 0 t t t t t 1 u u x x x x x Rx+=sfmpy(Rs,Rt,Pu):scale
Syntax Behavior
Rd,Pe=sfinvsqrta(Rs) if ((Rs,Rd,adjust)=invsqrt_common(Rs)) {
Pe = adjust;
idx = (Rs >> 17) & 0x7f;
mant = (invsqrt_lut[idx] << 15);
exp = 127 - ((exponent(Rs) - 127) >> 1) - 1;
Rd = -1**Rs.31 * 1.MANT * 2**(exp-BIAS);
}
Notes
■ This instruction provides a certain amount of accuracy. In future versions the accuracy may
increase. For future compatibility, avoid dependence on exact values.
■ The predicate generated by this instruction cannot be used as a .new predicate, nor can it be
automatically ANDed with another predicate.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse e2 d5
1 0 0 0 1 0 1 1 1 1 1 s s s s s P P - - - - - - 0 e e d d d d d Rd,Pe=sfinvsqrta(Rs)
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 1 1 1 0 0 0 s s s s s P P 0 t t t t t 1 1 0 x x x x x Rx+=sfmpy(Rs,Rt):lib
1 1 1 0 1 1 1 1 0 0 0 s s s s s P P 0 t t t t t 1 1 1 x x x x x Rx-=sfmpy(Rs,Rt):lib
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Parse MinOp d5
1 1 0 1 0 1 1 0 0 0 i - - - - - P P i i i i i i i i i d d d d d Rd=sfmake(#u10):pos
1 1 0 1 0 1 1 0 0 1 i - - - - - P P i i i i i i i i i d d d d d Rd=sfmake(#u10):neg
ICLASS RegType Parse d5
1 1 0 1 1 0 0 1 0 0 i - - - - - P P i i i i i i i i i d d d d d Rdd=dfmake(#u10):pos
1 1 0 1 1 0 0 1 0 1 i - - - - - P P i i i i i i i i i d d d d d Rdd=dfmake(#u10):neg
Syntax Behavior
Rd=sfmax(Rs,Rt) Rd = fmaxf(Rs,Rt);
Rdd=dfmax(Rss,Rtt) Rdd = fmax(Rss,Rtt);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 0 0 1 s s s s s P P 0 t t t t t 0 1 1 d d d d d Rdd=dfmax(Rss,Rtt)
1 1 1 0 1 0 1 1 1 0 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=sfmax(Rs,Rt)
Syntax Behavior
Rd=sfmin(Rs,Rt) Rd = fminf(Rs,Rt);
Rdd=dfmin(Rss,Rtt) Rdd = fmin(Rss,Rtt);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 1 1 0 s s s s s P P 0 t t t t t 0 1 1 d d d d d Rdd=dfmin(Rss,Rtt)
1 1 1 0 1 0 1 1 1 0 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rd=sfmin(Rs,Rt)
Syntax Behavior
Rd=sfmpy(Rs,Rt) Rd=Rs*Rt;
Rdd=dfmpyfix(Rss,Rtt) if (is_denormal(Rss) && (df_exponent(Rtt) >= 512) && is_normal(Rtt)) Rdd = Rss * 0x1.0p52;
else if (is_denormal(Rtt) && (df_exponent(Rss) >= 512) && is_normal(Rss)) Rdd = Rss * 0x1.0p-52;
else Rdd = Rss;
Rdd=dfmpyll(Rss,Rtt) prod = (Rss.uw[0] * Rtt.uw[0]);
Rdd = (prod >> 32) << 1;
if (prod.uw[0] != 0) Rdd.0 = 1;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 0 1 0 s s s s s P P 0 t t t t t 0 1 1 d d d d d Rdd=dfmpyfix(Rss,Rtt)
1 1 1 0 1 0 0 0 1 0 1 s s s s s P P 0 t t t t t 0 1 1 d d d d d Rdd=dfmpyll(Rss,Rtt)
1 1 1 0 1 0 1 1 0 1 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=sfmpy(Rs,Rt)
Syntax Behavior
Rd,Pe=sfrecipa(Rs,Rt) if ((Rs,Rt,Rd,adjust)=recip_common(Rs,Rt))
{
Pe = adjust;
idx = (Rt >> 16) & 0x7f;
mant = (recip_lut[idx] << 15) | 1;
exp = 127 - (exponent(Rt) - 127) - 1;
Rd = -1**Rt.31 * 1.MANT * 2**(exp-BIAS);
}
Notes
■ This instruction provides a certain amount of accuracy. In future versions the accuracy may
increase. For future compatibility, avoid dependence on exact values.
■ The predicate generated by this instruction cannot be used as a .new predicate, nor can it be
automatically ANDed with another predicate.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 e2 d5
1 1 1 0 1 0 1 1 1 1 1 s s s s s P P 0 t t t t t 1 e e d d d d d Rd,Pe=sfrecipa(Rs,Rt)
Syntax Behavior
Rd=sfsub(Rs,Rt) Rd=Rs-Rt;
Rdd=dfsub(Rss,Rtt) Rdd=Rss-Rtt;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 1 0 0 s s s s s P P 0 t t t t t 0 1 1 d d d d d Rdd=dfsub(Rss,Rtt)
1 1 1 0 1 0 1 1 0 0 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rd=sfsub(Rs,Rt)
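[Figure: Rs is multiplied by Rt or #u8 to form a 64-bit product; the product feeds an adder, and the low 32 bits of the result are written to Rx.]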
Syntax Behavior
Rd=+mpyi(Rs,#u8) apply_extension(#u);
Rd=Rs*#u;
Rd=-mpyi(Rs,#u8) Rd=Rs*-#u;
Rd=add(#u6,mpyi(Rs,#U6)) apply_extension(#u);
Rd = #u + Rs*#U;
Rd=add(#u6,mpyi(Rs,Rt)) apply_extension(#u);
Rd = #u + Rs*Rt;
Rd=add(Ru,mpyi(#u6:2,Rs)) Rd = Ru + Rs*#u;
Rd=add(Ru,mpyi(Rs,#u6)) apply_extension(#u);
Rd = Ru + Rs*#u;
Rd=mpyi(Rs,#m9) if ("((#m9<0) && (#m9>-256))") {
Assembler mapped to: "Rd=-mpyi(Rs,#m9*(-1))";
} else {
Assembler mapped to: "Rd=+mpyi(Rs,#m9)";
}
Rd=mpyi(Rs,Rt) Rd=Rs*Rt;
Rd=mpyui(Rs,Rt) Assembler mapped to: "Rd=mpyi(Rs,Rt)"
Rx+=mpyi(Rs,#u8) apply_extension(#u);
Rx=Rx + (Rs*#u);
Rx+=mpyi(Rs,Rt) Rx=Rx + Rs*Rt;
Rx-=mpyi(Rs,#u8) apply_extension(#u);
Rx=Rx - (Rs*#u);
Rx-=mpyi(Rs,Rt) Rx=Rx - Rs*Rt;
Ry=add(Ru,mpyi(Ry,Rs)) Ry = Ru + Rs*Ry;
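The assembler mapping for Rd=mpyi(Rs,#m9) can be illustrated with a small C model. This is a sketch only: the function name mpyi_m9 is hypothetical, the product is computed with 32-bit wraparound, and constant extension (apply_extension) is not modeled.

```c
#include <stdint.h>

/* Model of Rd=mpyi(Rs,#m9): a negative #m9 in the open range (-256, 0)
 * selects the negating form with the magnitude as #u8; any other value
 * selects the positive form. Arithmetic wraps modulo 2^32. */
static int32_t mpyi_m9(int32_t rs, int32_t m9) {
    uint32_t a = (uint32_t)rs;
    if (m9 < 0 && m9 > -256) {
        uint32_t u8 = (uint32_t)(-m9);        /* #m9*(-1)                  */
        return (int32_t)(0u - a * u8);        /* Rd=-mpyi(Rs,#u8): Rs*-#u  */
    }
    return (int32_t)(a * (uint32_t)m9);       /* Rd=+mpyi(Rs,#u8): Rs*#u   */
}
```

Both branches yield the same numeric product; the model only shows which encoding the assembler would pick for a given immediate.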
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d5
1 1 0 1 0 1 1 1 0 i i s s s s s P P i t t t t t i i i d d d d d Rd=add(#u6,mpyi(Rs,Rt))
ICLASS RegType s5 Parse d5
1 1 0 1 1 0 0 0 I i i s s s s s P P i d d d d d i i i I I I I I Rd=add(#u6,mpyi(Rs,#U6))
ICLASS RegType s5 Parse d5 u5
1 1 0 1 1 1 1 1 0 i i s s s s s P P i d d d d d i i i u u u u u Rd=add(Ru,mpyi(#u6:2,Rs))
1 1 0 1 1 1 1 1 1 i i s s s s s P P i d d d d d i i i u u u u u Rd=add(Ru,mpyi(Rs,#u6))
ICLASS RegType MajOp s5 Parse y5 u5
1 1 1 0 0 0 1 1 0 0 0 s s s s s P P - y y y y y - - - u u u u u Ry=add(Ru,mpyi(Ry,Rs))
ICLASS RegType MajOp s5 Parse MinOp d5
1 1 1 0 0 0 0 0 0 - - s s s s s P P 0 i i i i i i i i d d d d d Rd=+mpyi(Rs,#u8)
1 1 1 0 0 0 0 0 1 - - s s s s s P P 0 i i i i i i i i d d d d d Rd=-mpyi(Rs,#u8)
ICLASS RegType MajOp s5 Parse MinOp x5
1 1 1 0 0 0 0 1 0 - - s s s s s P P 0 i i i i i i i i x x x x x Rx+=mpyi(Rs,#u8)
1 1 1 0 0 0 0 1 1 - - s s s s s P P 0 i i i i i i i i x x x x x Rx-=mpyi(Rs,#u8)
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 1 0 1 0 0 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=mpyi(Rs,Rt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 1 1 1 0 0 0 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rx+=mpyi(Rs,Rt)
1 1 1 0 1 1 1 1 1 0 0 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rx-=mpyi(Rs,Rt)
Perform mixed-precision vector multiply operations. A 32-bit word from vector Rss is multiplied
by a 16-bit halfword (either even or odd) from vector Rtt. The multiplication is a signed
32 × 16 multiply that produces a 48-bit result. The result is optionally shifted left by one bit,
optionally rounded by adding 0x8000, shifted right by 16 bits, optionally accumulated, and then
saturated to 32 bits.
This operation is available in vector form (vmpyweh/vmpywoh) and non-vector form (multiply
and use upper result).
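The per-lane arithmetic described above can be modeled in C. This is a minimal sketch rather than a definition of the instruction: the names sat32 and mpy_word_half are hypothetical, the 48-bit product is held in a 64-bit integer, and the right shift of a negative value is assumed to be arithmetic, as it is on common compilers.

```c
#include <stdint.h>

/* Saturate a 64-bit intermediate to the signed 32-bit range. */
static int32_t sat32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* One lane of a word-by-halfword multiply.
 * shift1 models the optional :<<1, rnd models the optional :rnd. */
static int32_t mpy_word_half(int32_t w, int16_t h, int shift1, int rnd) {
    int64_t prod = (int64_t)w * h;   /* signed 32 x 16 -> 48-bit product  */
    if (shift1) prod *= 2;           /* optional scale left by one bit    */
    if (rnd) prod += 0x8000;         /* optional rounding constant        */
    return sat32(prod >> 16);        /* drop low 16 bits, then saturate   */
}
```

For the accumulating forms, the shifted value would be added to the existing lane of Rxx before the final saturation.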
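[Figure: vector multiply word by signed halfword. In each of two lanes, a mux selects a halfword of Rtt, which is multiplied by a word of Rss (48-bit product), shifted left by 0-1 bits, added to 0x0 or 0x8000, shifted right by 16 bits, added to the accumulator, and saturated to 32 bits (Sat_32) into Rxx.]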
Syntax Behavior
Rdd=vmpyweh(Rss,Rtt)[:<<1]:rnd:sat Rdd.w[1]=sat32(((Rss.w[1] * Rtt.h[2])[<<1]+0x8000)>>16);
Rdd.w[0]=sat32(((Rss.w[0] * Rtt.h[0])[<<1]+0x8000)>>16);
Rdd=vmpyweh(Rss,Rtt)[:<<1]:sat Rdd.w[1]=sat32(((Rss.w[1] * Rtt.h[2])[<<1])>>16);
Rdd.w[0]=sat32(((Rss.w[0] * Rtt.h[0])[<<1])>>16);
Rdd=vmpywoh(Rss,Rtt)[:<<1]:rnd:sat Rdd.w[1]=sat32(((Rss.w[1] * Rtt.h[3])[<<1]+0x8000)>>16);
Rdd.w[0]=sat32(((Rss.w[0] * Rtt.h[1])[<<1]+0x8000)>>16);
Rdd=vmpywoh(Rss,Rtt)[:<<1]:sat Rdd.w[1]=sat32(((Rss.w[1] * Rtt.h[3])[<<1])>>16);
Rdd.w[0]=sat32(((Rss.w[0] * Rtt.h[1])[<<1])>>16);
Rxx+=vmpyweh(Rss,Rtt)[:<<1]:rnd:sat Rxx.w[1]=sat32(Rxx.w[1] + (((Rss.w[1] * Rtt.h[2])[<<1]+0x8000)>>16));
Rxx.w[0]=sat32(Rxx.w[0] + (((Rss.w[0] * Rtt.h[0])[<<1]+0x8000)>>16));
Rxx+=vmpyweh(Rss,Rtt)[:<<1]:sat Rxx.w[1]=sat32(Rxx.w[1] + (((Rss.w[1] * Rtt.h[2])[<<1])>>16));
Rxx.w[0]=sat32(Rxx.w[0] + (((Rss.w[0] * Rtt.h[0])[<<1])>>16));
Rxx+=vmpywoh(Rss,Rtt)[:<<1]:rnd:sat Rxx.w[1]=sat32(Rxx.w[1] + (((Rss.w[1] * Rtt.h[3])[<<1]+0x8000)>>16));
Rxx.w[0]=sat32(Rxx.w[0] + (((Rss.w[0] * Rtt.h[1])[<<1]+0x8000)>>16));
Rxx+=vmpywoh(Rss,Rtt)[:<<1]:sat Rxx.w[1]=sat32(Rxx.w[1] + (((Rss.w[1] * Rtt.h[3])[<<1])>>16));
Rxx.w[0]=sat32(Rxx.w[0] + (((Rss.w[0] * Rtt.h[1])[<<1])>>16));
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 N 0 0 s s s s s P P 0 t t t t t 1 0 1 d d d d d Rdd=vmpyweh(Rss,Rtt)[:<<N]:sat
1 1 1 0 1 0 0 0 N 0 0 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rdd=vmpywoh(Rss,Rtt)[:<<N]:sat
1 1 1 0 1 0 0 0 N 0 1 s s s s s P P 0 t t t t t 1 0 1 d d d d d Rdd=vmpyweh(Rss,Rtt)[:<<N]:rnd:sat
1 1 1 0 1 0 0 0 N 0 1 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rdd=vmpywoh(Rss,Rtt)[:<<N]:rnd:sat
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 N 0 0 s s s s s P P 0 t t t t t 1 0 1 x x x x x Rxx+=vmpyweh(Rss,Rtt)[:<<N]:sat
1 1 1 0 1 0 1 0 N 0 0 s s s s s P P 0 t t t t t 1 1 1 x x x x x Rxx+=vmpywoh(Rss,Rtt)[:<<N]:sat
1 1 1 0 1 0 1 0 N 0 1 s s s s s P P 0 t t t t t 1 0 1 x x x x x Rxx+=vmpyweh(Rss,Rtt)[:<<N]:rnd:sat
1 1 1 0 1 0 1 0 N 0 1 s s s s s P P 0 t t t t t 1 1 1 x x x x x Rxx+=vmpywoh(Rss,Rtt)[:<<N]:rnd:sat
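[Figure: vector multiply word by unsigned halfword. In each of two lanes, a mux selects a halfword of Rtt, which is multiplied by a word of Rss (48-bit product), shifted left by 0-1 bits, added to 0x0 or 0x8000, shifted right by 16 bits, added to the accumulator, and saturated to 32 bits (Sat_32) into Rxx.]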
Syntax Behavior
Rdd=vmpyweuh(Rss,Rtt)[:<<1]:rnd:sat Rdd.w[1]=sat32(((Rss.w[1] * Rtt.uh[2])[<<1]+0x8000)>>16);
Rdd.w[0]=sat32(((Rss.w[0] * Rtt.uh[0])[<<1]+0x8000)>>16);
Rdd=vmpyweuh(Rss,Rtt)[:<<1]:sat Rdd.w[1]=sat32(((Rss.w[1] * Rtt.uh[2])[<<1])>>16);
Rdd.w[0]=sat32(((Rss.w[0] * Rtt.uh[0])[<<1])>>16);
Rdd=vmpywouh(Rss,Rtt)[:<<1]:rnd:sat Rdd.w[1]=sat32(((Rss.w[1] * Rtt.uh[3])[<<1]+0x8000)>>16);
Rdd.w[0]=sat32(((Rss.w[0] * Rtt.uh[1])[<<1]+0x8000)>>16);
Rdd=vmpywouh(Rss,Rtt)[:<<1]:sat Rdd.w[1]=sat32(((Rss.w[1] * Rtt.uh[3])[<<1])>>16);
Rdd.w[0]=sat32(((Rss.w[0] * Rtt.uh[1])[<<1])>>16);
Rxx+=vmpyweuh(Rss,Rtt)[:<<1]:rnd:sat Rxx.w[1]=sat32(Rxx.w[1] + (((Rss.w[1] * Rtt.uh[2])[<<1]+0x8000)>>16));
Rxx.w[0]=sat32(Rxx.w[0] + (((Rss.w[0] * Rtt.uh[0])[<<1]+0x8000)>>16));
Rxx+=vmpyweuh(Rss,Rtt)[:<<1]:sat Rxx.w[1]=sat32(Rxx.w[1] + (((Rss.w[1] * Rtt.uh[2])[<<1])>>16));
Rxx.w[0]=sat32(Rxx.w[0] + (((Rss.w[0] * Rtt.uh[0])[<<1])>>16));
Rxx+=vmpywouh(Rss,Rtt)[:<<1]:rnd:sat Rxx.w[1]=sat32(Rxx.w[1] + (((Rss.w[1] * Rtt.uh[3])[<<1]+0x8000)>>16));
Rxx.w[0]=sat32(Rxx.w[0] + (((Rss.w[0] * Rtt.uh[1])[<<1]+0x8000)>>16));
Rxx+=vmpywouh(Rss,Rtt)[:<<1]:sat Rxx.w[1]=sat32(Rxx.w[1] + (((Rss.w[1] * Rtt.uh[3])[<<1])>>16));
Rxx.w[0]=sat32(Rxx.w[0] + (((Rss.w[0] * Rtt.uh[1])[<<1])>>16));
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 N 1 0 s s s s s P P 0 t t t t t 1 0 1 d d d d d Rdd=vmpyweuh(Rss,Rtt)[:<<N]:sat
1 1 1 0 1 0 0 0 N 1 0 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rdd=vmpywouh(Rss,Rtt)[:<<N]:sat
1 1 1 0 1 0 0 0 N 1 1 s s s s s P P 0 t t t t t 1 0 1 d d d d d Rdd=vmpyweuh(Rss,Rtt)[:<<N]:rnd:sat
1 1 1 0 1 0 0 0 N 1 1 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rdd=vmpywouh(Rss,Rtt)[:<<N]:rnd:sat
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 N 1 0 s s s s s P P 0 t t t t t 1 0 1 x x x x x Rxx+=vmpyweuh(Rss,Rtt)[:<<N]:sat
1 1 1 0 1 0 1 0 N 1 0 s s s s s P P 0 t t t t t 1 1 1 x x x x x Rxx+=vmpywouh(Rss,Rtt)[:<<N]:sat
1 1 1 0 1 0 1 0 N 1 1 s s s s s P P 0 t t t t t 1 0 1 x x x x x Rxx+=vmpyweuh(Rss,Rtt)[:<<N]:rnd:sat
1 1 1 0 1 0 1 0 N 1 1 s s s s s P P 0 t t t t t 1 1 1 x x x x x Rxx+=vmpywouh(Rss,Rtt)[:<<N]:rnd:sat
[Figure: 16 × 16 signed halfword multiply. Left panel: Rx+=mpy(Rs.[HL],Rt.[HL])[:<<1][:sat] and Rd=mpy(Rs.[HL],Rt.[HL])[:<<1][:rnd][:sat], with optional saturation to 32 bits into Rx. Right panel: Rxx+=mpy(Rs.[HL],Rt.[HL])[:<<1] and Rdd=mpy(Rs.[HL],Rt.[HL])[:<<1][:rnd] into Rxx. In both panels the 32-bit product is shifted left by 0-1 bits and a mux selects between 0x0 and 0x8000.]
Syntax Behavior
Rd=mpy(Rs.[HL],Rt.[HL])[:<<1][:rnd][:sat] Rd=[sat32]([round]((Rs.h[01] * Rt.h[01])[<<1]));
Rdd=mpy(Rs.[HL],Rt.[HL])[:<<1][:rnd] Rdd=[round]((Rs.h[01] * Rt.h[01])[<<1]);
Rx+=mpy(Rs.[HL],Rt.[HL])[:<<1][:sat] Rx=[sat32](Rx+ (Rs.h[01] * Rt.h[01])[<<1]);
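The Rd form of the halfword multiply can be sketched in C as follows. The names sat32 and mpy_hh are hypothetical, and the sketch assumes that [round] means adding 0x8000 to the shifted product; without :sat the result is assumed to wrap to 32 bits.

```c
#include <stdint.h>

/* Saturate a 64-bit intermediate to the signed 32-bit range. */
static int32_t sat32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Model of Rd=mpy(Rs.[HL],Rt.[HL])[:<<1][:rnd][:sat].
 * hi_s / hi_t select the high (1) or low (0) halfword of each source. */
static int32_t mpy_hh(uint32_t rs, uint32_t rt, int hi_s, int hi_t,
                      int shift1, int rnd, int sat) {
    int16_t a = (int16_t)(uint16_t)(rs >> (hi_s ? 16 : 0));
    int16_t b = (int16_t)(uint16_t)(rt >> (hi_t ? 16 : 0));
    int64_t v = (int64_t)a * b;      /* signed 16 x 16 product          */
    if (shift1) v *= 2;              /* optional :<<1                   */
    if (rnd) v += 0x8000;            /* assumption: [round] adds 0x8000 */
    return sat ? sat32(v) : (int32_t)(uint32_t)v;
}
```

The only input pair that needs saturation with :<<1 is 0x8000 × 0x8000, whose doubled product exceeds the signed 32-bit range.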
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 0 N 0 0 s s s s s P P - t t t t t - 0 0 d d d d d Rdd=mpy(Rs.L,Rt.L)[:<<N]
1 1 1 0 0 1 0 0 N 0 0 s s s s s P P - t t t t t - 0 1 d d d d d Rdd=mpy(Rs.L,Rt.H)[:<<N]
1 1 1 0 0 1 0 0 N 0 0 s s s s s P P - t t t t t - 1 0 d d d d d Rdd=mpy(Rs.H,Rt.L)[:<<N]
1 1 1 0 0 1 0 0 N 0 0 s s s s s P P - t t t t t - 1 1 d d d d d Rdd=mpy(Rs.H,Rt.H)[:<<N]
1 1 1 0 0 1 0 0 N 0 1 s s s s s P P - t t t t t - 0 0 d d d d d Rdd=mpy(Rs.L,Rt.L)[:<<N]:rnd
1 1 1 0 0 1 0 0 N 0 1 s s s s s P P - t t t t t - 0 1 d d d d d Rdd=mpy(Rs.L,Rt.H)[:<<N]:rnd
1 1 1 0 0 1 0 0 N 0 1 s s s s s P P - t t t t t - 1 0 d d d d d Rdd=mpy(Rs.H,Rt.L)[:<<N]:rnd
1 1 1 0 0 1 0 0 N 0 1 s s s s s P P - t t t t t - 1 1 d d d d d Rdd=mpy(Rs.H,Rt.H)[:<<N]:rnd
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 0 1 1 0 N 0 0 s s s s s P P - t t t t t 0 0 0 x x x x x Rxx+=mpy(Rs.L,Rt.L)[:<<N]
1 1 1 0 0 1 1 0 N 0 0 s s s s s P P - t t t t t 0 0 1 x x x x x Rxx+=mpy(Rs.L,Rt.H)[:<<N]
1 1 1 0 0 1 1 0 N 0 0 s s s s s P P - t t t t t 0 1 0 x x x x x Rxx+=mpy(Rs.H,Rt.L)[:<<N]
1 1 1 0 0 1 1 0 N 0 0 s s s s s P P - t t t t t 0 1 1 x x x x x Rxx+=mpy(Rs.H,Rt.H)[:<<N]
1 1 1 0 0 1 1 0 N 0 1 s s s s s P P - t t t t t 0 0 0 x x x x x Rxx-=mpy(Rs.L,Rt.L)[:<<N]
1 1 1 0 0 1 1 0 N 0 1 s s s s s P P - t t t t t 0 0 1 x x x x x Rxx-=mpy(Rs.L,Rt.H)[:<<N]
1 1 1 0 0 1 1 0 N 0 1 s s s s s P P - t t t t t 0 1 0 x x x x x Rxx-=mpy(Rs.H,Rt.L)[:<<N]
1 1 1 0 0 1 1 0 N 0 1 s s s s s P P - t t t t t 0 1 1 x x x x x Rxx-=mpy(Rs.H,Rt.H)[:<<N]
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 1 0 0 N 0 0 s s s s s P P - t t t t t 0 0 0 d d d d d Rd=mpy(Rs.L,Rt.L)[:<<N]
1 1 1 0 1 1 0 0 N 0 0 s s s s s P P - t t t t t 0 0 1 d d d d d Rd=mpy(Rs.L,Rt.H)[:<<N]
1 1 1 0 1 1 0 0 N 0 0 s s s s s P P - t t t t t 0 1 0 d d d d d Rd=mpy(Rs.H,Rt.L)[:<<N]
1 1 1 0 1 1 0 0 N 0 0 s s s s s P P - t t t t t 0 1 1 d d d d d Rd=mpy(Rs.H,Rt.H)[:<<N]
1 1 1 0 1 1 0 0 N 0 0 s s s s s P P - t t t t t 1 0 0 d d d d d Rd=mpy(Rs.L,Rt.L)[:<<N]:sat
1 1 1 0 1 1 0 0 N 0 0 s s s s s P P - t t t t t 1 0 1 d d d d d Rd=mpy(Rs.L,Rt.H)[:<<N]:sat
1 1 1 0 1 1 0 0 N 0 0 s s s s s P P - t t t t t 1 1 0 d d d d d Rd=mpy(Rs.H,Rt.L)[:<<N]:sat
1 1 1 0 1 1 0 0 N 0 0 s s s s s P P - t t t t t 1 1 1 d d d d d Rd=mpy(Rs.H,Rt.H)[:<<N]:sat
1 1 1 0 1 1 0 0 N 0 1 s s s s s P P - t t t t t 0 0 0 d d d d d Rd=mpy(Rs.L,Rt.L)[:<<N]:rnd
1 1 1 0 1 1 0 0 N 0 1 s s s s s P P - t t t t t 0 0 1 d d d d d Rd=mpy(Rs.L,Rt.H)[:<<N]:rnd
1 1 1 0 1 1 0 0 N 0 1 s s s s s P P - t t t t t 0 1 0 d d d d d Rd=mpy(Rs.H,Rt.L)[:<<N]:rnd
1 1 1 0 1 1 0 0 N 0 1 s s s s s P P - t t t t t 0 1 1 d d d d d Rd=mpy(Rs.H,Rt.H)[:<<N]:rnd
1 1 1 0 1 1 0 0 N 0 1 s s s s s P P - t t t t t 1 0 0 d d d d d Rd=mpy(Rs.L,Rt.L)[:<<N]:rnd:sat
1 1 1 0 1 1 0 0 N 0 1 s s s s s P P - t t t t t 1 0 1 d d d d d Rd=mpy(Rs.L,Rt.H)[:<<N]:rnd:sat
1 1 1 0 1 1 0 0 N 0 1 s s s s s P P - t t t t t 1 1 0 d d d d d Rd=mpy(Rs.H,Rt.L)[:<<N]:rnd:sat
1 1 1 0 1 1 0 0 N 0 1 s s s s s P P - t t t t t 1 1 1 d d d d d Rd=mpy(Rs.H,Rt.H)[:<<N]:rnd:sat
1 1 1 0 1 1 1 0 N 0 0 s s s s s P P - t t t t t 1 0 0 x x x x x Rx+=mpy(Rs.L,Rt.L)[:<<N]:sat
1 1 1 0 1 1 1 0 N 0 0 s s s s s P P - t t t t t 1 0 1 x x x x x Rx+=mpy(Rs.L,Rt.H)[:<<N]:sat
1 1 1 0 1 1 1 0 N 0 0 s s s s s P P - t t t t t 1 1 0 x x x x x Rx+=mpy(Rs.H,Rt.L)[:<<N]:sat
1 1 1 0 1 1 1 0 N 0 0 s s s s s P P - t t t t t 1 1 1 x x x x x Rx+=mpy(Rs.H,Rt.H)[:<<N]:sat
1 1 1 0 1 1 1 0 N 0 1 s s s s s P P - t t t t t 0 0 0 x x x x x Rx-=mpy(Rs.L,Rt.L)[:<<N]
1 1 1 0 1 1 1 0 N 0 1 s s s s s P P - t t t t t 0 0 1 x x x x x Rx-=mpy(Rs.L,Rt.H)[:<<N]
1 1 1 0 1 1 1 0 N 0 1 s s s s s P P - t t t t t 0 1 0 x x x x x Rx-=mpy(Rs.H,Rt.L)[:<<N]
1 1 1 0 1 1 1 0 N 0 1 s s s s s P P - t t t t t 0 1 1 x x x x x Rx-=mpy(Rs.H,Rt.H)[:<<N]
1 1 1 0 1 1 1 0 N 0 1 s s s s s P P - t t t t t 1 0 0 x x x x x Rx-=mpy(Rs.L,Rt.L)[:<<N]:sat
1 1 1 0 1 1 1 0 N 0 1 s s s s s P P - t t t t t 1 0 1 x x x x x Rx-=mpy(Rs.L,Rt.H)[:<<N]:sat
1 1 1 0 1 1 1 0 N 0 1 s s s s s P P - t t t t t 1 1 0 x x x x x Rx-=mpy(Rs.H,Rt.L)[:<<N]:sat
1 1 1 0 1 1 1 0 N 0 1 s s s s s P P - t t t t t 1 1 1 x x x x x Rx-=mpy(Rs.H,Rt.H)[:<<N]:sat
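[Figure: 16 × 16 unsigned halfword multiply. Left panel: Rx+=mpyu(Rs.[HL],Rt.[HL])[:<<1] and Rd=mpyu(Rs.[HL],Rt.[HL])[:<<1] into Rx. Right panel: Rxx+=mpyu(Rs.[HL],Rt.[HL])[:<<1] and Rdd=mpyu(Rs.[HL],Rt.[HL])[:<<1] into Rxx. In both panels the 32-bit product is shifted left by 0-1 bits and passes through a mux that has 0x0 as one input.]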
Syntax Behavior
Rd=mpyu(Rs.[HL],Rt.[HL])[:<<1] Rd=(Rs.uh[01] * Rt.uh[01])[<<1];
Rdd=mpyu(Rs.[HL],Rt.[HL])[:<<1] Rdd=(Rs.uh[01] * Rt.uh[01])[<<1];
Rx+=mpyu(Rs.[HL],Rt.[HL])[:<<1] Rx=Rx+ (Rs.uh[01] * Rt.uh[01])[<<1];
Rx-=mpyu(Rs.[HL],Rt.[HL])[:<<1] Rx=Rx- (Rs.uh[01] * Rt.uh[01])[<<1];
Rxx+=mpyu(Rs.[HL],Rt.[HL])[:<<1] Rxx=Rxx+ (Rs.uh[01] * Rt.uh[01])[<<1];
Rxx-=mpyu(Rs.[HL],Rt.[HL])[:<<1] Rxx=Rxx- (Rs.uh[01] * Rt.uh[01])[<<1];
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 0 N 1 0 s s s s s P P - t t t t t - 0 0 d d d d d Rdd=mpyu(Rs.L,Rt.L)[:<<N]
1 1 1 0 0 1 0 0 N 1 0 s s s s s P P - t t t t t - 0 1 d d d d d Rdd=mpyu(Rs.L,Rt.H)[:<<N]
1 1 1 0 0 1 0 0 N 1 0 s s s s s P P - t t t t t - 1 0 d d d d d Rdd=mpyu(Rs.H,Rt.L)[:<<N]
1 1 1 0 0 1 0 0 N 1 0 s s s s s P P - t t t t t - 1 1 d d d d d Rdd=mpyu(Rs.H,Rt.H)[:<<N]
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 0 1 1 0 N 1 0 s s s s s P P - t t t t t 0 0 0 x x x x x Rxx+=mpyu(Rs.L,Rt.L)[:<<N]
1 1 1 0 0 1 1 0 N 1 0 s s s s s P P - t t t t t 0 0 1 x x x x x Rxx+=mpyu(Rs.L,Rt.H)[:<<N]
1 1 1 0 0 1 1 0 N 1 0 s s s s s P P - t t t t t 0 1 0 x x x x x Rxx+=mpyu(Rs.H,Rt.L)[:<<N]
1 1 1 0 0 1 1 0 N 1 0 s s s s s P P - t t t t t 0 1 1 x x x x x Rxx+=mpyu(Rs.H,Rt.H)[:<<N]
1 1 1 0 0 1 1 0 N 1 1 s s s s s P P - t t t t t 0 0 0 x x x x x Rxx-=mpyu(Rs.L,Rt.L)[:<<N]
1 1 1 0 0 1 1 0 N 1 1 s s s s s P P - t t t t t 0 0 1 x x x x x Rxx-=mpyu(Rs.L,Rt.H)[:<<N]
1 1 1 0 0 1 1 0 N 1 1 s s s s s P P - t t t t t 0 1 0 x x x x x Rxx-=mpyu(Rs.H,Rt.L)[:<<N]
1 1 1 0 0 1 1 0 N 1 1 s s s s s P P - t t t t t 0 1 1 x x x x x Rxx-=mpyu(Rs.H,Rt.H)[:<<N]
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 1 0 0 N 1 0 s s s s s P P - t t t t t 0 0 0 d d d d d Rd=mpyu(Rs.L,Rt.L)[:<<N]
1 1 1 0 1 1 0 0 N 1 0 s s s s s P P - t t t t t 0 0 1 d d d d d Rd=mpyu(Rs.L,Rt.H)[:<<N]
1 1 1 0 1 1 0 0 N 1 0 s s s s s P P - t t t t t 0 1 0 d d d d d Rd=mpyu(Rs.H,Rt.L)[:<<N]
1 1 1 0 1 1 0 0 N 1 0 s s s s s P P - t t t t t 0 1 1 d d d d d Rd=mpyu(Rs.H,Rt.H)[:<<N]
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 1 1 0 N 1 0 s s s s s P P - t t t t t 0 0 0 x x x x x Rx+=mpyu(Rs.L,Rt.L)[:<<N]
1 1 1 0 1 1 1 0 N 1 0 s s s s s P P - t t t t t 0 0 1 x x x x x Rx+=mpyu(Rs.L,Rt.H)[:<<N]
1 1 1 0 1 1 1 0 N 1 0 s s s s s P P - t t t t t 0 1 0 x x x x x Rx+=mpyu(Rs.H,Rt.L)[:<<N]
1 1 1 0 1 1 1 0 N 1 0 s s s s s P P - t t t t t 0 1 1 x x x x x Rx+=mpyu(Rs.H,Rt.H)[:<<N]
1 1 1 0 1 1 1 0 N 1 1 s s s s s P P - t t t t t 0 0 0 x x x x x Rx-=mpyu(Rs.L,Rt.L)[:<<N]
1 1 1 0 1 1 1 0 N 1 1 s s s s s P P - t t t t t 0 0 1 x x x x x Rx-=mpyu(Rs.L,Rt.H)[:<<N]
1 1 1 0 1 1 1 0 N 1 1 s s s s s P P - t t t t t 0 1 0 x x x x x Rx-=mpyu(Rs.H,Rt.L)[:<<N]
1 1 1 0 1 1 1 0 N 1 1 s s s s s P P - t t t t t 0 1 1 x x x x x Rx-=mpyu(Rs.H,Rt.H)[:<<N]
Perform a 32 × 32 carryless polynomial multiply using 32-bit source registers Rs and Rt. The 64-bit
result is optionally accumulated (XORed) with the destination register. Finite field multiply
instructions are useful for many algorithms, including scramble code generation, cryptographic
algorithms, convolutional codes, and Reed-Solomon codes.
[Figure: Rxx += pmpyw(Rs,Rt). A 32 × 32 carryless polynomial multiply of Rs and Rt, followed by an XOR into Rxx.]
Syntax Behavior
Rdd=pmpyw(Rs,Rt) x = Rs.uw[0];
y = Rt.uw[0];
prod = 0;
for(i=0; i < 32; i++) {
if((y >> i) & 1) prod ^= (x << i);
}
Rdd = prod;
Rxx^=pmpyw(Rs,Rt) x = Rs.uw[0];
y = Rt.uw[0];
prod = 0;
for(i=0; i < 32; i++) {
if((y >> i) & 1) prod ^= (x << i);
}
Rxx ^= prod;
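The shift-and-XOR loop in the behavior column translates directly to C. The following is a minimal sketch; the function names pmpyw and pmpyw_acc are chosen here for illustration and are not compiler intrinsics.

```c
#include <stdint.h>

/* 32 x 32 carryless (polynomial over GF(2)) multiply: for every set bit i
 * of rt, XOR a copy of rs shifted left by i into the 64-bit product. */
static uint64_t pmpyw(uint32_t rs, uint32_t rt) {
    uint64_t x = rs;
    uint64_t prod = 0;
    for (int i = 0; i < 32; i++) {
        if ((rt >> i) & 1u) prod ^= x << i;
    }
    return prod;
}

/* Accumulating form: the product is XORed into the destination. */
static uint64_t pmpyw_acc(uint64_t rxx, uint32_t rs, uint32_t rt) {
    return rxx ^ pmpyw(rs, rt);
}
```

As a check, (x + 1)·(x + 1) = x² + 1 over GF(2), so multiplying 3 by 3 gives 5 rather than the integer product 9.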
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 1 0 1 0 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rdd=pmpyw(Rs,Rt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 0 1 1 1 0 0 1 s s s s s P P 0 t t t t t 1 1 1 x x x x x Rxx^=pmpyw(Rs,Rt)
[Figure: two mux-selected 48-bit products, each shifted left by 0-1 bits, are added together into Rxx.]
Syntax Behavior
Rdd=vrmpyweh(Rss,Rtt)[:<<1] Rdd = (Rss.w[1] * Rtt.h[2])[<<1] + (Rss.w[0] *
Rtt.h[0])[<<1];
Rdd=vrmpywoh(Rss,Rtt)[:<<1] Rdd = (Rss.w[1] * Rtt.h[3])[<<1] + (Rss.w[0] *
Rtt.h[1])[<<1];
Rxx+=vrmpyweh(Rss,Rtt)[:<<1] Rxx += (Rss.w[1] * Rtt.h[2])[<<1] + (Rss.w[0] *
Rtt.h[0])[<<1];
Rxx+=vrmpywoh(Rss,Rtt)[:<<1] Rxx += (Rss.w[1] * Rtt.h[3])[<<1] + (Rss.w[0] *
Rtt.h[1])[<<1];
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 N 0 1 s s s s s P P 0 t t t t t 0 1 0 d d d d d Rdd=vrmpywoh(Rss,Rtt)[:<<N]
1 1 1 0 1 0 0 0 N 1 0 s s s s s P P 0 t t t t t 1 0 0 d d d d d Rdd=vrmpyweh(Rss,Rtt)[:<<N]
1 1 1 0 1 0 1 0 N 1 1 s s s s s P P 0 t t t t t 1 1 0 x x x x x Rxx+=vrmpywoh(Rss,Rtt)[:<<N]
[Figure: Rs and Rt enter a 32 × 32 multiply that forms a 64-bit product; the result is written to Rd.]
Syntax Behavior
Rd=mpy(Rs,Rt.H):<<1:rnd:sat Rd = sat32((((Rs * Rt.h[1])<<1)+0x8000)>>16);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 1 0 1 0 0 1 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rd=mpy(Rs,Rt):rnd
1 1 1 0 1 1 0 1 0 1 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rd=mpyu(Rs,Rt)
1 1 1 0 1 1 0 1 0 1 1 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rd=mpysu(Rs,Rt)
1 1 1 0 1 1 0 1 1 0 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=mpy(Rs,Rt.H):<<1:sat
1 1 1 0 1 1 0 1 1 0 1 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rd=mpy(Rs,Rt.L):<<1:sat
1 1 1 0 1 1 0 1 1 0 1 s s s s s P P 0 t t t t t 1 0 0 d d d d d Rd=mpy(Rs,Rt.H):<<1:rnd:sat
1 1 1 0 1 1 0 1 1 1 1 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=mpy(Rs,Rt):<<1:sat
1 1 1 0 1 1 0 1 1 1 1 s s s s s P P 0 t t t t t 1 0 0 d d d d d Rd=mpy(Rs,Rt.L):<<1:rnd:sat
1 1 1 0 1 1 0 1 N 0 N s s s s s P P 0 t t t t t 0 N N d d d d d Rd=mpy(Rs,Rt)[:<<N]
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 1 1 1 0 1 1 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rx+=mpy(Rs,Rt):<<1:sat
1 1 1 0 1 1 1 1 0 1 1 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rx-=mpy(Rs,Rt):<<1:sat
[Figure: Rs and Rt enter a 32 × 32 multiply; the 64-bit product feeds a 64-bit add/sub whose result is written to Rxx.]
Syntax Behavior
Rdd=mpy(Rs,Rt) Rdd=(Rs * Rt);
Rdd=mpyu(Rs,Rt) Rdd=(Rs.uw[0] * Rt.uw[0]);
Rxx[+-]=mpy(Rs,Rt) Rxx= Rxx [+-] (Rs * Rt);
Rxx[+-]=mpyu(Rs,Rt) Rxx= Rxx [+-] (Rs.uw[0] * Rt.uw[0]);
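The difference between the signed and unsigned forms is only in how the 32-bit sources are widened before the multiply. A short C sketch, with hypothetical function names, makes this explicit:

```c
#include <stdint.h>

/* Rdd=mpy(Rs,Rt): sources are sign-extended to 64 bits. */
static int64_t mpy64(int32_t rs, int32_t rt) {
    return (int64_t)rs * (int64_t)rt;
}

/* Rdd=mpyu(Rs,Rt): sources are zero-extended to 64 bits. */
static uint64_t mpyu64(uint32_t rs, uint32_t rt) {
    return (uint64_t)rs * (uint64_t)rt;
}

/* Rxx+=mpy(Rs,Rt): 64-bit accumulate, modeled with wraparound. */
static int64_t mpy64_acc(int64_t rxx, int32_t rs, int32_t rt) {
    return (int64_t)((uint64_t)rxx + (uint64_t)mpy64(rs, rt));
}
```

The same bit pattern 0xFFFFFFFF in both sources yields 1 for the signed form and 0xFFFFFFFE00000001 for the unsigned form.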
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 1 0 0 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rdd=mpy(Rs,Rt)
1 1 1 0 0 1 0 1 0 1 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rdd=mpyu(Rs,Rt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 0 1 1 1 0 0 0 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rxx+=mpy(Rs,Rt)
1 1 1 0 0 1 1 1 0 0 1 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rxx-=mpy(Rs,Rt)
1 1 1 0 0 1 1 1 0 1 0 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rxx+=mpyu(Rs,Rt)
1 1 1 0 0 1 1 1 0 1 1 s s s s s P P 0 t t t t t 0 0 0 x x x x x Rxx-=mpyu(Rs,Rt)
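[Figure: Rxx+=vdmpy(Rss,Rtt):sat. Four halfword products of Rss and Rtt (32 bits each) are summed in pairs by two adders, and each sum is saturated to 32 bits (Sat_32) to form the two words of Rxx.]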
Syntax Behavior
Rdd=vdmpy(Rss,Rtt):<<1:sat Rdd.w[0]=sat32(((Rss.h[0] * Rtt.h[0])<<1) + ((Rss.h[1] * Rtt.h[1])<<1));
Rdd.w[1]=sat32(((Rss.h[2] * Rtt.h[2])<<1) + ((Rss.h[3] * Rtt.h[3])<<1));
Rdd=vdmpy(Rss,Rtt):sat Rdd.w[0]=sat32(((Rss.h[0] * Rtt.h[0])<<0) + ((Rss.h[1] * Rtt.h[1])<<0));
Rdd.w[1]=sat32(((Rss.h[2] * Rtt.h[2])<<0) + ((Rss.h[3] * Rtt.h[3])<<0));
Rxx+=vdmpy(Rss,Rtt):<<1:sat Rxx.w[0]=sat32(Rxx.w[0] + ((Rss.h[0] * Rtt.h[0])<<1) + ((Rss.h[1] * Rtt.h[1])<<1));
Rxx.w[1]=sat32(Rxx.w[1] + ((Rss.h[2] * Rtt.h[2])<<1) + ((Rss.h[3] * Rtt.h[3])<<1));
Rxx+=vdmpy(Rss,Rtt):sat Rxx.w[0]=sat32(Rxx.w[0] + ((Rss.h[0] * Rtt.h[0])<<0) + ((Rss.h[1] * Rtt.h[1])<<0));
Rxx.w[1]=sat32(Rxx.w[1] + ((Rss.h[2] * Rtt.h[2])<<0) + ((Rss.h[3] * Rtt.h[3])<<0));
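One word lane of the dual multiply can be modeled in C as shown below. This is a sketch under stated assumptions: the names sat32 and vdmpy_lane are hypothetical, and the sum is formed at 64-bit precision so that saturation is applied once, to the complete sum.

```c
#include <stdint.h>

/* Saturate a 64-bit intermediate to the signed 32-bit range. */
static int32_t sat32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* One lane: acc + (s0*t0)[<<1] + (s1*t1)[<<1], saturated to 32 bits.
 * Pass acc = 0 to model the non-accumulating Rdd form. */
static int32_t vdmpy_lane(int32_t acc, int16_t s0, int16_t t0,
                          int16_t s1, int16_t t1, int shift1) {
    int64_t p0 = (int64_t)s0 * t0;
    int64_t p1 = (int64_t)s1 * t1;
    if (shift1) { p0 *= 2; p1 *= 2; }   /* optional :<<1 on each product */
    return sat32((int64_t)acc + p0 + p1);
}
```

Lane 0 would use halfwords 0 and 1 of Rss and Rtt, and lane 1 would use halfwords 2 and 3.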
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 N 0 0 s s s s s P P 0 t t t t t 1 0 0 d d d d d Rdd=vdmpy(Rss,Rtt)[:<<N]:sat
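[Figure: Rd=vdmpy(Rss,Rtt):rnd:sat. Four halfword products of Rss and Rtt (32 bits each) are summed in pairs by two adders, and each sum is saturated to 32 bits (Sat_32) to form Rd.]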
Syntax Behavior
Rd=vdmpy(Rss,Rtt)[:<<1]:rnd:sat Rd.h[0]=(sat32((Rss.h[0] * Rtt.h[0])[<<1] + (Rss.h[1] * Rtt.h[1])[<<1] + 0x8000)).h[1];
Rd.h[1]=(sat32((Rss.h[2] * Rtt.h[2])[<<1] + (Rss.h[3] * Rtt.h[3])[<<1] + 0x8000)).h[1];
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 1 N 0 0 s s s s s P P 0 t t t t t 0 0 0 d d d d d Rd=vdmpy(Rss,Rtt)[:<<N]:rnd:sat
[Figure: eight byte products of Rss and Rtt (16 bits each) are summed in two groups of four by two adders, producing the two 32-bit words of Rxx.]
Syntax Behavior
Rdd=vrmpybsu(Rss,Rtt) Rdd.w[0]=((Rss.b[0] * Rtt.ub[0]) + (Rss.b[1] * Rtt.ub[1]) + (Rss.b[2] * Rtt.ub[2]) + (Rss.b[3] * Rtt.ub[3]));
Rdd.w[1]=((Rss.b[4] * Rtt.ub[4]) + (Rss.b[5] * Rtt.ub[5]) + (Rss.b[6] * Rtt.ub[6]) + (Rss.b[7] * Rtt.ub[7]));
Rdd=vrmpybu(Rss,Rtt) Rdd.w[0]=((Rss.ub[0] * Rtt.ub[0]) + (Rss.ub[1] * Rtt.ub[1]) + (Rss.ub[2] * Rtt.ub[2]) + (Rss.ub[3] * Rtt.ub[3]));
Rdd.w[1]=((Rss.ub[4] * Rtt.ub[4]) + (Rss.ub[5] * Rtt.ub[5]) + (Rss.ub[6] * Rtt.ub[6]) + (Rss.ub[7] * Rtt.ub[7]));
Rxx+=vrmpybsu(Rss,Rtt) Rxx.w[0]=(Rxx.w[0] + (Rss.b[0] * Rtt.ub[0]) + (Rss.b[1] * Rtt.ub[1]) + (Rss.b[2] * Rtt.ub[2]) + (Rss.b[3] * Rtt.ub[3]));
Rxx.w[1]=(Rxx.w[1] + (Rss.b[4] * Rtt.ub[4]) + (Rss.b[5] * Rtt.ub[5]) + (Rss.b[6] * Rtt.ub[6]) + (Rss.b[7] * Rtt.ub[7]));
Rxx+=vrmpybu(Rss,Rtt) Rxx.w[0]=(Rxx.w[0] + (Rss.ub[0] * Rtt.ub[0]) + (Rss.ub[1] * Rtt.ub[1]) + (Rss.ub[2] * Rtt.ub[2]) + (Rss.ub[3] * Rtt.ub[3]));
Rxx.w[1]=(Rxx.w[1] + (Rss.ub[4] * Rtt.ub[4]) + (Rss.ub[5] * Rtt.ub[5]) + (Rss.ub[6] * Rtt.ub[6]) + (Rss.ub[7] * Rtt.ub[7]));
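The signed-by-unsigned reduction for one 32-bit result word can be sketched in C. The function name vrmpybsu_word is hypothetical; each argument holds the four source bytes for one lane packed into a 32-bit value, and the accumulate is modeled with 32-bit wraparound because the behavior shows no saturation.

```c
#include <stdint.h>

/* One word lane of Rxx+=vrmpybsu(Rss,Rtt): four signed bytes from s are
 * multiplied by four unsigned bytes from t and summed into acc.
 * Pass acc = 0 to model the non-accumulating Rdd form. */
static int32_t vrmpybsu_word(int32_t acc, uint32_t s, uint32_t t) {
    int32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        int8_t  sb = (int8_t)(uint8_t)(s >> (8 * i));  /* Rss.b[i]  */
        uint8_t tb = (uint8_t)(t >> (8 * i));          /* Rtt.ub[i] */
        sum += (int32_t)sb * tb;
    }
    return (int32_t)((uint32_t)acc + (uint32_t)sum);
}
```

The all-unsigned vrmpybu form would differ only in reading the Rss bytes as uint8_t.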
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 1 0 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rdd=vrmpybu(Rss,Rtt)
1 1 1 0 1 0 0 0 1 1 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rdd=vrmpybsu(Rss,Rtt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 1 0 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rxx+=vrmpybu(Rss,Rtt)
1 1 1 0 1 0 1 0 1 1 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rxx+=vrmpybsu(Rss,Rtt)
[Figure: eight byte products of Rss and Rtt (16 bits each) feed Rxx.]
Syntax Behavior
Rdd=vdmpybsu(Rss,Rtt):sat Rdd.h[0]=sat16(((Rss.b[0] * Rtt.ub[0]) + (Rss.b[1]
* Rtt.ub[1])));
Rdd.h[1]=sat16(((Rss.b[2] * Rtt.ub[2]) +
(Rss.b[3] * Rtt.ub[3])));
Rdd.h[2]=sat16(((Rss.b[4] * Rtt.ub[4]) + (Rss.b[5]
* Rtt.ub[5])));
Rdd.h[3]=sat16(((Rss.b[6] * Rtt.ub[6]) + (Rss.b[7]
* Rtt.ub[7])));
Rxx+=vdmpybsu(Rss,Rtt):sat Rxx.h[0]=sat16((Rxx.h[0] + (Rss.b[0] * Rtt.ub[0])
+ (Rss.b[1] * Rtt.ub[1])));
Rxx.h[1]=sat16((Rxx.h[1] + (Rss.b[2] * Rtt.ub[2])
+ (Rss.b[3] * Rtt.ub[3])));
Rxx.h[2]=sat16((Rxx.h[2] + (Rss.b[4] * Rtt.ub[4])
+ (Rss.b[5] * Rtt.ub[5])));
Rxx.h[3]=sat16((Rxx.h[3] + (Rss.b[6] * Rtt.ub[6])
+ (Rss.b[7] * Rtt.ub[7])));
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 1 0 1 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rdd=vdmpybsu(Rss,Rtt):sat
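[Figure: Rxx+=vmpyeh(Rss,Rtt):sat. Two halfword products of Rss and Rtt (32 bits each) are shifted left by 0-1 bits, added, and saturated to 32 bits (Sat32) to form the two words of Rxx.]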
Syntax Behavior
Rdd=vmpyeh(Rss,Rtt):<<1:sat Rdd.w[0]=sat32((Rss.h[0] * Rtt.h[0])<<1);
Rdd.w[1]=sat32((Rss.h[2] * Rtt.h[2])<<1);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 N 0 0 s s s s s P P 0 t t t t t 1 1 0 d d d d d Rdd=vmpyeh(Rss,Rtt)[:<<N]:sat
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 0 0 1 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rxx+=vmpyeh(Rss,Rtt)
1 1 1 0 1 0 1 0 N 0 0 s s s s s P P 0 t t t t t 1 1 0 x x x x x Rxx+=vmpyeh(Rss,Rtt)[:<<N]:sat
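[Figure: Rxx+=vmpyh(Rs,Rt):sat. Two halfword products of Rs and Rt (32 bits each) are shifted left by 0-1 bits, added, and saturated to 32 bits (Sat32) to form the two words of Rxx.]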
Syntax Behavior
Rdd=vmpyh(Rs,Rt)[:<<1]:sat Rdd.w[0]=sat32((Rs.h[0] * Rt.h[0])[<<1]);
Rdd.w[1]=sat32((Rs.h[1] * Rt.h[1])[<<1]);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 1 N 0 0 s s s s s P P 0 t t t t t 1 0 1 d d d d d Rdd=vmpyh(Rs,Rt)[:<<N]:sat
1 1 1 0 0 1 1 1 N 0 0 s s s s s P P 0 t t t t t 1 0 1 x x x x x Rxx+=vmpyh(Rs,Rt)[:<<N]:sat
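[Figure: Rd=vmpyh(Rs,Rt):rnd:sat. Two halfword products of Rs and Rt (32 bits each) are shifted left by 0-1 bits, added to 0x8000, and saturated to 32 bits (Sat32); the high 16 bits of each result form the two halfwords of Rd.]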
Syntax Behavior
Rd=vmpyh(Rs,Rt)[:<<1]:rnd:sat Rd.h[1]=(sat32((Rs.h[1] * Rt.h[1])[<<1] + 0x8000)).h[1];
Rd.h[0]=(sat32((Rs.h[0] * Rt.h[0])[<<1] + 0x8000)).h[1];
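A single halfword lane of this round-and-saturate multiply can be modeled in C. The names sat32 and vmpyh_rnd_sat_lane are hypothetical, and the sketch assumes an arithmetic right shift for negative values, as on common compilers.

```c
#include <stdint.h>

/* Saturate a 64-bit intermediate to the signed 32-bit range. */
static int32_t sat32(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* One lane: (sat32((s * t)[<<1] + 0x8000)).h[1] */
static int16_t vmpyh_rnd_sat_lane(int16_t s, int16_t t, int shift1) {
    int64_t v = (int64_t)s * t;          /* signed 16 x 16 product       */
    if (shift1) v *= 2;                  /* optional :<<1                */
    v += 0x8000;                         /* rounding constant            */
    return (int16_t)(sat32(v) >> 16);    /* keep the upper halfword only */
}
```

With :<<1 this is the usual fractional (Q15) multiply: 0.5 × 0.5 gives 0.25, and -1.0 × -1.0 saturates to the largest positive halfword.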
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 1 0 1 N 0 1 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rd=vmpyh(Rs,Rt)[:<<N]:rnd:sat
Syntax Behavior
Rdd=vmpyhsu(Rs,Rt)[:<<1]:sat Rdd.w[0]=sat32((Rs.h[0] * Rt.uh[0])[<<1]);
Rdd.w[1]=sat32((Rs.h[1] * Rt.uh[1])[<<1]);
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 1 N 0 0 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rdd=vmpyhsu(Rs,Rt)[:<<N]:sat
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 0 1 1 1 N 1 1 s s s s s P P 0 t t t t t 1 0 1 x x x x x Rxx+=vmpyhsu(Rs,Rt)[:<<N]:sat
[Figure: four halfword products (32 bits each) are added into a 64-bit sum that is written to the 64-bit register pair Rdd.]
Syntax Behavior
Rdd=vrmpyh(Rss,Rtt) Rdd = (Rss.h[0] * Rtt.h[0]) + (Rss.h[1] * Rtt.h[1]) +
(Rss.h[2] * Rtt.h[2]) + (Rss.h[3] * Rtt.h[3]);
Rxx+=vrmpyh(Rss,Rtt) Rxx = Rxx + (Rss.h[0] * Rtt.h[0]) + (Rss.h[1] *
Rtt.h[1]) + (Rss.h[2] * Rtt.h[2]) + (Rss.h[3] *
Rtt.h[3]);
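The four-way halfword reduction can be sketched in C with the register pairs passed as 64-bit values. The function name vrmpyh is used here only for illustration, and the accumulate is modeled with 64-bit wraparound because the behavior shows no saturation.

```c
#include <stdint.h>

/* Rxx+=vrmpyh(Rss,Rtt): sum of four signed 16 x 16 products added to acc.
 * Pass acc = 0 to model the non-accumulating Rdd form. */
static int64_t vrmpyh(int64_t acc, uint64_t rss, uint64_t rtt) {
    int64_t sum = 0;
    for (int i = 0; i < 4; i++) {
        int16_t a = (int16_t)(uint16_t)(rss >> (16 * i));  /* Rss.h[i] */
        int16_t b = (int16_t)(uint16_t)(rtt >> (16 * i));  /* Rtt.h[i] */
        sum += (int64_t)a * b;
    }
    return (int64_t)((uint64_t)acc + (uint64_t)sum);
}
```

This is the inner step of a 16-bit dot product: each call would consume four element pairs and keep a 64-bit running total.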
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 1 0 0 0 0 0 0 s s s s s P P 0 t t t t t 0 1 0 d d d d d Rdd=vrmpyh(Rss,Rtt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 1 0 1 0 0 0 0 s s s s s P P 0 t t t t t 0 1 0 x x x x x Rxx+=vrmpyh(Rss,Rtt)
Syntax Behavior
Rdd=vmpybsu(Rs,Rt) Rdd.h[0]=((Rs.b[0] * Rt.ub[0]));
Rdd.h[1]=((Rs.b[1] * Rt.ub[1]));
Rdd.h[2]=((Rs.b[2] * Rt.ub[2]));
Rdd.h[3]=((Rs.b[3] * Rt.ub[3]));
Rdd=vmpybu(Rs,Rt) Rdd.h[0]=((Rs.ub[0] * Rt.ub[0]));
Rdd.h[1]=((Rs.ub[1] * Rt.ub[1]));
Rdd.h[2]=((Rs.ub[2] * Rt.ub[2]));
Rdd.h[3]=((Rs.ub[3] * Rt.ub[3]));
Rxx+=vmpybsu(Rs,Rt) Rxx.h[0]=(Rxx.h[0]+(Rs.b[0] * Rt.ub[0]));
Rxx.h[1]=(Rxx.h[1]+(Rs.b[1] * Rt.ub[1]));
Rxx.h[2]=(Rxx.h[2]+(Rs.b[2] * Rt.ub[2]));
Rxx.h[3]=(Rxx.h[3]+(Rs.b[3] * Rt.ub[3]));
Rxx+=vmpybu(Rs,Rt) Rxx.h[0]=(Rxx.h[0]+(Rs.ub[0] * Rt.ub[0]));
Rxx.h[1]=(Rxx.h[1]+(Rs.ub[1] * Rt.ub[1]));
Rxx.h[2]=(Rxx.h[2]+(Rs.ub[2] * Rt.ub[2]));
Rxx.h[3]=(Rxx.h[3]+(Rs.ub[3] * Rt.ub[3]));
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 1 0 1 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rdd=vmpybsu(Rs,Rt)
1 1 1 0 0 1 0 1 1 0 0 s s s s s P P 0 t t t t t 0 0 1 d d d d d Rdd=vmpybu(Rs,Rt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 0 1 1 1 1 0 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rxx+=vmpybu(Rs,Rt)
1 1 1 0 0 1 1 1 1 1 0 s s s s s P P 0 t t t t t 0 0 1 x x x x x Rxx+=vmpybsu(Rs,Rt)
Perform a vector 16 × 16 carryless polynomial multiply using 32-bit source registers Rs and Rt.
Store the 64-bit result in packed H,H,L,L format in the destination register. The result can
optionally be accumulated (XORed) into the destination register. Finite field multiply
instructions are useful for many algorithms, including scramble code generation, cryptographic
algorithms, convolutional codes, and Reed-Solomon codes.
[Figure: Rxx^=vpmpyh(Rs,Rt). Two 16 x 16 carryless polynomial multiplies of the halfwords of Rs and Rt; each product is XORed into Rxx.]
Syntax Behavior
Rdd=vpmpyh(Rs,Rt) x0 = Rs.uh[0];
x1 = Rs.uh[1];
y0 = Rt.uh[0];
y1 = Rt.uh[1];
prod0 = prod1 = 0;
for(i=0; i < 16; i++) {
if((y0 >> i) & 1) prod0 ^= (x0 << i);
if((y1 >> i) & 1) prod1 ^= (x1 << i);
}
Rdd.h[0]=prod0.uh[0];
Rdd.h[1]=prod1.uh[0];
Rdd.h[2]=prod0.uh[1];
Rdd.h[3]=prod1.uh[1];
Rxx^=vpmpyh(Rs,Rt) x0 = Rs.uh[0];
x1 = Rs.uh[1];
y0 = Rt.uh[0];
y1 = Rt.uh[1];
prod0 = prod1 = 0;
for(i=0; i < 16; i++) {
if((y0 >> i) & 1) prod0 ^= (x0 << i);
if((y1 >> i) & 1) prod1 ^= (x1 << i);
}
Rxx.h[0]=Rxx.uh[0] ^ prod0.uh[0];
Rxx.h[1]=Rxx.uh[1] ^ prod1.uh[0];
Rxx.h[2]=Rxx.uh[2] ^ prod0.uh[1];
Rxx.h[3]=Rxx.uh[3] ^ prod1.uh[1];
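The behavior listed above can be sketched in portable C. This is a minimal model for illustration only, not Qualcomm code: the names clmul16, vpmpyh_model, and vpmpyh_acc_model are hypothetical, and a register pair is assumed to be held in a uint64_t with h[0] in the least significant halfword.

```c
#include <stdint.h>

/* Carryless 16 x 16 -> 32-bit polynomial multiply over GF(2). */
static uint32_t clmul16(uint16_t x, uint16_t y)
{
    uint32_t prod = 0;
    for (int i = 0; i < 16; i++) {
        if ((y >> i) & 1u)
            prod ^= (uint32_t)x << i;   /* XOR replaces addition, so no carries */
    }
    return prod;
}

/* Model of Rdd=vpmpyh(Rs,Rt): two products packed as H,H,L,L halfwords. */
uint64_t vpmpyh_model(uint32_t rs, uint32_t rt)
{
    uint32_t prod0 = clmul16((uint16_t)rs, (uint16_t)rt);
    uint32_t prod1 = clmul16((uint16_t)(rs >> 16), (uint16_t)(rt >> 16));
    uint64_t rdd = 0;
    rdd |= (uint64_t)(prod0 & 0xffffu);        /* h[0] = low half of prod0  */
    rdd |= (uint64_t)(prod1 & 0xffffu) << 16;  /* h[1] = low half of prod1  */
    rdd |= (uint64_t)(prod0 >> 16) << 32;      /* h[2] = high half of prod0 */
    rdd |= (uint64_t)(prod1 >> 16) << 48;      /* h[3] = high half of prod1 */
    return rdd;
}

/* Model of Rxx^=vpmpyh(Rs,Rt): XOR-accumulate into the destination. */
uint64_t vpmpyh_acc_model(uint64_t rxx, uint32_t rs, uint32_t rt)
{
    return rxx ^ vpmpyh_model(rs, rt);
}
```

As a quick check, multiplying the polynomial x + 1 (0x3) by itself yields x^2 + 1 (0x5), because the two middle terms cancel under XOR.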
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse t5 MinOp d5
1 1 1 0 0 1 0 1 1 1 0 s s s s s P P 0 t t t t t 1 1 1 d d d d d Rdd=vpmpyh(Rs,Rt)
ICLASS RegType MajOp s5 Parse t5 MinOp x5
1 1 1 0 0 1 1 1 1 0 1 s s s s s P P 0 t t t t t 1 1 1 x x x x x Rxx^=vpmpyh(Rs,Rt)
Notes
■ The predicate generated by this instruction cannot be used as a .new predicate, nor can it be
automatically ANDed with another predicate.
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 1 1 1 - s s s s s P P - t t t t t 1 1 - d d d d d Rdd=decbin(Rss,Rtt)
Saturate
Saturate a single scalar value.
The sath instruction saturates a signed 32-bit number to a signed 16-bit number, which is sign-
extended back to 32 bits and placed in the destination register. The minimum negative value of
the result is 0xffff8000 and the maximum positive value is 0x00007fff.
The satuh instruction saturates a signed 32-bit number to an unsigned 16-bit number, which is
zero-extended back to 32 bits and placed in the destination register. The minimum value of the
result is 0 and the maximum value is 0x0000ffff.
The satb instruction saturates a signed 32-bit number to a signed 8-bit number, which is sign-
extended back to 32 bits and placed in the destination register. The minimum value of the result is
0xffffff80 and the maximum value is 0x0000007f.
The satub instruction saturates a signed 32-bit number to an unsigned 8-bit number, which is
zero-extended back to 32 bits and placed in the destination register. The minimum value of the
result is 0 and the maximum value is 0x000000ff.
Syntax Behavior
Rd=sat(Rss) Rd = sat32(Rss);
Rd=satb(Rs) Rd = sat8(Rs);
Rd=sath(Rs) Rd = sat16(Rs);
Rd=satub(Rs) Rd = usat8(Rs);
Rd=satuh(Rs) Rd = usat16(Rs);
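The limits described above can be illustrated with a small C model. This is a sketch under stated assumptions: the function names are hypothetical, each function returns the 32-bit destination register image as an unsigned bit pattern, and no status-register side effects are modeled.

```c
#include <stdint.h>

/* Clamp v to the inclusive range [lo, hi] (hypothetical helper). */
static int64_t clamp64(int64_t v, int64_t lo, int64_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Rd=sat(Rss): 64-bit source saturated to signed 32 bits. */
uint32_t sat_model(int64_t rss)  { return (uint32_t)clamp64(rss, INT32_MIN, INT32_MAX); }

/* Rd=sath(Rs): signed 16-bit result, sign-extended to 32 bits. */
uint32_t sath_model(int32_t rs)  { return (uint32_t)clamp64(rs, -32768, 32767); }

/* Rd=satuh(Rs): unsigned 16-bit result, zero-extended to 32 bits. */
uint32_t satuh_model(int32_t rs) { return (uint32_t)clamp64(rs, 0, 65535); }

/* Rd=satb(Rs): signed 8-bit result, sign-extended to 32 bits. */
uint32_t satb_model(int32_t rs)  { return (uint32_t)clamp64(rs, -128, 127); }

/* Rd=satub(Rs): unsigned 8-bit result, zero-extended to 32 bits. */
uint32_t satub_model(int32_t rs) { return (uint32_t)clamp64(rs, 0, 255); }
```

For instance, sath_model(-40000) gives 0xffff8000 and satub_model(300) gives 0x000000ff, matching the minimum and maximum values stated in the text.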
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 1 1 0 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=sat(Rss)
1 0 0 0 1 1 0 0 1 1 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rd=sath(Rs)
1 0 0 0 1 1 0 0 1 1 0 s s s s s P P - - - - - - 1 0 1 d d d d d Rd=satuh(Rs)
1 0 0 0 1 1 0 0 1 1 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rd=satub(Rs)
1 0 0 0 1 1 0 0 1 1 0 s s s s s P P - - - - - - 1 1 1 d d d d d Rd=satb(Rs)
Swizzle bytes
Swizzle the bytes of a word. This instruction is useful for converting between little-endian and
big-endian formats.
[Figure: Rd=swiz(Rs). The bytes of Rs are placed in Rd in reverse order.]
Syntax Behavior
Rd=swiz(Rs) Rd.b[0]=Rs.b[3];
Rd.b[1]=Rs.b[2];
Rd.b[2]=Rs.b[1];
Rd.b[3]=Rs.b[0];
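As an illustration, the byte reversal can be written in portable C as below. The function name swiz_model is hypothetical, and the sketch models only the data movement shown in the Behavior column.

```c
#include <stdint.h>

/* Model of Rd=swiz(Rs): reverse the four bytes of a 32-bit word. */
uint32_t swiz_model(uint32_t rs)
{
    return (rs >> 24)                    /* b[3] -> b[0] */
         | ((rs >> 8) & 0x0000ff00u)     /* b[2] -> b[1] */
         | ((rs << 8) & 0x00ff0000u)     /* b[1] -> b[2] */
         | (rs << 24);                   /* b[0] -> b[3] */
}
```

Applying the function twice returns the original value, which is why one operation serves for conversion in both directions.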
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 1 0 0 1 0 0 s s s s s P P - - - - - - 1 1 1 d d d d d Rd=swiz(Rs)
Vector align
Align a vector. Use the immediate amount, or the least significant three bits of a predicate register
as the number of bytes to align. Shift the Rss register pair right by this number of bytes. Fill the
vacated positions with the least significant elements from Rtt.
Syntax Behavior
Rdd=valignb(Rtt,Rss,#u3) Rdd = (Rss >>> #u*8)|(Rtt << ((8-#u)*8));
Rdd=valignb(Rtt,Rss,Pu) PREDUSE_TIMING;
Rdd = Rss >>> (Pu&0x7)*8|(Rtt << (8-(Pu&0x7))*8);
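The alignment operation can be sketched in C as follows. The function name valignb_model is hypothetical, and register pairs are assumed to be held in uint64_t values. The explicit check for a zero byte count is a C detail only: a 64-bit shift of a 64-bit value is undefined in C, whereas the Behavior expression simply yields Rss in that case.

```c
#include <stdint.h>

/* Model of Rdd=valignb(Rtt,Rss,#u3); n is the byte count (0..7). */
uint64_t valignb_model(uint64_t rtt, uint64_t rss, unsigned n)
{
    n &= 7u;                         /* only three bits are significant */
    if (n == 0)
        return rss;                  /* avoid an undefined 64-bit shift in C */
    /* Shift Rss right by n bytes, fill the top with the low n bytes of Rtt. */
    return (rss >> (n * 8)) | (rtt << ((8 - n) * 8));
}
```

With rtt = 0x1122334455667788, rss = 0xaabbccddeeff0011, and n = 3, the result is 0x667788aabbccddee: the low three bytes of Rtt sit above the upper five bytes of Rss.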
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 0 0 - - s s s s s P P - t t t t t i i i d d d d d Rdd=valignb(Rtt,Rss,#u3)
ICLASS RegType Maj s5 Parse t5 u2 d5
1 1 0 0 0 0 1 0 0 - - s s s s s P P - t t t t t - u u d d d d d Rdd=valignb(Rtt,Rss,Pu)
[Figure: Rd=vrndwh(Rss). 0x8000 is added to each word of Rss; the upper halfword of each sum is written to Rd.h[1] and Rd.h[0].]
Syntax Behavior
Rd=vrndwh(Rss) for (i=0;i<2;i++) {
Rd.h[i]=(Rss.w[i]+0x08000).h[1];
}
Rd=vrndwh(Rss):sat for (i=0;i<2;i++) {
Rd.h[i]=sat32(Rss.w[i]+0x08000).h[1];
}
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 1 0 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rd=vrndwh(Rss)
1 0 0 0 1 0 0 0 1 0 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rd=vrndwh(Rss):sat
[Figure: Rd=vsathub(Rss) and Rd=vsathub(Rs) saturate s16 halfwords to u8 bytes (Sat_u8); Rd=vsathb(Rss) and Rd=vsathb(Rs) saturate s16 halfwords to s8 bytes; Rd=vsatwh(Rss) and Rd=vsatwuh(Rss) saturate s32 words to halfwords. The Rs forms place 0 in the upper two bytes of Rd.]
Syntax Behavior
Rd=vsathb(Rs) Rd.b[0]=sat8(Rs.h[0]);
Rd.b[1]=sat8(Rs.h[1]);
Rd.b[2]=0;
Rd.b[3]=0;
Rd=vsathb(Rss) for (i=0;i<4;i++) {
Rd.b[i]=sat8(Rss.h[i]);
}
Rd=vsathub(Rs) Rd.b[0]=usat8(Rs.h[0]);
Rd.b[1]=usat8(Rs.h[1]);
Rd.b[2]=0;
Rd.b[3]=0;
Rd=vsathub(Rss) for (i=0;i<4;i++) {
Rd.b[i]=usat8(Rss.h[i]);
}
Rd=vsatwh(Rss) for (i=0;i<2;i++) {
Rd.h[i]=sat16(Rss.w[i]);
}
Rd=vsatwuh(Rss) for (i=0;i<2;i++) {
Rd.h[i]=usat16(Rss.w[i]);
}
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 0 0 0 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=vsathub(Rss)
1 0 0 0 1 0 0 0 0 0 0 s s s s s P P - - - - - - 0 1 0 d d d d d Rd=vsatwh(Rss)
1 0 0 0 1 0 0 0 0 0 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rd=vsatwuh(Rss)
1 0 0 0 1 0 0 0 0 0 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rd=vsathb(Rss)
1 0 0 0 1 1 0 0 1 0 - s s s s s P P - - - - - - 0 0 - d d d d d Rd=vsathb(Rs)
1 0 0 0 1 1 0 0 1 0 - s s s s s P P - - - - - - 0 1 - d d d d d Rd=vsathub(Rs)
[Figure: Rdd=vsathub(Rss) places a zero-extended u8 value in each halfword of Rdd; Rdd=vsathb(Rss) places a sign-extended (se) s8 value in each halfword of Rdd.]
Syntax Behavior
Rdd=vsathb(Rss) for (i=0;i<4;i++) {
Rdd.h[i]=sat8(Rss.h[i]);
}
Rdd=vsathub(Rss) for (i=0;i<4;i++) {
Rdd.h[i]=usat8(Rss.h[i]);
}
Rdd=vsatwh(Rss) for (i=0;i<2;i++) {
Rdd.w[i]=sat16(Rss.w[i]);
}
Rdd=vsatwuh(Rss) for (i=0;i<2;i++) {
Rdd.w[i]=usat16(Rss.w[i]);
}
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 0 0 0 s s s s s P P - - - - - - 1 0 0 d d d d d Rdd=vsathub(Rss)
1 0 0 0 0 0 0 0 0 0 0 s s s s s P P - - - - - - 1 0 1 d d d d d Rdd=vsatwuh(Rss)
1 0 0 0 0 0 0 0 0 0 0 s s s s s P P - - - - - - 1 1 0 d d d d d Rdd=vsatwh(Rss)
1 0 0 0 0 0 0 0 0 0 0 s s s s s P P - - - - - - 1 1 1 d d d d d Rdd=vsathb(Rss)
Vector shuffle
Shuffle odd halfwords (shuffoh) takes the odd halfwords from Rtt and the odd halfwords from Rss
and merges them together into vector Rdd. Shuffle even halfwords (shuffeh) performs the same
operation on every even halfword in Rss and Rtt. The same operation is available for odd and
even bytes.
[Figure: shuffoh, shuffeh, shuffob, and shuffeb. Each merges elements of Rtt and Rss into Rdd.]
Syntax Behavior
Rdd=shuffeb(Rss,Rtt) for (i=0;i<4;i++) {
Rdd.b[i*2]=Rtt.b[i*2];
Rdd.b[i*2+1]=Rss.b[i*2];
}
Rdd=shuffeh(Rss,Rtt) for (i=0;i<2;i++) {
Rdd.h[i*2]=Rtt.h[i*2];
Rdd.h[i*2+1]=Rss.h[i*2];
}
Rdd=shuffob(Rtt,Rss) for (i=0;i<4;i++) {
Rdd.b[i*2]=Rss.b[i*2+1];
Rdd.b[i*2+1]=Rtt.b[i*2+1];
}
Rdd=shuffoh(Rtt,Rss) for (i=0;i<2;i++) {
Rdd.h[i*2]=Rss.h[i*2+1];
Rdd.h[i*2+1]=Rtt.h[i*2+1];
}
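The halfword variants can be modeled in C as shown below; the byte variants follow the same pattern with 8-bit elements. The names get_h, shuffeh_model, and shuffoh_model are hypothetical, and register pairs are assumed to be uint64_t values with h[0] in the least significant halfword.

```c
#include <stdint.h>

/* Extract halfword i (0..3) from a 64-bit vector. */
static uint64_t get_h(uint64_t v, int i)
{
    return (v >> (16 * i)) & 0xffffu;
}

/* Model of Rdd=shuffeh(Rss,Rtt): interleave the even halfwords. */
uint64_t shuffeh_model(uint64_t rss, uint64_t rtt)
{
    uint64_t rdd = 0;
    for (int i = 0; i < 2; i++) {
        rdd |= get_h(rtt, i * 2) << (16 * (i * 2));       /* Rdd.h[i*2]   = Rtt.h[i*2] */
        rdd |= get_h(rss, i * 2) << (16 * (i * 2 + 1));   /* Rdd.h[i*2+1] = Rss.h[i*2] */
    }
    return rdd;
}

/* Model of Rdd=shuffoh(Rtt,Rss): interleave the odd halfwords. */
uint64_t shuffoh_model(uint64_t rtt, uint64_t rss)
{
    uint64_t rdd = 0;
    for (int i = 0; i < 2; i++) {
        rdd |= get_h(rss, i * 2 + 1) << (16 * (i * 2));     /* Rdd.h[i*2]   = Rss.h[i*2+1] */
        rdd |= get_h(rtt, i * 2 + 1) << (16 * (i * 2 + 1)); /* Rdd.h[i*2+1] = Rtt.h[i*2+1] */
    }
    return rdd;
}
```

Note that the operand order in the syntax differs between the even forms (Rss,Rtt) and the odd forms (Rtt,Rss); the sketch keeps the same parameter order as the syntax.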
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 1 0 0 - s s s s s P P - t t t t t 0 1 - d d d d d Rdd=shuffeb(Rss,Rtt)
1 1 0 0 0 0 0 1 0 0 - s s s s s P P - t t t t t 1 0 - d d d d d Rdd=shuffob(Rtt,Rss)
1 1 0 0 0 0 0 1 0 0 - s s s s s P P - t t t t t 1 1 - d d d d d Rdd=shuffeh(Rss,Rtt)
1 1 0 0 0 0 0 1 1 0 - s s s s s P P - t t t t t 0 0 0 d d d d d Rdd=shuffoh(Rtt,Rss)
[Figure: Rd=vsplatb(Rs). The low byte of Rs is replicated into every byte of Rd.]
Syntax Behavior
Rd=vsplatb(Rs) for (i=0;i<4;i++) {
Rd.b[i]=Rs.b[0];
}
Rdd=vsplatb(Rs) for (i=0;i<8;i++) {
Rdd.b[i]=Rs.b[0];
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 1 0 0 0 1 - s s s s s P P - - - - - - 1 0 - d d d d d Rdd=vsplatb(Rs)
1 0 0 0 1 1 0 0 0 1 0 s s s s s P P - - - - - - 1 1 1 d d d d d Rd=vsplatb(Rs)
[Figure: Rdd=vsplath(Rs). The low halfword of Rs is replicated into every halfword of Rdd.]
Syntax Behavior
Rdd=vsplath(Rs) for (i=0;i<4;i++) {
Rdd.h[i]=Rs.h[0];
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 1 0 0 0 1 - s s s s s P P - - - - - - 0 1 - d d d d d Rdd=vsplath(Rs)
Vector splice
Concatenate the low (8-N) bytes of vector Rtt with the low N bytes of vector Rss. This instruction
is helpful to vectorize unaligned stores.
Syntax Behavior
Rdd=vspliceb(Rss,Rtt,#u3) Rdd = Rtt << #u*8 | zxt#u*8->64(Rss);
Rdd=vspliceb(Rss,Rtt,Pu) PREDUSE_TIMING;
Rdd = Rtt << (Pu&7)*8 | zxt(Pu&7)*8->64(Rss);
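A minimal C sketch of the splice follows. The function name vspliceb_model is hypothetical, and register pairs are assumed to be uint64_t values. When the byte count is zero, the zero-width extension of Rss contributes nothing, so the result is Rtt.

```c
#include <stdint.h>

/* Model of Rdd=vspliceb(Rss,Rtt,#u3); n is the byte count (0..7). */
uint64_t vspliceb_model(uint64_t rss, uint64_t rtt, unsigned n)
{
    n &= 7u;                                   /* only three bits are significant */
    if (n == 0)
        return rtt;                            /* no bytes are taken from Rss */
    uint64_t low_mask = (UINT64_C(1) << (n * 8)) - 1;
    /* Low n bytes come from Rss; the low (8-n) bytes of Rtt sit above them. */
    return (rtt << (n * 8)) | (rss & low_mask);
}
```

With rss = 0x1122334455667788, rtt = 0xaabbccddeeff0011, and n = 3, the result is 0xddeeff0011667788.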
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 0 1 - - s s s s s P P - t t t t t i i i d d d d d Rdd=vspliceb(Rss,Rtt,#u3)
ICLASS RegType Maj s5 Parse t5 u2 d5
1 1 0 0 0 0 1 0 1 0 0 s s s s s P P - t t t t t - u u d d d d d Rdd=vspliceb(Rss,Rtt,Pu)
[Figure: Rdd=vsxtbh(Rs) and Rdd=vsxthw(Rs).]
Syntax Behavior
Rdd=vsxtbh(Rs) for (i=0;i<4;i++) {
Rdd.h[i]=Rs.b[i];
}
Rdd=vsxthw(Rs) for (i=0;i<2;i++) {
Rdd.w[i]=Rs.h[i];
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 1 0 0 0 0 - s s s s s P P - - - - - - 0 0 - d d d d d Rdd=vsxtbh(Rs)
1 0 0 0 0 1 0 0 0 0 - s s s s s P P - - - - - - 1 0 - d d d d d Rdd=vsxthw(Rs)
Vector truncate
In the vtrunehb instruction, for each halfword in a vector, take the even (lower) byte and ignore
the other byte. Pack the resulting values into destination register Rd.
The vtrunohb instruction takes each odd byte of the source vector.
The vtrunewh instruction uses two source register pairs, Rss and Rtt. Pack the even (lower)
halfwords of Rss in the upper word of Rdd, and pack the even (lower) halfwords of Rtt in the
lower word of Rdd.
The vtrunowh instruction performs the same operation as the vtrunewh instruction, but uses the
odd (upper) halfwords of the source vectors instead.
[Figure: Rd=vtrunehb(Rss), Rdd=vtrunewh(Rss,Rtt), Rd=vtrunohb(Rss), and Rdd=vtrunowh(Rss,Rtt).]
Syntax Behavior
Rd=vtrunehb(Rss) for (i=0;i<4;i++) {
Rd.b[i]=Rss.b[i*2];
}
Rd=vtrunohb(Rss) for (i=0;i<4;i++) {
Rd.b[i]=Rss.b[i*2+1];
}
Rdd=vtrunehb(Rss,Rtt) for (i=0;i<4;i++) {
Rdd.b[i]=Rtt.b[i*2];
Rdd.b[i+4]=Rss.b[i*2];
}
Rdd=vtrunewh(Rss,Rtt) Rdd.h[0]=Rtt.h[0];
Rdd.h[1]=Rtt.h[2];
Rdd.h[2]=Rss.h[0];
Rdd.h[3]=Rss.h[2];
Rdd=vtrunohb(Rss,Rtt) for (i=0;i<4;i++) {
Rdd.b[i]=Rtt.b[i*2+1];
Rdd.b[i+4]=Rss.b[i*2+1];
}
Rdd=vtrunowh(Rss,Rtt) Rdd.h[0]=Rtt.h[1];
Rdd.h[1]=Rtt.h[3];
Rdd.h[2]=Rss.h[1];
Rdd.h[3]=Rss.h[3];
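Three of the truncate forms can be sketched in C as follows; the remaining forms differ only in which element is selected. The function names are hypothetical, and vectors are assumed to be plain integers with element 0 in the least significant position.

```c
#include <stdint.h>

/* Model of Rd=vtrunehb(Rss): keep the even (lower) byte of each halfword. */
uint32_t vtrunehb_model(uint64_t rss)
{
    uint32_t rd = 0;
    for (int i = 0; i < 4; i++)
        rd |= (uint32_t)((rss >> (16 * i)) & 0xffu) << (8 * i);
    return rd;
}

/* Model of Rd=vtrunohb(Rss): keep the odd (upper) byte of each halfword. */
uint32_t vtrunohb_model(uint64_t rss)
{
    uint32_t rd = 0;
    for (int i = 0; i < 4; i++)
        rd |= (uint32_t)((rss >> (16 * i + 8)) & 0xffu) << (8 * i);
    return rd;
}

/* Model of Rdd=vtrunewh(Rss,Rtt): even halfwords, Rtt low and Rss high. */
uint64_t vtrunewh_model(uint64_t rss, uint64_t rtt)
{
    uint64_t rdd = 0;
    rdd |= (rtt & 0xffffu);                   /* Rdd.h[0] = Rtt.h[0] */
    rdd |= ((rtt >> 32) & 0xffffu) << 16;     /* Rdd.h[1] = Rtt.h[2] */
    rdd |= (rss & 0xffffu) << 32;             /* Rdd.h[2] = Rss.h[0] */
    rdd |= ((rss >> 32) & 0xffffu) << 48;     /* Rdd.h[3] = Rss.h[2] */
    return rdd;
}
```

For a source of 0x0807060504030201, the even-byte form returns 0x07050301 and the odd-byte form returns 0x08060402.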
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 1 0 0 s s s s s P P - - - - - - 0 0 0 d d d d d Rd=vtrunohb(Rss)
1 0 0 0 1 0 0 0 1 0 0 s s s s s P P - - - - - - 0 1 0 d d d d d Rd=vtrunehb(Rss)
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 0 1 1 0 - s s s s s P P - t t t t t 0 1 0 d d d d d Rdd=vtrunewh(Rss,Rtt)
1 1 0 0 0 0 0 1 1 0 - s s s s s P P - t t t t t 0 1 1 d d d d d Rdd=vtrunehb(Rss,Rtt)
1 1 0 0 0 0 0 1 1 0 - s s s s s P P - t t t t t 1 0 0 d d d d d Rdd=vtrunowh(Rss,Rtt)
1 1 0 0 0 0 0 1 1 0 - s s s s s P P - t t t t t 1 0 1 d d d d d Rdd=vtrunohb(Rss,Rtt)
[Figure: Rdd=vzxtbh(Rs) and Rdd=vzxthw(Rs).]
Syntax Behavior
Rdd=vzxtbh(Rs) for (i=0;i<4;i++) {
Rdd.h[i]=Rs.ub[i];
}
Rdd=vzxthw(Rs) for (i=0;i<2;i++) {
Rdd.w[i]=Rs.uh[i];
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 1 0 0 0 0 - s s s s s P P - - - - - - 0 1 - d d d d d Rdd=vzxtbh(Rs)
1 0 0 0 0 1 0 0 0 0 - s s s s s P P - - - - - - 1 1 - d d d d d Rdd=vzxthw(Rs)
Bounds check
Determine if Rs falls in the range defined by Rtt.
Set Rtt.w[0] to the lower bound (inclusive) and Rtt.w[1] to the upper bound (exclusive). The
comparison is unsigned.
All bits of the destination predicate are set if the value falls within the range; otherwise,
all bits are cleared.
Syntax Behavior
Pd=boundscheck(Rs,Rtt) if ("Rs & 1") {
Assembler mapped to:
"Pd=boundscheck(Rss,Rtt):raw:hi";
} else {
Assembler mapped to:
"Pd=boundscheck(Rss,Rtt):raw:lo";
}
Pd=boundscheck(Rss,Rtt):raw:hi src = Rss.uw[1];
Pd = (src.uw[0] >= Rtt.uw[0]) && (src.uw[0] < Rtt.uw[1]) ? 0xff : 0x00;
Pd=boundscheck(Rss,Rtt):raw:lo src = Rss.uw[0];
Pd = (src.uw[0] >= Rtt.uw[0]) && (src.uw[0] < Rtt.uw[1]) ? 0xff : 0x00;
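The range test in the Behavior column can be sketched in C as below. The function name boundscheck_model is hypothetical; it takes the 32-bit value to test directly rather than modeling the assembler's choice between the :raw:hi and :raw:lo forms.

```c
#include <stdint.h>

/* Model of Pd=boundscheck(Rs,Rtt): 0xff if lower <= rs < upper, else 0x00. */
uint8_t boundscheck_model(uint32_t rs, uint64_t rtt)
{
    uint32_t lower = (uint32_t)rtt;           /* Rtt.uw[0] */
    uint32_t upper = (uint32_t)(rtt >> 32);   /* Rtt.uw[1] */
    return (rs >= lower && rs < upper) ? 0xff : 0x00;
}
```

Because the comparisons are unsigned and the upper bound is excluded, a range of lower = 10 and upper = 100 accepts 10 through 99 and rejects 100.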
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d2
1 1 0 1 0 0 1 0 0 - - s s s s s P P 1 t t t t t 1 0 0 - - - d d Pd=boundscheck(Rss,Rtt):raw:lo
1 1 0 1 0 0 1 0 0 - - s s s s s P P 1 t t t t t 1 0 1 - - - d d Pd=boundscheck(Rss,Rtt):raw:hi
Compare byte
These instructions sign- or zero-extend the low 8 bits of the source registers and perform 32-bit
comparisons on the result. When there is an extended 32-bit immediate operand, the full 32
immediate bits are used for the comparison.
Syntax Behavior
Pd=cmpb.eq(Rs,#u8) Pd=Rs.ub[0] == #u ? 0xff : 0x00;
Pd=cmpb.eq(Rs,Rt) Pd=Rs.b[0] == Rt.b[0] ? 0xff : 0x00;
Pd=cmpb.gt(Rs,#s8) Pd=Rs.b[0] > #s ? 0xff : 0x00;
Pd=cmpb.gt(Rs,Rt) Pd=Rs.b[0] > Rt.b[0] ? 0xff : 0x00;
Pd=cmpb.gtu(Rs,#u7) apply_extension(#u);
Pd=Rs.ub[0] > #u.uw[0] ? 0xff : 0x00;
Pd=cmpb.gtu(Rs,Rt) Pd=Rs.ub[0] > Rt.ub[0] ? 0xff : 0x00;
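The difference between the signed and unsigned register forms can be illustrated with a short C model. The function names are hypothetical, and sxt8 is a helper introduced here to sign-extend the low byte portably.

```c
#include <stdint.h>

/* Sign-extend the low 8 bits of v to 32 bits (hypothetical helper). */
static int32_t sxt8(uint32_t v)
{
    return (int32_t)((v & 0xffu) ^ 0x80u) - 0x80;
}

/* Model of Pd=cmpb.eq(Rs,Rt): compare the low bytes for equality. */
uint8_t cmpb_eq_model(uint32_t rs, uint32_t rt)
{
    return ((rs & 0xffu) == (rt & 0xffu)) ? 0xff : 0x00;
}

/* Model of Pd=cmpb.gt(Rs,Rt): signed compare of the low bytes. */
uint8_t cmpb_gt_model(uint32_t rs, uint32_t rt)
{
    return (sxt8(rs) > sxt8(rt)) ? 0xff : 0x00;
}

/* Model of Pd=cmpb.gtu(Rs,Rt): unsigned compare of the low bytes. */
uint8_t cmpb_gtu_model(uint32_t rs, uint32_t rt)
{
    return ((rs & 0xffu) > (rt & 0xffu)) ? 0xff : 0x00;
}
```

A low byte of 0x80 compared against 0x7f shows the distinction: the signed form treats 0x80 as -128 and returns 0x00, while the unsigned form treats it as 128 and returns 0xff. The upper 24 bits of both sources are ignored.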
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d2
1 1 0 0 0 1 1 1 1 1 0 s s s s s P P - t t t t t 0 1 0 - - - d d Pd=cmpb.gt(Rs,Rt)
1 1 0 0 0 1 1 1 1 1 0 s s s s s P P - t t t t t 1 1 0 - - - d d Pd=cmpb.eq(Rs,Rt)
1 1 0 0 0 1 1 1 1 1 0 s s s s s P P - t t t t t 1 1 1 - - - d d Pd=cmpb.gtu(Rs,Rt)
ICLASS RegType s5 Parse d2
1 1 0 1 1 1 0 1 - 0 0 s s s s s P P - i i i i i i i i 0 0 - d d Pd=cmpb.eq(Rs,#u8)
1 1 0 1 1 1 0 1 - 0 1 s s s s s P P - i i i i i i i i 0 0 - d d Pd=cmpb.gt(Rs,#s8)
1 1 0 1 1 1 0 1 - 1 0 s s s s s P P - 0 i i i i i i i 0 0 - d d Pd=cmpb.gtu(Rs,#u7)
Compare half
These instructions sign- or zero-extend the low 16 bits of the source registers and perform 32-bit
comparisons on the result. When there is an extended 32-bit immediate operand, the full 32
immediate bits are used for the comparison.
Syntax Behavior
Pd=cmph.eq(Rs,#s8) apply_extension(#s);
Pd=Rs.h[0] == #s ? 0xff : 0x00;
Pd=cmph.eq(Rs,Rt) Pd=Rs.h[0] == Rt.h[0] ? 0xff : 0x00;
Pd=cmph.gt(Rs,#s8) apply_extension(#s);
Pd=Rs.h[0] > #s ? 0xff : 0x00;
Pd=cmph.gt(Rs,Rt) Pd=Rs.h[0] > Rt.h[0] ? 0xff : 0x00;
Pd=cmph.gtu(Rs,#u7) apply_extension(#u);
Pd=Rs.uh[0] > #u.uw[0] ? 0xff : 0x00;
Pd=cmph.gtu(Rs,Rt) Pd=Rs.uh[0] > Rt.uh[0] ? 0xff : 0x00;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d2
1 1 0 0 0 1 1 1 1 1 0 s s s s s P P - t t t t t 0 1 1 - - - d d Pd=cmph.eq(Rs,Rt)
1 1 0 0 0 1 1 1 1 1 0 s s s s s P P - t t t t t 1 0 0 - - - d d Pd=cmph.gt(Rs,Rt)
1 1 0 0 0 1 1 1 1 1 0 s s s s s P P - t t t t t 1 0 1 - - - d d Pd=cmph.gtu(Rs,Rt)
ICLASS RegType s5 Parse d2
1 1 0 1 1 1 0 1 - 0 0 s s s s s P P - i i i i i i i i 0 1 - d d Pd=cmph.eq(Rs,#s8)
1 1 0 1 1 1 0 1 - 0 1 s s s s s P P - i i i i i i i i 0 1 - d d Pd=cmph.gt(Rs,#s8)
1 1 0 1 1 1 0 1 - 1 0 s s s s s P P - 0 i i i i i i i 0 1 - d d Pd=cmph.gtu(Rs,#u7)
Compare doublewords
Compare two 64-bit register pairs for equality, signed greater than, or unsigned greater than. The
8-bit predicate register Pd is set to all 1s or all 0s, depending on the result.
Syntax Behavior
Pd=cmp.eq(Rss,Rtt) Pd=Rss==Rtt ? 0xff : 0x00;
Pd=cmp.gt(Rss,Rtt) Pd=Rss>Rtt ? 0xff : 0x00;
Pd=cmp.gtu(Rss,Rtt) Pd=Rss.u64>Rtt.u64 ? 0xff : 0x00;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d2
1 1 0 1 0 0 1 0 1 0 0 s s s s s P P - t t t t t 0 0 0 - - - d d Pd=cmp.eq(Rss,Rtt)
1 1 0 1 0 0 1 0 1 0 0 s s s s s P P - t t t t t 0 1 0 - - - d d Pd=cmp.gt(Rss,Rtt)
1 1 0 1 0 0 1 0 1 0 0 s s s s s P P - t t t t t 1 0 0 - - - d d Pd=cmp.gtu(Rss,Rtt)
Syntax Behavior
Pd=[!]bitsclr(Rs,#u6) Pd=(Rs&#u)[!]=0 ? 0xff : 0x00;
Pd=[!]bitsclr(Rs,Rt) Pd=(Rs&Rt)[!]=0 ? 0xff : 0x00;
Pd=[!]bitsset(Rs,Rt) Pd=(Rs&Rt)[!]=Rt ? 0xff : 0x00;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse d2
1 0 0 0 0 1 0 1 1 0 0 s s s s s P P i i i i i i - - - - - - d d Pd=bitsclr(Rs,#u6)
1 0 0 0 0 1 0 1 1 0 1 s s s s s P P i i i i i i - - - - - - d d Pd=!bitsclr(Rs,#u6)
ICLASS RegType Maj s5 Parse t5 d2
1 1 0 0 0 1 1 1 0 1 0 s s s s s P P - t t t t t - - - - - - d d Pd=bitsset(Rs,Rt)
1 1 0 0 0 1 1 1 0 1 1 s s s s s P P - t t t t t - - - - - - d d Pd=!bitsset(Rs,Rt)
1 1 0 0 0 1 1 1 1 0 0 s s s s s P P - t t t t t - - - - - - d d Pd=bitsclr(Rs,Rt)
1 1 0 0 0 1 1 1 1 0 1 s s s s s P P - t t t t t - - - - - - d d Pd=!bitsclr(Rs,Rt)
[Figure: Rdd=mask(Pt), shown with example predicate Pt[7:0] = 10101010.]
Syntax Behavior
Rdd=mask(Pt) PREDUSE_TIMING;
for (i = 0; i < 8; i++) {
Rdd.b[i]=(Pt.i?(0xff):(0x00));
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Parse t2 d5
1 0 0 0 0 1 1 0 - - - - - - - - P P - - - - t t - - - d d d d d Rdd=mask(Pt)
Syntax Behavior
Pd=tlbmatch(Rss,Rt) MASK = 0x07ffffff;
TLBLO = Rss.uw[0];
TLBHI = Rss.uw[1];
SIZE = min(6,count_leading_ones(~reverse_bits(TLBLO)));
MASK &= (0xffffffff << 2*SIZE);
Pd = TLBHI.31 && ((TLBHI & MASK) == (Rt & MASK)) ? 0xff : 0x00;
Notes
■ The predicate generated by this instruction cannot be used as a .new predicate, nor can it be
automatically ANDed with another predicate.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d2
1 1 0 1 0 0 1 0 0 - - s s s s s P P 1 t t t t t 0 1 1 - - - d d Pd=tlbmatch(Rss,Rt)
Predicate transfer
Pd=Rs transfers the eight least-significant bits of a general register to a predicate.
Rd=Ps transfers a predicate to the eight least-significant bits of a general register and zeros the
other bits.
Syntax Behavior
Pd=Rs Pd = Rs.ub[0];
Rd=Ps PREDUSE_TIMING;
Rd = zxt8->32(Ps);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse d2
1 0 0 0 0 1 0 1 0 1 0 s s s s s P P - - - - - - - - - - - - d d Pd=Rs
ICLASS RegType MajOp s2 Parse d5
1 0 0 0 1 0 0 1 - 1 - - - - s s P P - - - - - - - - - d d d d d Rd=Ps
Test bit
Extract a bit from a register. If the bit is true (1), set all the bits of the predicate register
destination to 1. If the bit is false (0), set all the bits of the predicate register destination to 0. The
bit to test can be indicated using an immediate or register value.
If a register is used to indicate the bit to test, and the value specified is out of range, the predicate
result is zero.
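The description above can be sketched in C as follows. This models the non-negated form only; the ! form returns the complement. The function names are hypothetical, and the treatment of the register shift amount as a signed 7-bit value is an assumption taken from the sxt7->32(Rt) term in the Behavior column.

```c
#include <stdint.h>

/* Model of Pd=tstbit(Rs,#u5): test the bit selected by a 5-bit immediate. */
uint8_t tstbit_imm_model(uint32_t rs, unsigned u5)
{
    return ((rs >> (u5 & 31u)) & 1u) ? 0xff : 0x00;
}

/* Model of Pd=tstbit(Rs,Rt): bit index is the low 7 bits of Rt, signed. */
uint8_t tstbit_reg_model(uint32_t rs, uint32_t rt)
{
    /* Sign-extend the low 7 bits of Rt, giving a value in -64..63. */
    int32_t index = (int32_t)((rt & 0x7fu) ^ 0x40u) - 0x40;
    if (index < 0 || index > 31)
        return 0x00;               /* out of range: the predicate result is zero */
    return ((rs >> index) & 1u) ? 0xff : 0x00;
}
```

For example, tstbit_reg_model(0xffffffff, 32) returns 0x00 even though every bit of the source is set, because bit 32 lies outside the 32-bit register.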
Syntax Behavior
Pd=[!]tstbit(Rs,#u5) Pd = (Rs & (1<<#u)) == 0 ? 0xff : 0x00;
Pd=[!]tstbit(Rs,Rt) Pd = (zxt32->64(Rs) & (sxt7->32(Rt)>0)?(zxt32->64(1)<<sxt7->32(Rt)):(zxt32->64(1)>>>sxt7->32(Rt)))== 0 ? 0xff : 0x00;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse d2
1 0 0 0 0 1 0 1 0 0 0 s s s s s P P 0 i i i i i - - - - - - d d Pd=tstbit(Rs,#u5)
1 0 0 0 0 1 0 1 0 0 1 s s s s s P P 0 i i i i i - - - - - - d d Pd=!tstbit(Rs,#u5)
ICLASS RegType Maj s5 Parse t5 d2
1 1 0 0 0 1 1 1 0 0 0 s s s s s P P - t t t t t - - - - - - d d Pd=tstbit(Rs,Rt)
1 1 0 0 0 1 1 1 0 0 1 s s s s s P P - t t t t t - - - - - - d d Pd=!tstbit(Rs,Rt)
[Figure: vcmph. Halfwords of Rss and Rtt are compared; each result sets two adjacent bits of Pd (example: Pd[7:0] = 11001100).]
Syntax Behavior
Pd=vcmph.eq(Rss,#s8) for (i = 0; i < 4; i++) {
Pd.i*2 = (Rss.h[i] == #s);
Pd.i*2+1 = (Rss.h[i] == #s);
}
Pd=vcmph.eq(Rss,Rtt) for (i = 0; i < 4; i++) {
Pd.i*2 = (Rss.h[i] == Rtt.h[i]);
Pd.i*2+1 = (Rss.h[i] == Rtt.h[i]);
}
Pd=vcmph.gt(Rss,#s8) for (i = 0; i < 4; i++) {
Pd.i*2 = (Rss.h[i] > #s);
Pd.i*2+1 = (Rss.h[i] > #s);
}
Pd=vcmph.gt(Rss,Rtt) for (i = 0; i < 4; i++) {
Pd.i*2 = (Rss.h[i] > Rtt.h[i]);
Pd.i*2+1 = (Rss.h[i] > Rtt.h[i]);
}
Pd=vcmph.gtu(Rss,#u7) for (i = 0; i < 4; i++) {
Pd.i*2 = (Rss.uh[i] > #u);
Pd.i*2+1 = (Rss.uh[i] > #u);
}
Pd=vcmph.gtu(Rss,Rtt) for (i = 0; i < 4; i++) {
Pd.i*2 = (Rss.uh[i] > Rtt.uh[i]);
Pd.i*2+1 = (Rss.uh[i] > Rtt.uh[i]);
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d2
1 1 0 1 0 0 1 0 0 - - s s s s s P P 0 t t t t t 0 1 1 - - - d d Pd=vcmph.eq(Rss,Rtt)
1 1 0 1 0 0 1 0 0 - - s s s s s P P 0 t t t t t 1 0 0 - - - d d Pd=vcmph.gt(Rss,Rtt)
1 1 0 1 0 0 1 0 0 - - s s s s s P P 0 t t t t t 1 0 1 - - - d d Pd=vcmph.gtu(Rss,Rtt)
ICLASS RegType s5 Parse d2
1 1 0 1 1 1 0 0 0 0 0 s s s s s P P - i i i i i i i i 0 1 - d d Pd=vcmph.eq(Rss,#s8)
1 1 0 1 1 1 0 0 0 0 1 s s s s s P P - i i i i i i i i 0 1 - d d Pd=vcmph.gt(Rss,#s8)
1 1 0 1 1 1 0 0 0 1 0 s s s s s P P - 0 i i i i i i i 0 1 - d d Pd=vcmph.gtu(Rss,#u7)
Syntax Behavior
Pd=!any8(vcmpb.eq(Rss,Rtt)) Pd = 0;
for (i = 0; i < 8; i++) {
if (Rss.b[i] == Rtt.b[i]) Pd = 0xff;
}
Pd = ~Pd;
Pd=any8(vcmpb.eq(Rss,Rtt)) Pd = 0;
for (i = 0; i < 8; i++) {
if (Rss.b[i] == Rtt.b[i]) Pd = 0xff;
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d2
1 1 0 1 0 0 1 0 0 - - s s s s s P P 1 t t t t t 0 0 0 - - - d d Pd=any8(vcmpb.eq(Rss,Rtt))
1 1 0 1 0 0 1 0 0 - - s s s s s P P 1 t t t t t 0 0 1 - - - d d Pd=!any8(vcmpb.eq(Rss,Rtt))
[Figure: vcmpb. Bytes of Rss and Rtt are compared; each result sets one bit of Pd (example: Pd[7:0] = 10101010).]
Syntax Behavior
Pd=vcmpb.eq(Rss,#u8) for (i = 0; i < 8; i++) {
Pd.i = (Rss.ub[i] == #u);
}
Pd=vcmpb.eq(Rss,Rtt) for (i = 0; i < 8; i++) {
Pd.i = (Rss.b[i] == Rtt.b[i]);
}
Pd=vcmpb.gt(Rss,#s8) for (i = 0; i < 8; i++) {
Pd.i = (Rss.b[i] > #s);
}
Pd=vcmpb.gt(Rss,Rtt) for (i = 0; i < 8; i++) {
Pd.i = (Rss.b[i] > Rtt.b[i]);
}
Pd=vcmpb.gtu(Rss,#u7) for (i = 0; i < 8; i++) {
Pd.i = (Rss.ub[i] > #u);
}
Pd=vcmpb.gtu(Rss,Rtt) for (i = 0; i < 8; i++) {
Pd.i = (Rss.ub[i] > Rtt.ub[i]);
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d2
1 1 0 1 0 0 1 0 0 - - s s s s s P P 0 t t t t t 1 1 0 - - - d d Pd=vcmpb.eq(Rss,Rtt)
1 1 0 1 0 0 1 0 0 - - s s s s s P P 0 t t t t t 1 1 1 - - - d d Pd=vcmpb.gtu(Rss,Rtt)
1 1 0 1 0 0 1 0 0 - - s s s s s P P 1 t t t t t 0 1 0 - - - d d Pd=vcmpb.gt(Rss,Rtt)
ICLASS RegType s5 Parse d2
1 1 0 1 1 1 0 0 0 0 0 s s s s s P P - i i i i i i i i 0 0 - d d Pd=vcmpb.eq(Rss,#u8)
1 1 0 1 1 1 0 0 0 0 1 s s s s s P P - i i i i i i i i 0 0 - d d Pd=vcmpb.gt(Rss,#s8)
1 1 0 1 1 1 0 0 0 1 0 s s s s s P P - 0 i i i i i i i 0 0 - d d Pd=vcmpb.gtu(Rss,#u7)
[Figure: vcmpw. Words of Rss and Rtt are compared (cmp); each result sets four bits of Pd (example: Pd[7:0] = 11110000).]
Syntax Behavior
Pd=vcmpw.eq(Rss,#s8) Pd[3:0] = (Rss.w[0]==#s);
Pd[7:4] = (Rss.w[1]==#s);
Pd=vcmpw.eq(Rss,Rtt) Pd[3:0] = (Rss.w[0]==Rtt.w[0]);
Pd[7:4] = (Rss.w[1]==Rtt.w[1]);
Pd=vcmpw.gt(Rss,#s8) Pd[3:0] = (Rss.w[0]>#s);
Pd[7:4] = (Rss.w[1]>#s);
Pd=vcmpw.gt(Rss,Rtt) Pd[3:0] = (Rss.w[0]>Rtt.w[0]);
Pd[7:4] = (Rss.w[1]>Rtt.w[1]);
Pd=vcmpw.gtu(Rss,#u7) Pd[3:0] = (Rss.uw[0]>#u.uw[0]);
Pd[7:4] = (Rss.uw[1]>#u.uw[0]);
Pd=vcmpw.gtu(Rss,Rtt) Pd[3:0] = (Rss.uw[0]>Rtt.uw[0]);
Pd[7:4] = (Rss.uw[1]>Rtt.uw[1]);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 MinOp d2
1 1 0 1 0 0 1 0 0 - - s s s s s P P 0 t t t t t 0 0 0 - - - d d Pd=vcmpw.eq(Rss,Rtt)
1 1 0 1 0 0 1 0 0 - - s s s s s P P 0 t t t t t 0 0 1 - - - d d Pd=vcmpw.gt(Rss,Rtt)
1 1 0 1 0 0 1 0 0 - - s s s s s P P 0 t t t t t 0 1 0 - - - d d Pd=vcmpw.gtu(Rss,Rtt)
ICLASS RegType s5 Parse d2
1 1 0 1 1 1 0 0 0 0 0 s s s s s P P - i i i i i i i i 1 0 - d d Pd=vcmpw.eq(Rss,#s8)
1 1 0 1 1 1 0 0 0 0 1 s s s s s P P - i i i i i i i i 1 0 - d d Pd=vcmpw.gt(Rss,#s8)
1 1 0 1 1 1 0 0 0 1 0 s s s s s P P - 0 i i i i i i i 1 0 - d d Pd=vcmpw.gtu(Rss,#u7)
[Figure: Rd=vitpack(Ps,Pt). Bits 7 to 0 of Ps and Pt are interleaved into Rd[7:0]; Rd[31:8] is 0.]
Syntax Behavior
Rd=vitpack(Ps,Pt) PREDUSE_TIMING;
Rd = (Ps&0x55) | (Pt&0xAA);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s2 Parse t2 d5
1 0 0 0 1 0 0 1 - 0 0 - - - s s P P - - - - t t - - - d d d d d Rd=vitpack(Ps,Pt)
Vector mux
Perform an element-wise byte selection between two vectors.
For each of the low eight bits of predicate register Pu, if the bit is set, the corresponding byte in
Rdd is set to the corresponding byte from Rss. Otherwise, set the byte in Rdd to the byte from Rtt.
Syntax Behavior
Rdd=vmux(Pu,Rss,Rtt) PREDUSE_TIMING;
for (i = 0; i < 8; i++) {
Rdd.b[i]=(Pu.i?(Rss.b[i]):(Rtt.b[i]));
}
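The byte selection can be sketched in C by expanding the predicate into a byte mask. The function name vmux_model is hypothetical, and register pairs are assumed to be uint64_t values with b[0] in the least significant byte.

```c
#include <stdint.h>

/* Model of Rdd=vmux(Pu,Rss,Rtt): per-byte select controlled by predicate bits. */
uint64_t vmux_model(uint8_t pu, uint64_t rss, uint64_t rtt)
{
    uint64_t mask = 0;
    for (int i = 0; i < 8; i++) {
        if ((pu >> i) & 1u)
            mask |= UINT64_C(0xff) << (8 * i);   /* bit i set: take byte i from Rss */
    }
    return (rss & mask) | (rtt & ~mask);
}
```

A predicate of 0x0f, for example, takes the low four bytes from Rss and the high four bytes from Rtt. Predicates produced by the vector compare instructions can be used this way to build element-wise minimum, maximum, or clamp operations.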
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType s5 Parse t5 u2 d5
1 1 0 1 0 0 0 1 - - - s s s s s P P - t t t t t - u u d d d d d Rdd=vmux(Pu,Rss,Rtt)
Syntax Behavior
Rd=mask(#u5,#U5) Rd = ((1<<#u)-1) << #U;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp Parse MinOp d5
1 0 0 0 1 1 0 1 0 I I - - - - - P P 1 i i i i i I I I d d d d d Rd=mask(#u5,#U5)
Shift by immediate
Shift the source register value right or left based on the type of instruction. In these instructions,
the shift amount is contained in an unsigned immediate (five bits for 32-bit shifts, six bits for 64-
bit shifts) and the shift instruction gives the shift direction.
Arithmetic right shifts place the sign bit of the source value in the vacated positions.
Logical right shifts place zeros in the vacated positions.
Left shifts always zero-fill the vacated bits.
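[Figure: ASR shifts Rs right, discarding the low bits and sign-extending into Rd. LSR shifts Rs right, discarding the low bits and zero-filling Rd. ASL shifts Rs left, discarding the high bits and zero-filling Rd.]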
Syntax Behavior
Rd=asl(Rs,#u5) Rd = Rs << #u;
Rd=asr(Rs,#u5) Rd = Rs >> #u;
Rd=lsr(Rs,#u5) Rd = Rs >>> #u;
Rd=rol(Rs,#u5) Rd = Rs <<R #u;
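The four 32-bit shift forms can be modeled in portable C as below. The function names are hypothetical. The arithmetic shift is written out explicitly because right-shifting a negative signed value is implementation-defined in C, and the rotate guards against a shift by 32, which is undefined in C.

```c
#include <stdint.h>

/* Model of Rd=asl(Rs,#u5): left shift, zero fill. */
uint32_t asl_model(uint32_t rs, unsigned u) { return rs << (u & 31u); }

/* Model of Rd=lsr(Rs,#u5): logical right shift, zero fill. */
uint32_t lsr_model(uint32_t rs, unsigned u) { return rs >> (u & 31u); }

/* Model of Rd=asr(Rs,#u5): arithmetic right shift, sign fill. */
uint32_t asr_model(uint32_t rs, unsigned u)
{
    u &= 31u;
    uint32_t rd = rs >> u;
    if ((rs & 0x80000000u) != 0 && u != 0)
        rd |= ~(0xffffffffu >> u);   /* copy the sign bit into the vacated bits */
    return rd;
}

/* Model of Rd=rol(Rs,#u5): rotate left. */
uint32_t rol_model(uint32_t rs, unsigned u)
{
    u &= 31u;
    return u ? (rs << u) | (rs >> (32u - u)) : rs;
}
```

Shifting 0x80000000 right by four gives 0xf8000000 with asr_model and 0x08000000 with lsr_model, which shows the difference between sign fill and zero fill.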
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 0 0 0 s s s s s P P i i i i i i 0 0 0 d d d d d Rdd=asr(Rss,#u6)
1 0 0 0 0 0 0 0 0 0 0 s s s s s P P i i i i i i 0 0 1 d d d d d Rdd=lsr(Rss,#u6)
1 0 0 0 0 0 0 0 0 0 0 s s s s s P P i i i i i i 0 1 0 d d d d d Rdd=asl(Rss,#u6)
1 0 0 0 0 0 0 0 0 0 0 s s s s s P P i i i i i i 0 1 1 d d d d d Rdd=rol(Rss,#u6)
1 0 0 0 1 1 0 0 0 0 0 s s s s s P P 0 i i i i i 0 0 0 d d d d d Rd=asr(Rs,#u5)
1 0 0 0 1 1 0 0 0 0 0 s s s s s P P 0 i i i i i 0 0 1 d d d d d Rd=lsr(Rs,#u5)
1 0 0 0 1 1 0 0 0 0 0 s s s s s P P 0 i i i i i 0 1 0 d d d d d Rd=asl(Rs,#u5)
1 0 0 0 1 1 0 0 0 0 0 s s s s s P P 0 i i i i i 0 1 1 d d d d d Rd=rol(Rs,#u5)
[Figure: The 64-bit shift value comes from Rss and the 32-bit shift value from Rs; the shift amount is an immediate (#) or Rt.]
Syntax Behavior
Rx=add(#u8,asl(Rx,#U5)) Rx=apply_extension(#u)+(Rx<<#U);
Rx=add(#u8,lsr(Rx,#U5)) Rx=apply_extension(#u)+(((unsigned int)Rx)>>#U);
Rx=sub(#u8,asl(Rx,#U5)) Rx=apply_extension(#u)-(Rx<<#U);
Rx=sub(#u8,lsr(Rx,#U5)) Rx=apply_extension(#u)-(((unsigned int)Rx)>>#U);
Rx[+-]=asl(Rs,#u5) Rx = Rx [+-] Rs << #u;
Rx[+-]=asr(Rs,#u5) Rx = Rx [+-] Rs >> #u;
Rx[+-]=lsr(Rs,#u5) Rx = Rx [+-] Rs >>> #u;
Rx[+-]=rol(Rs,#u5) Rx = Rx [+-] Rs <<R #u;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp x5
1 0 0 0 0 0 1 0 0 0 - s s s s s P P i i i i i i 0 0 0 x x x x x Rxx-=asr(Rss,#u6)
1 0 0 0 0 0 1 0 0 0 - s s s s s P P i i i i i i 0 0 1 x x x x x Rxx-=lsr(Rss,#u6)
1 0 0 0 0 0 1 0 0 0 - s s s s s P P i i i i i i 0 1 0 x x x x x Rxx-=asl(Rss,#u6)
1 0 0 0 0 0 1 0 0 0 - s s s s s P P i i i i i i 0 1 1 x x x x x Rxx-=rol(Rss,#u6)
1 0 0 0 0 0 1 0 0 0 - s s s s s P P i i i i i i 1 0 0 x x x x x Rxx+=asr(Rss,#u6)
1 0 0 0 0 0 1 0 0 0 - s s s s s P P i i i i i i 1 0 1 x x x x x Rxx+=lsr(Rss,#u6)
1 0 0 0 0 0 1 0 0 0 - s s s s s P P i i i i i i 1 1 0 x x x x x Rxx+=asl(Rss,#u6)
1 0 0 0 0 0 1 0 0 0 - s s s s s P P i i i i i i 1 1 1 x x x x x Rxx+=rol(Rss,#u6)
1 0 0 0 1 1 1 0 0 0 - s s s s s P P 0 i i i i i 0 0 0 x x x x x Rx-=asr(Rs,#u5)
1 0 0 0 1 1 1 0 0 0 - s s s s s P P 0 i i i i i 0 0 1 x x x x x Rx-=lsr(Rs,#u5)
1 0 0 0 1 1 1 0 0 0 - s s s s s P P 0 i i i i i 0 1 0 x x x x x Rx-=asl(Rs,#u5)
1 0 0 0 1 1 1 0 0 0 - s s s s s P P 0 i i i i i 0 1 1 x x x x x Rx-=rol(Rs,#u5)
1 0 0 0 1 1 1 0 0 0 - s s s s s P P 0 i i i i i 1 0 0 x x x x x Rx+=asr(Rs,#u5)
1 0 0 0 1 1 1 0 0 0 - s s s s s P P 0 i i i i i 1 0 1 x x x x x Rx+=lsr(Rs,#u5)
1 0 0 0 1 1 1 0 0 0 - s s s s s P P 0 i i i i i 1 1 0 x x x x x Rx+=asl(Rs,#u5)
1 0 0 0 1 1 1 0 0 0 - s s s s s P P 0 i i i i i 1 1 1 x x x x x Rx+=rol(Rs,#u5)
ICLASS RegType x5 Parse MajOp
1 1 0 1 1 1 1 0 i i i x x x x x P P i I I I I I i i i 0 i 1 0 - Rx=add(#u8,asl(Rx,#U5))
1 1 0 1 1 1 1 0 i i i x x x x x P P i I I I I I i i i 0 i 1 1 - Rx=sub(#u8,asl(Rx,#U5))
1 1 0 1 1 1 1 0 i i i x x x x x P P i I I I I I i i i 1 i 1 0 - Rx=add(#u8,lsr(Rx,#U5))
1 1 0 1 1 1 1 0 i i i x x x x x P P i I I I I I i i i 1 i 1 1 - Rx=sub(#u8,lsr(Rx,#U5))
Syntax Behavior
Rd=addasl(Rt,Rs,#u3) Rd = Rt + Rs << #u;
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 1 0 0 0 0 0 s s s s s P P 0 t t t t t i i i d d d d d Rd=addasl(Rt,Rs,#u3)
Figure: shift operands. 64-bit form: Rss is the 64-bit shift value and # or Rt is the shift amount. 32-bit form: Rs is the 32-bit shift value and # or Rt is the shift amount.
Syntax Behavior
Rx=and(#u8,asl(Rx,#U5)) Rx=apply_extension(#u)&(Rx<<#U);
Rx=and(#u8,lsr(Rx,#U5)) Rx=apply_extension(#u)&(((unsigned int)Rx)>>#U);
Rx=or(#u8,asl(Rx,#U5)) Rx=apply_extension(#u)|(Rx<<#U);
Rx=or(#u8,lsr(Rx,#U5)) Rx=apply_extension(#u)|(((unsigned int)Rx)>>#U);
Rx[&|]=asl(Rs,#u5) Rx = Rx [|&] Rs << #u;
Rx[&|]=asr(Rs,#u5) Rx = Rx [|&] Rs >> #u;
Rx[&|]=lsr(Rs,#u5) Rx = Rx [|&] Rs >>> #u;
Rx[&|]=rol(Rs,#u5) Rx = Rx [|&] Rs <<R #u;
Syntax Behavior
Rxx[&|]=asr(Rss,#u6) Rxx = Rxx [|&] Rss >> #u;
Rxx[&|]=lsr(Rss,#u6) Rxx = Rxx [|&] Rss >>> #u;
Rxx[&|]=rol(Rss,#u6) Rxx = Rxx [|&] Rss <<R #u;
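The behavior column uses `>>` for an arithmetic right shift, `>>>` for a logical right shift, and `<<R` for a rotate left. A minimal Python sketch of the four 32-bit shift primitives follows; the helper names are hypothetical, and the logical accumulate is then an ordinary `&` or `|` with the shifted value.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def asl32(rs: int, u: int) -> int:
    return (rs << u) & MASK32


def asr32(rs: int, u: int) -> int:
    # Python's >> on a negative int fills with the sign bit.
    return (to_signed32(rs) >> u) & MASK32


def lsr32(rs: int, u: int) -> int:
    return (rs & MASK32) >> u


def rol32(rs: int, u: int) -> int:
    rs &= MASK32
    u &= 31
    return ((rs << u) | (rs >> (32 - u))) & MASK32


# Rx |= asl(Rs,#4) with Rx = 0xFF00 and Rs = 0x12
rx = 0x0000FF00 | asl32(0x12, 4)
```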
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp x5
1 0 0 0 0 0 1 0 0 1 - s s s s s P P i i i i i i 0 0 0 x x x x x Rxx&=asr(Rss,#u6)
1 0 0 0 0 0 1 0 0 1 - s s s s s P P i i i i i i 0 0 1 x x x x x Rxx&=lsr(Rss,#u6)
1 0 0 0 0 0 1 0 0 1 - s s s s s P P i i i i i i 0 1 0 x x x x x Rxx&=asl(Rss,#u6)
1 0 0 0 0 0 1 0 0 1 - s s s s s P P i i i i i i 0 1 1 x x x x x Rxx&=rol(Rss,#u6)
1 0 0 0 0 0 1 0 0 1 - s s s s s P P i i i i i i 1 0 0 x x x x x Rxx|=asr(Rss,#u6)
1 0 0 0 0 0 1 0 0 1 - s s s s s P P i i i i i i 1 0 1 x x x x x Rxx|=lsr(Rss,#u6)
1 0 0 0 0 0 1 0 0 1 - s s s s s P P i i i i i i 1 1 0 x x x x x Rxx|=asl(Rss,#u6)
1 0 0 0 0 0 1 0 0 1 - s s s s s P P i i i i i i 1 1 1 x x x x x Rxx|=rol(Rss,#u6)
1 0 0 0 0 0 1 0 1 0 - s s s s s P P i i i i i i 0 0 1 x x x x x Rxx^=lsr(Rss,#u6)
1 0 0 0 0 0 1 0 1 0 - s s s s s P P i i i i i i 0 1 0 x x x x x Rxx^=asl(Rss,#u6)
1 0 0 0 0 0 1 0 1 0 - s s s s s P P i i i i i i 0 1 1 x x x x x Rxx^=rol(Rss,#u6)
1 0 0 0 1 1 1 0 0 1 - s s s s s P P 0 i i i i i 0 0 0 x x x x x Rx&=asr(Rs,#u5)
1 0 0 0 1 1 1 0 0 1 - s s s s s P P 0 i i i i i 0 0 1 x x x x x Rx&=lsr(Rs,#u5)
1 0 0 0 1 1 1 0 0 1 - s s s s s P P 0 i i i i i 0 1 0 x x x x x Rx&=asl(Rs,#u5)
1 0 0 0 1 1 1 0 0 1 - s s s s s P P 0 i i i i i 0 1 1 x x x x x Rx&=rol(Rs,#u5)
1 0 0 0 1 1 1 0 0 1 - s s s s s P P 0 i i i i i 1 0 0 x x x x x Rx|=asr(Rs,#u5)
1 0 0 0 1 1 1 0 0 1 - s s s s s P P 0 i i i i i 1 0 1 x x x x x Rx|=lsr(Rs,#u5)
1 0 0 0 1 1 1 0 0 1 - s s s s s P P 0 i i i i i 1 1 0 x x x x x Rx|=asl(Rs,#u5)
1 0 0 0 1 1 1 0 0 1 - s s s s s P P 0 i i i i i 1 1 1 x x x x x Rx|=rol(Rs,#u5)
1 0 0 0 1 1 1 0 1 0 - s s s s s P P 0 i i i i i 0 0 1 x x x x x Rx^=lsr(Rs,#u5)
1 0 0 0 1 1 1 0 1 0 - s s s s s P P 0 i i i i i 0 1 0 x x x x x Rx^=asl(Rs,#u5)
1 0 0 0 1 1 1 0 1 0 - s s s s s P P 0 i i i i i 0 1 1 x x x x x Rx^=rol(Rs,#u5)
ICLASS RegType x5 Parse MajOp
1 1 0 1 1 1 1 0 i i i x x x x x P P i I I I I I i i i 0 i 0 0 - Rx=and(#u8,asl(Rx,#U5))
1 1 0 1 1 1 1 0 i i i x x x x x P P i I I I I I i i i 0 i 0 1 - Rx=or(#u8,asl(Rx,#U5))
1 1 0 1 1 1 1 0 i i i x x x x x P P i I I I I I i i i 1 i 0 0 - Rx=and(#u8,lsr(Rx,#U5))
1 1 0 1 1 1 1 0 i i i x x x x x P P i I I I I I i i i 1 i 0 1 - Rx=or(#u8,lsr(Rx,#U5))
Figure: arithmetic shift right with rounding (labels: Lost, Rs, Sign-ext, +1, 32-bit add, Rd).
Syntax Behavior
Rd=asr(Rs,#u5):rnd Rd = ((Rs >> #u)+1) >> 1;
Rd=asrrnd(Rs,#u5) if ("#u5==0") {
Assembler mapped to: "Rd=Rs";
} else {
Assembler mapped to: "Rd=asr(Rs,#u5-1):rnd";
}
Rdd=asr(Rss,#u6):rnd tmp = Rss >> #u;
rnd = tmp & 1;
Rdd = (tmp >> 1) + rnd;
Rdd=asrrnd(Rss,#u6) if ("#u6==0") {
Assembler mapped to: "Rdd=Rss";
} else {
Assembler mapped to: "Rdd=asr(Rss,#u6-1):rnd";
}
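The rounding shift can be sketched in Python as follows. This is an illustration under stated assumptions, not the hardware definition: Python integers are unbounded, so the intermediate add cannot overflow, and the function names are hypothetical. The last helper shows the form used for the 64-bit variant (shift, then add the bit that was about to be shifted out), which gives the same result as adding 1 and shifting once more.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def asr_rnd(rs: int, u5: int) -> int:
    """Rd = ((Rs >> #u) + 1) >> 1, returned as a 32-bit pattern."""
    return (((to_signed32(rs) >> u5) + 1) >> 1) & MASK32


def asrrnd(rs: int, u5: int) -> int:
    """Assembler mapping: shift by #u5 with rounding of the last bit."""
    if u5 == 0:
        return rs & MASK32
    return asr_rnd(rs, u5 - 1)


def round_half_up_step(tmp: int) -> int:
    """(tmp >> 1) + (tmp & 1), the form used for the 64-bit variant."""
    return (tmp >> 1) + (tmp & 1)
```

For example, `asrrnd(6, 2)` models 6/4 = 1.5 rounded to 2, whereas a plain arithmetic shift would give 1.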
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 1 0 s s s s s P P i i i i i i 1 1 1 d d d d d Rdd=asr(Rss,#u6):rnd
1 0 0 0 1 1 0 0 0 1 0 s s s s s P P 0 i i i i i 0 0 0 d d d d d Rd=asr(Rs,#u5):rnd
Syntax Behavior
Rd=asl(Rs,#u5):sat Rd = sat32(sxt32->64(Rs) << #u);
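A small Python sketch of the saturating left shift, assuming `sat32` clamps to the signed 32-bit range and that the source is sign-extended before shifting as the behavior states. The names `sat32`, `asl_sat`, and `to_signed32` are illustrative helpers.

```python
MASK32 = 0xFFFFFFFF
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def to_signed32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def sat32(value: int) -> int:
    """Clamp value to the signed 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, value))


def asl_sat(rs: int, u5: int) -> int:
    """Rd = sat32(sxt32->64(Rs) << #u), returned as a 32-bit pattern."""
    return sat32(to_signed32(rs) << u5) & MASK32
```

With this model, shifting 0x40000000 left by 1 clamps to 0x7FFFFFFF instead of wrapping to a negative value.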
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 1 0 0 0 1 0 s s s s s P P 0 i i i i i 0 1 0 d d d d d Rd=asl(Rs,#u5):sat
Shift by register
The shift amount is the least significant seven bits of Rt, treated as a two's complement value. If
the shift amount is negative (bit 6 of Rt is set), the direction of the shift indicated in the
opcode is reversed (see Figure).
The shift is always performed on a 64-bit value. When the source register Rs is a 32-bit
register, it is first sign-extended or zero-extended to 64 bits: arithmetic shifts sign-extend the
32-bit source, whereas logical shifts zero-extend it.
The 64-bit source value is then shifted right or left based on the shift amount and the type of
instruction. Arithmetic right shifts place the sign bit of the source value in the vacated positions.
Logical right shifts place zeros in the vacated positions.
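The rules above can be sketched in Python for the 32-bit forms. This is a minimal model under two assumptions: a negative shift amount shifts by its magnitude in the opposite direction, and the destination receives the low 32 bits of the 64-bit result. The helper names `sxt` and `_shift` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sxt(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def _shift(src: int, shamt: int, opcode_is_left: bool) -> int:
    """Shift src by |shamt|; a non-positive shamt takes the reversed path."""
    go_left = opcode_is_left if shamt > 0 else not opcode_is_left
    amount = abs(shamt)
    result = src << amount if go_left else src >> amount
    return result & MASK32


def asl(rs: int, rt: int) -> int:
    return _shift(sxt(rs, 32), sxt(rt, 7), True)


def asr(rs: int, rt: int) -> int:
    return _shift(sxt(rs, 32), sxt(rt, 7), False)


def lsl(rs: int, rt: int) -> int:
    return _shift(rs & MASK32, sxt(rt, 7), True)


def lsr(rs: int, rt: int) -> int:
    return _shift(rs & MASK32, sxt(rt, 7), False)
```

For instance, `asl(0x80000000, 0x7C)` has a shift amount of -4, so it behaves as an arithmetic right shift by 4, while `lsl` with the same operands behaves as a logical right shift by 4. Only the low seven bits of Rt take part, so `asl(1, 0x104)` shifts by 4.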
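Figure: shift by register. Two right-shift panels (labels: Lost, Rs, Sign-ext, Rd and Lost, Rs, Zero-fill, Rd) and one left-shift panel (ASL w/ positive Rt, LSL w/ positive Rt, ASR w/ negative Rt, LSR w/ negative Rt; labels: Lost, Rs, Zero-fill, Rd).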
Syntax Behavior
Rd=asl(Rs,Rt) shamt=sxt7->32(Rt);
Rd = (shamt>0)?(sxt32->64(Rs)<<shamt):(sxt32->64(Rs)>>shamt);
Rd=asr(Rs,Rt) shamt=sxt7->32(Rt);
Rd = (shamt>0)?(sxt32->64(Rs)>>shamt):(sxt32->64(Rs)<<shamt);
Rd=lsl(Rs,Rt) shamt=sxt7->32(Rt);
Rd = (shamt>0)?(zxt32->64(Rs)<<shamt):(zxt32->64(Rs)>>>shamt);
Rd=lsr(Rs,Rt) shamt=sxt7->32(Rt);
Rd = (shamt>0)?(zxt32->64(Rs)>>>shamt):(zxt32->64(Rs)<<shamt);
Rdd=asl(Rss,Rt) shamt=sxt7->32(Rt);
Rdd = (shamt>0)?(Rss<<shamt):(Rss>>shamt);
Rdd=asr(Rss,Rt) shamt=sxt7->32(Rt);
Rdd = (shamt>0)?(Rss>>shamt):(Rss<<shamt);
Rdd=lsl(Rss,Rt) shamt=sxt7->32(Rt);
Rdd = (shamt>0)?(Rss<<shamt):(Rss>>>shamt);
Rdd=lsr(Rss,Rt) shamt=sxt7->32(Rt);
Rdd = (shamt>0)?(Rss>>>shamt):(Rss<<shamt);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 1 1 1 0 - s s s s s P P - t t t t t 0 0 - d d d d d Rdd=asr(Rss,Rt)
1 1 0 0 0 0 1 1 1 0 - s s s s s P P - t t t t t 0 1 - d d d d d Rdd=lsr(Rss,Rt)
1 1 0 0 0 0 1 1 1 0 - s s s s s P P - t t t t t 1 0 - d d d d d Rdd=asl(Rss,Rt)
1 1 0 0 0 0 1 1 1 0 - s s s s s P P - t t t t t 1 1 - d d d d d Rdd=lsl(Rss,Rt)
1 1 0 0 0 1 1 0 0 1 - s s s s s P P - t t t t t 0 0 - d d d d d Rd=asr(Rs,Rt)
1 1 0 0 0 1 1 0 0 1 - s s s s s P P - t t t t t 0 1 - d d d d d Rd=lsr(Rs,Rt)
1 1 0 0 0 1 1 0 0 1 - s s s s s P P - t t t t t 1 0 - d d d d d Rd=asl(Rs,Rt)
1 1 0 0 0 1 1 0 0 1 - s s s s s P P - t t t t t 1 1 - d d d d d Rd=lsl(Rs,Rt)
ICLASS RegType Maj Parse t5 Min d5
1 1 0 0 0 1 1 0 1 0 - i i i i i P P - t t t t t 1 1 i d d d d d Rd=lsl(#s6,Rt)
Figure: shift operands. 64-bit form: Rss is the 64-bit shift value and # or Rt is the shift amount. 32-bit form: Rs is the 32-bit shift value and # or Rt is the shift amount.
Syntax Behavior
Rx[+-]=asl(Rs,Rt) shamt=sxt7->32(Rt);
Rx = Rx [+-] (shamt>0)?(sxt32->64(Rs)<<shamt):(sxt32->64(Rs)>>shamt);
Rx[+-]=asr(Rs,Rt) shamt=sxt7->32(Rt);
Rx = Rx [+-] (shamt>0)?(sxt32->64(Rs)>>shamt):(sxt32->64(Rs)<<shamt);
Rx[+-]=lsl(Rs,Rt) shamt=sxt7->32(Rt);
Rx = Rx [+-] (shamt>0)?(zxt32->64(Rs)<<shamt):(zxt32->64(Rs)>>>shamt);
Rx[+-]=lsr(Rs,Rt) shamt=sxt7->32(Rt);
Rx = Rx [+-] (shamt>0)?(zxt32->64(Rs)>>>shamt):(zxt32->64(Rs)<<shamt);
Rxx[+-]=asl(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx [+-] (shamt>0)?(Rss<<shamt):(Rss>>shamt);
Rxx[+-]=asr(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx [+-] (shamt>0)?(Rss>>shamt):(Rss<<shamt);
Rxx[+-]=lsl(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx [+-] (shamt>0)?(Rss<<shamt):(Rss>>>shamt);
Rxx[+-]=lsr(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx [+-] (shamt>0)?(Rss>>>shamt):(Rss<<shamt);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min x5
1 1 0 0 1 0 1 1 1 0 0 s s s s s P P - t t t t t 0 0 - x x x x x Rxx-=asr(Rss,Rt)
1 1 0 0 1 0 1 1 1 0 0 s s s s s P P - t t t t t 0 1 - x x x x x Rxx-=lsr(Rss,Rt)
1 1 0 0 1 0 1 1 1 0 0 s s s s s P P - t t t t t 1 0 - x x x x x Rxx-=asl(Rss,Rt)
1 1 0 0 1 0 1 1 1 0 0 s s s s s P P - t t t t t 1 1 - x x x x x Rxx-=lsl(Rss,Rt)
1 1 0 0 1 0 1 1 1 1 0 s s s s s P P - t t t t t 0 0 - x x x x x Rxx+=asr(Rss,Rt)
1 1 0 0 1 0 1 1 1 1 0 s s s s s P P - t t t t t 0 1 - x x x x x Rxx+=lsr(Rss,Rt)
1 1 0 0 1 0 1 1 1 1 0 s s s s s P P - t t t t t 1 0 - x x x x x Rxx+=asl(Rss,Rt)
1 1 0 0 1 0 1 1 1 1 0 s s s s s P P - t t t t t 1 1 - x x x x x Rxx+=lsl(Rss,Rt)
1 1 0 0 1 1 0 0 1 0 - s s s s s P P - t t t t t 0 0 - x x x x x Rx-=asr(Rs,Rt)
1 1 0 0 1 1 0 0 1 0 - s s s s s P P - t t t t t 0 1 - x x x x x Rx-=lsr(Rs,Rt)
1 1 0 0 1 1 0 0 1 0 - s s s s s P P - t t t t t 1 0 - x x x x x Rx-=asl(Rs,Rt)
1 1 0 0 1 1 0 0 1 0 - s s s s s P P - t t t t t 1 1 - x x x x x Rx-=lsl(Rs,Rt)
1 1 0 0 1 1 0 0 1 1 - s s s s s P P - t t t t t 0 0 - x x x x x Rx+=asr(Rs,Rt)
1 1 0 0 1 1 0 0 1 1 - s s s s s P P - t t t t t 0 1 - x x x x x Rx+=lsr(Rs,Rt)
1 1 0 0 1 1 0 0 1 1 - s s s s s P P - t t t t t 1 0 - x x x x x Rx+=asl(Rs,Rt)
1 1 0 0 1 1 0 0 1 1 - s s s s s P P - t t t t t 1 1 - x x x x x Rx+=lsl(Rs,Rt)
Figure: shift operands. 64-bit form: Rss is the 64-bit shift value and # or Rt is the shift amount. 32-bit form: Rs is the 32-bit shift value and # or Rt is the shift amount.
Syntax Behavior
Rx[&|]=asl(Rs,Rt) shamt=sxt7->32(Rt);
Rx = Rx [|&] (shamt>0)?(sxt32->64(Rs)<<shamt):(sxt32->64(Rs)>>shamt);
Rx[&|]=asr(Rs,Rt) shamt=sxt7->32(Rt);
Rx = Rx [|&] (shamt>0)?(sxt32->64(Rs)>>shamt):(sxt32->64(Rs)<<shamt);
Rx[&|]=lsl(Rs,Rt) shamt=sxt7->32(Rt);
Rx = Rx [|&] (shamt>0)?(zxt32->64(Rs)<<shamt):(zxt32->64(Rs)>>>shamt);
Rx[&|]=lsr(Rs,Rt) shamt=sxt7->32(Rt);
Rx = Rx [|&] (shamt>0)?(zxt32->64(Rs)>>>shamt):(zxt32->64(Rs)<<shamt);
Rxx[&|]=asl(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx [|&] (shamt>0)?(Rss<<shamt):(Rss>>shamt);
Rxx[&|]=asr(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx [|&] (shamt>0)?(Rss>>shamt):(Rss<<shamt);
Rxx[&|]=lsl(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx [|&] (shamt>0)?(Rss<<shamt):(Rss>>>shamt);
Rxx[&|]=lsr(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx [|&] (shamt>0)?(Rss>>>shamt):(Rss<<shamt);
Rxx^=asl(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx ^ (shamt>0)?(Rss<<shamt):(Rss>>shamt);
Rxx^=asr(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx ^ (shamt>0)?(Rss>>shamt):(Rss<<shamt);
Rxx^=lsl(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx ^ (shamt>0)?(Rss<<shamt):(Rss>>>shamt);
Rxx^=lsr(Rss,Rt) shamt=sxt7->32(Rt);
Rxx = Rxx ^ (shamt>0)?(Rss>>>shamt):(Rss<<shamt);
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min x5
1 1 0 0 1 0 1 1 0 0 0 s s s s s P P - t t t t t 0 0 - x x x x x Rxx|=asr(Rss,Rt)
1 1 0 0 1 0 1 1 0 0 0 s s s s s P P - t t t t t 0 1 - x x x x x Rxx|=lsr(Rss,Rt)
1 1 0 0 1 0 1 1 0 0 0 s s s s s P P - t t t t t 1 0 - x x x x x Rxx|=asl(Rss,Rt)
1 1 0 0 1 0 1 1 0 0 0 s s s s s P P - t t t t t 1 1 - x x x x x Rxx|=lsl(Rss,Rt)
1 1 0 0 1 0 1 1 0 1 0 s s s s s P P - t t t t t 0 0 - x x x x x Rxx&=asr(Rss,Rt)
1 1 0 0 1 0 1 1 0 1 0 s s s s s P P - t t t t t 0 1 - x x x x x Rxx&=lsr(Rss,Rt)
1 1 0 0 1 0 1 1 0 1 0 s s s s s P P - t t t t t 1 0 - x x x x x Rxx&=asl(Rss,Rt)
1 1 0 0 1 0 1 1 0 1 0 s s s s s P P - t t t t t 1 1 - x x x x x Rxx&=lsl(Rss,Rt)
1 1 0 0 1 0 1 1 0 1 1 s s s s s P P - t t t t t 0 0 - x x x x x Rxx^=asr(Rss,Rt)
1 1 0 0 1 0 1 1 0 1 1 s s s s s P P - t t t t t 0 1 - x x x x x Rxx^=lsr(Rss,Rt)
1 1 0 0 1 0 1 1 0 1 1 s s s s s P P - t t t t t 1 0 - x x x x x Rxx^=asl(Rss,Rt)
1 1 0 0 1 0 1 1 0 1 1 s s s s s P P - t t t t t 1 1 - x x x x x Rxx^=lsl(Rss,Rt)
1 1 0 0 1 1 0 0 0 0 - s s s s s P P - t t t t t 0 0 - x x x x x Rx|=asr(Rs,Rt)
1 1 0 0 1 1 0 0 0 0 - s s s s s P P - t t t t t 0 1 - x x x x x Rx|=lsr(Rs,Rt)
1 1 0 0 1 1 0 0 0 0 - s s s s s P P - t t t t t 1 0 - x x x x x Rx|=asl(Rs,Rt)
1 1 0 0 1 1 0 0 0 0 - s s s s s P P - t t t t t 1 1 - x x x x x Rx|=lsl(Rs,Rt)
1 1 0 0 1 1 0 0 0 1 - s s s s s P P - t t t t t 0 0 - x x x x x Rx&=asr(Rs,Rt)
1 1 0 0 1 1 0 0 0 1 - s s s s s P P - t t t t t 0 1 - x x x x x Rx&=lsr(Rs,Rt)
1 1 0 0 1 1 0 0 0 1 - s s s s s P P - t t t t t 1 0 - x x x x x Rx&=asl(Rs,Rt)
1 1 0 0 1 1 0 0 0 1 - s s s s s P P - t t t t t 1 1 - x x x x x Rx&=lsl(Rs,Rt)
Notes
■ If saturation occurs during execution of this instruction (a result is clamped to either
maximum or minimum values), the OVF bit in the status register is set. OVF remains set until
explicitly cleared by a transfer to the status register.
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 1 1 0 0 0 - s s s s s P P - t t t t t 0 0 - d d d d d Rd=asr(Rs,Rt):sat
1 1 0 0 0 1 1 0 0 0 - s s s s s P P - t t t t t 1 0 - d d d d d Rd=asl(Rs,Rt):sat
Syntax Behavior
Rdd=vaslh(Rss,#u4) for (i=0;i<4;i++) {
Rdd.h[i]=(Rss.h[i]<<#u);
}
Rdd=vasrh(Rss,#u4) for (i=0;i<4;i++) {
Rdd.h[i]=(Rss.h[i]>>#u);
}
Rdd=vlsrh(Rss,#u4) for (i=0;i<4;i++) {
Rdd.h[i]=(Rss.uh[i]>>#u);
}
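The three halfword vector shifts differ only in how each 16-bit lane is interpreted before shifting. A minimal Python sketch, treating Rss as a 64-bit integer with halfword 0 in the least significant bits; the helper names `unpack_h`, `pack_h`, and `sxt16` are hypothetical.

```python
def unpack_h(rss: int) -> list:
    """Split a 64-bit value into four unsigned halfwords, lane 0 first."""
    return [(rss >> (16 * i)) & 0xFFFF for i in range(4)]


def pack_h(lanes) -> int:
    """Pack four lane values (truncated to 16 bits) into a 64-bit value."""
    out = 0
    for i, lane in enumerate(lanes):
        out |= (lane & 0xFFFF) << (16 * i)
    return out


def sxt16(h: int) -> int:
    """Interpret a 16-bit pattern as a signed value."""
    return (h ^ 0x8000) - 0x8000


def vaslh(rss: int, u4: int) -> int:
    return pack_h(h << u4 for h in unpack_h(rss))


def vasrh(rss: int, u4: int) -> int:
    # Arithmetic: shift the signed lane so the sign bit fills from the left.
    return pack_h(sxt16(h) >> u4 for h in unpack_h(rss))


def vlsrh(rss: int, u4: int) -> int:
    # Logical: shift the unsigned lane so zeros fill from the left.
    return pack_h(h >> u4 for h in unpack_h(rss))
```

Bits shifted out of one lane never enter the neighboring lane, which is the main difference from a plain 64-bit shift.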
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 1 0 0 s s s s s P P 0 0 i i i i 0 0 0 d d d d d Rdd=vasrh(Rss,#u4)
1 0 0 0 0 0 0 0 1 0 0 s s s s s P P 0 0 i i i i 0 0 1 d d d d d Rdd=vlsrh(Rss,#u4)
1 0 0 0 0 0 0 0 1 0 0 s s s s s P P 0 0 i i i i 0 1 0 d d d d d Rdd=vaslh(Rss,#u4)
Syntax Behavior
Rdd=vasrh(Rss,#u4):raw for (i=0;i<4;i++) {
Rdd.h[i]=( ((Rss.h[i] >> #u)+1)>>1 );
}
Rdd=vasrh(Rss,#u4):rnd if ("#u4==0") {
Assembler mapped to: "Rdd=Rss";
} else {
Assembler mapped to: "Rdd=vasrh(Rss,#u4-1):raw";
}
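The `:raw` form and the `:rnd` assembler mapping can be sketched per lane in Python. This is an illustrative model only: lanes are computed with unbounded Python integers and then truncated to 16 bits, and the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def unpack_h(rss: int) -> list:
    """Split a 64-bit value into four signed halfwords, lane 0 first."""
    lanes = [(rss >> (16 * i)) & 0xFFFF for i in range(4)]
    return [(h ^ 0x8000) - 0x8000 for h in lanes]


def pack_h(lanes) -> int:
    """Pack four lane values (truncated to 16 bits) into a 64-bit value."""
    out = 0
    for i, lane in enumerate(lanes):
        out |= (lane & 0xFFFF) << (16 * i)
    return out


def vasrh_raw(rss: int, u4: int) -> int:
    """Rdd.h[i] = ((Rss.h[i] >> #u) + 1) >> 1"""
    return pack_h(((h >> u4) + 1) >> 1 for h in unpack_h(rss))


def vasrh_rnd(rss: int, u4: int) -> int:
    """Assembler mapping: #u4 == 0 copies Rss, otherwise :raw with #u4-1."""
    if u4 == 0:
        return rss & MASK64
    return vasrh_raw(rss, u4 - 1)
```

With lanes 7, -6, 5, 6 and a shift of 1, the rounded results are 4, -3, 3, 3, that is, each lane divided by 2 and rounded half up.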
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 0 0 1 s s s s s P P 0 0 i i i i 0 0 0 d d d d d Rdd=vasrh(Rss,#u4):raw
Syntax Behavior
Rd=vasrhub(Rss,#u4):raw for (i=0;i<4;i++) {
Rd.b[i]=usat8(((Rss.h[i] >> #u )+1)>>1);
}
Rd=vasrhub(Rss,#u4):rnd:sat if ("#u4==0") {
Assembler mapped to: "Rd=vsathub(Rss)";
} else {
Assembler mapped to: "Rd=vasrhub(Rss,#u4-1):raw";
}
Rd=vasrhub(Rss,#u4):sat for (i=0;i<4;i++) {
Rd.b[i]=usat8(Rss.h[i] >> #u);
}
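A Python sketch of the halfword-to-byte shift with unsigned saturation follows. It assumes `usat8` clamps to the range 0 to 255 and that `vsathub` (used by the `#u4==0` mapping) saturates each halfword to an unsigned byte without shifting; both are assumptions here, since neither is defined in this section, and all function names are illustrative.

```python
def unpack_h(rss: int) -> list:
    """Split a 64-bit value into four signed halfwords, lane 0 first."""
    lanes = [(rss >> (16 * i)) & 0xFFFF for i in range(4)]
    return [(h ^ 0x8000) - 0x8000 for h in lanes]


def usat8(value: int) -> int:
    """Clamp to the unsigned 8-bit range (assumed meaning of usat8)."""
    return max(0, min(255, value))


def pack_b(byte_lanes) -> int:
    """Pack four bytes into a 32-bit value, byte 0 in the low bits."""
    out = 0
    for i, b in enumerate(byte_lanes):
        out |= (b & 0xFF) << (8 * i)
    return out


def vasrhub_sat(rss: int, u4: int) -> int:
    return pack_b(usat8(h >> u4) for h in unpack_h(rss))


def vasrhub_raw(rss: int, u4: int) -> int:
    return pack_b(usat8(((h >> u4) + 1) >> 1) for h in unpack_h(rss))


def vasrhub_rnd_sat(rss: int, u4: int) -> int:
    if u4 == 0:
        return vasrhub_sat(rss, 0)  # stands in for vsathub (assumption)
    return vasrhub_raw(rss, u4 - 1)
```

Negative halfwords clamp to 0 and values above 255 clamp to 0xFF, so the four results always fit in one 32-bit register.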
Notes
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 0 1 1 s s s s s P P 0 0 i i i i 1 0 0 d d d d d Rd=vasrhub(Rss,#u4):raw
1 0 0 0 1 0 0 0 0 1 1 s s s s s P P 0 0 i i i i 1 0 1 d d d d d Rd=vasrhub(Rss,#u4):sat
Syntax Behavior
Rdd=vaslh(Rss,Rt) for (i=0;i<4;i++) {
Rdd.h[i]=(sxt7->32(Rt)>0)?(sxt16->64(Rss.h[i])<<sxt7->32(Rt)):(sxt16->64(Rss.h[i])>>sxt7->32(Rt));
}
Rdd=vasrh(Rss,Rt) for (i=0;i<4;i++) {
Rdd.h[i]=(sxt7->32(Rt)>0)?(sxt16->64(Rss.h[i])>>sxt7->32(Rt)):(sxt16->64(Rss.h[i])<<sxt7->32(Rt));
}
Rdd=vlslh(Rss,Rt) for (i=0;i<4;i++) {
Rdd.h[i]=(sxt7->32(Rt)>0)?(zxt16->64(Rss.uh[i])<<sxt7->32(Rt)):(zxt16->64(Rss.uh[i])>>>sxt7->32(Rt));
}
Rdd=vlsrh(Rss,Rt) for (i=0;i<4;i++) {
Rdd.h[i]=(sxt7->32(Rt)>0)?(zxt16->64(Rss.uh[i])>>>sxt7->32(Rt)):(zxt16->64(Rss.uh[i])<<sxt7->32(Rt));
}
Notes
■ If the number of bits to shift is greater than the width of the vector element, the result is
either all sign bits (for arithmetic right shifts) or all zeros (for logical and left shifts).
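The over-width case in the note falls out naturally when each lane is extended before shifting and truncated afterwards. The Python sketch below models the four halfword shifts by register under the assumption that a negative shift amount shifts by its magnitude in the opposite direction; `_vshift_h` and `sxt` are hypothetical helpers.

```python
def sxt(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def _vshift_h(rss: int, rt: int, signed_src: bool, opcode_is_left: bool) -> int:
    shamt = sxt(rt, 7)  # only the low seven bits of Rt are used
    go_left = opcode_is_left if shamt > 0 else not opcode_is_left
    amount = abs(shamt)
    out = 0
    for i in range(4):
        h = (rss >> (16 * i)) & 0xFFFF
        src = sxt(h, 16) if signed_src else h
        res = src << amount if go_left else src >> amount
        out |= (res & 0xFFFF) << (16 * i)
    return out


def vaslh(rss: int, rt: int) -> int:
    return _vshift_h(rss, rt, True, True)


def vasrh(rss: int, rt: int) -> int:
    return _vshift_h(rss, rt, True, False)


def vlslh(rss: int, rt: int) -> int:
    return _vshift_h(rss, rt, False, True)


def vlsrh(rss: int, rt: int) -> int:
    return _vshift_h(rss, rt, False, False)
```

Shifting by 20, which exceeds the 16-bit lane width, leaves only sign bits in each lane for `vasrh` and zeros for the other three operations.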
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 1 1 0 1 - s s s s s P P - t t t t t 0 0 - d d d d d Rdd=vasrh(Rss,Rt)
1 1 0 0 0 0 1 1 0 1 - s s s s s P P - t t t t t 0 1 - d d d d d Rdd=vlsrh(Rss,Rt)
1 1 0 0 0 0 1 1 0 1 - s s s s s P P - t t t t t 1 0 - d d d d d Rdd=vaslh(Rss,Rt)
1 1 0 0 0 0 1 1 0 1 - s s s s s P P - t t t t t 1 1 - d d d d d Rdd=vlslh(Rss,Rt)
Syntax Behavior
Rdd=vaslw(Rss,#u5) for (i=0;i<2;i++) {
Rdd.w[i]=(Rss.w[i]<<#u);
}
Rdd=vasrw(Rss,#u5) for (i=0;i<2;i++) {
Rdd.w[i]=(Rss.w[i]>>#u);
}
Rdd=vlsrw(Rss,#u5) for (i=0;i<2;i++) {
Rdd.w[i]=(Rss.uw[i]>>#u);
}
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 0 0 0 0 0 1 0 s s s s s P P 0 i i i i i 0 0 0 d d d d d Rdd=vasrw(Rss,#u5)
1 0 0 0 0 0 0 0 0 1 0 s s s s s P P 0 i i i i i 0 0 1 d d d d d Rdd=vlsrw(Rss,#u5)
1 0 0 0 0 0 0 0 0 1 0 s s s s s P P 0 i i i i i 0 1 0 d d d d d Rdd=vaslw(Rss,#u5)
Syntax Behavior
Rdd=vaslw(Rss,Rt) for (i=0;i<2;i++) {
Rdd.w[i]=(sxt7->32(Rt)>0)?(sxt32->64(Rss.w[i])<<sxt7->32(Rt)):(sxt32->64(Rss.w[i])>>sxt7->32(Rt));
}
Rdd=vasrw(Rss,Rt) for (i=0;i<2;i++) {
Rdd.w[i]=(sxt7->32(Rt)>0)?(sxt32->64(Rss.w[i])>>sxt7->32(Rt)):(sxt32->64(Rss.w[i])<<sxt7->32(Rt));
}
Rdd=vlslw(Rss,Rt) for (i=0;i<2;i++) {
Rdd.w[i]=(sxt7->32(Rt)>0)?(zxt32->64(Rss.uw[i])<<sxt7->32(Rt)):(zxt32->64(Rss.uw[i])>>>sxt7->32(Rt));
}
Rdd=vlsrw(Rss,Rt) for (i=0;i<2;i++) {
Rdd.w[i]=(sxt7->32(Rt)>0)?(zxt32->64(Rss.uw[i])>>>sxt7->32(Rt)):(zxt32->64(Rss.uw[i])<<sxt7->32(Rt));
}
Notes
■ If the number of bits to shift is greater than the width of the vector element, the result is
either all sign bits (for arithmetic right shifts) or all zeros (for logical and left shifts).
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType Maj s5 Parse t5 Min d5
1 1 0 0 0 0 1 1 0 0 - s s s s s P P - t t t t t 0 0 - d d d d d Rdd=vasrw(Rss,Rt)
1 1 0 0 0 0 1 1 0 0 - s s s s s P P - t t t t t 0 1 - d d d d d Rdd=vlsrw(Rss,Rt)
1 1 0 0 0 0 1 1 0 0 - s s s s s P P - t t t t t 1 0 - d d d d d Rdd=vaslw(Rss,Rt)
1 1 0 0 0 0 1 1 0 0 - s s s s s P P - t t t t t 1 1 - d d d d d Rdd=vlslw(Rss,Rt)
Syntax Behavior
Rd=vasrw(Rss,#u5) for (i=0;i<2;i++) {
Rd.h[i]=(Rss.w[i]>>#u).h[0];
}
Rd=vasrw(Rss,Rt) for (i=0;i<2;i++) {
Rd.h[i]=(sxt7->32(Rt)>0)?(sxt32->64(Rss.w[i])>>sxt7->32(Rt)):(sxt32->64(Rss.w[i])<<sxt7->32(Rt)).h[0];
}
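The immediate form of this shift-and-truncate operation can be sketched in Python as follows. Each 32-bit word is shifted arithmetically and only the low halfword of the result is kept, so a shift of 16 extracts the high halfword of each word. The function name `vasrw_pack` is illustrative; the register form is not modeled here.

```python
def sxt32(w: int) -> int:
    """Interpret a 32-bit pattern as a signed value."""
    return ((w & 0xFFFFFFFF) ^ 0x80000000) - 0x80000000


def vasrw_pack(rss: int, u5: int) -> int:
    """Rd.h[i] = (Rss.w[i] >> #u).h[0] for i = 0, 1."""
    rd = 0
    for i in range(2):
        word = sxt32(rss >> (32 * i))
        low_half = (word >> u5) & 0xFFFF  # keep halfword 0 of the result
        rd |= low_half << (16 * i)
    return rd
```

For example, with Rss = 0x123456789ABCDEF0 a shift of 16 yields 0x12349ABC, and a shift of 0 yields 0x5678DEF0.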
Intrinsics
Encoding
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
ICLASS RegType MajOp s5 Parse MinOp d5
1 0 0 0 1 0 0 0 1 1 0 s s s s s P P 0 i i i i i 0 1 0 d d d d d Rd=vasrw(Rss,#u5)
ICLASS RegType s5 Parse t5 Min d5
1 1 0 0 0 1 0 1 - - - s s s s s P P - t t t t t 0 1 0 d d d d d Rd=vasrw(Rss,Rt)
add
if ([!]Pu[.new]) Rd=add(Rs,#s8) 178
if ([!]Pu[.new]) Rd=add(Rs,Rt) 178
Rd=add(#u6,mpyi(Rs,#U6)) 484
Rd=add(#u6,mpyi(Rs,Rt)) 484
Rd=add(Rs,#s16) 156
Rd=add(Rs,add(Ru,#s6)) 335
Rd=add(Rs,Rt) 156
Rd=add(Rs,Rt):sat 156
Rd=add(Rs,Rt):sat:deprecated 337
Rd=add(Rt.[HL],Rs.[HL])[:sat]:<<16 339
Rd=add(Rt.L,Rs.[HL])[:sat] 339
Rd=add(Ru,mpyi(#u6:2,Rs)) 484
Rd=add(Ru,mpyi(Rs,#u6)) 484
Rdd=add(Rs,Rtt) 337
Rdd=add(Rss,Rtt,Px):carry 341
Rdd=add(Rss,Rtt) 337
Rdd=add(Rss,Rtt):raw:hi 337
Rdd=add(Rss,Rtt):raw:lo 337
Rdd=add(Rss,Rtt):sat 337
Rx+=add(Rs,#s8) 335
Rx+=add(Rs,Rt) 335
Rx-=add(Rs,#s8) 335
Rx-=add(Rs,Rt) 335
Ry=add(Ru,mpyi(Ry,Rs)) 485
addasl
Rd=addasl(Rt,Rs,#u3) 594
all8
Pd=all8(Ps) 197
allocframe
allocframe(#u11:3) 312
allocframe(Rx,#u11:3):raw 312
and
if ([!]Pu[.new]) Rd=and(Rs,Rt) 183
Pd=and(Ps,and(Pt,[!]Pu)) 203
Pd=and(Pt,[!]Ps) 203
Rd=and(Rs,#s10) 158
Rd=and(Rs,Rt) 158
Rd=and(Rt,~Rs) 158
Rdd=and(Rss,Rtt) 343
Rdd=and(Rtt,~Rss) 343
Rx[&|^]=and(Rs,~Rt) 346
Rx[&|^]=and(Rs,Rt) 346
Rx|=and(Rs,#s10) 346
any8
Pd=any8(Ps) 197
asl
Rd=asl(Rs,#u5) 589
Rd=asl(Rs,#u5):sat 600
Rd=asl(Rs,Rt) 601
Rd=asl(Rs,Rt):sat 610
Rdd=asl(Rss,#u6) 589
Rdd=asl(Rss,Rt) 602
Rx[&|]=asl(Rs,#u5) 595
Rx[&|]=asl(Rs,Rt) 607
Rx[+-]=asl(Rs,#u5) 591
Rx[+-]=asl(Rs,Rt) 604
Rx^=asl(Rs,#u5) 595
Rx=add(#u8,asl(Rx,#U5)) 591
Rx=and(#u8,asl(Rx,#U5)) 595
Rx=or(#u8,asl(Rx,#U5)) 595
Rx=sub(#u8,asl(Rx,#U5)) 591
Rxx[&|]=asl(Rss,#u6) 595
Rxx[&|]=asl(Rss,Rt) 608
Rxx[+-]=asl(Rss,#u6) 591
Rxx[+-]=asl(Rss,Rt) 604
Rxx^=asl(Rss,#u6) 596
Rxx^=asl(Rss,Rt) 608
aslh
if ([!]Pu[.new]) Rd=aslh(Rs) 180
Rd=aslh(Rs) 176
asr
Rd=asr(Rs,#u5) 589
Rd=asr(Rs,#u5):rnd 598
Rd=asr(Rs,Rt) 601
Rd=asr(Rs,Rt):sat 610
Rdd=asr(Rss,#u6) 589
Rdd=asr(Rss,#u6):rnd 598
Rdd=asr(Rss,Rt) 602
Rx[&|]=asr(Rs,#u5) 595
Rx[&|]=asr(Rs,Rt) 607
Rx[+-]=asr(Rs,#u5) 591
Rx[+-]=asr(Rs,Rt) 604
Rxx[&|]=asr(Rss,#u6) 596
Rxx[&|]=asr(Rss,Rt) 608
Rxx[+-]=asr(Rss,#u6) 591
Rxx[+-]=asr(Rss,Rt) 605
Rxx^=asr(Rss,Rt) 608
asrh
if ([!]Pu[.new]) Rd=asrh(Rs) 180
Rd=asrh(Rs) 176
asrrnd
Rd=asrrnd(Rs,#u5) 598
Rdd=asrrnd(Rss,#u6) 598
B
barrier
barrier 317
bitsclr
Pd=[!]bitsclr(Rs,#u6) 574
Pd=[!]bitsclr(Rs,Rt) 574
bitsplit
Rdd=bitsplit(Rs,#u5) 424
Rdd=bitsplit(Rs,Rt) 424
bitsset
Pd=[!]bitsset(Rs,Rt) 574
boundscheck
Pd=boundscheck(Rs,Rtt) 567
Pd=boundscheck(Rss,Rtt):raw:hi 567
Pd=boundscheck(Rss,Rtt):raw:lo 567
brev
Rd=brev(Rs) 421
Rdd=brev(Rss) 421
brkpt
brkpt 318
C
call
call #r22:2 211
if ([!]Pu) call #r15:2 211
callr
callr Rs 206
if ([!]Pu) callr Rs 206
callrh
callrh Rs 206, 207
cl0
Rd=cl0(Rs) 409
Rd=cl0(Rss) 409
cl1
Rd=cl1(Rs) 409
Rd=cl1(Rss) 409
clb
Rd=add(clb(Rs),#s6) 409
Rd=add(clb(Rss),#s6) 409
Rd=clb(Rs) 409
Rd=clb(Rss) 409
clip
Rd=clip(Rs,#u5) 342
clrbit
memb(Rs+#u6:0)=clrbit(#U5) 268
memh(Rs+#u6:1)=clrbit(#U5) 270
memw(Rs+#u6:2)=clrbit(#U5) 271
Rd=clrbit(Rs,#u5) 422
Rd=clrbit(Rs,Rt) 422
cmp.eq
if ([!]cmp.eq(Ns.new,#-1)) jump:<hint> #r9:2 272
if ([!]cmp.eq(Ns.new,#U5)) jump:<hint> #r9:2 272
if ([!]cmp.eq(Ns.new,Rt)) jump:<hint> #r9:2 272
p[01]=cmp.eq(Rs,#-1) 213
p[01]=cmp.eq(Rs,#U5) 213
p[01]=cmp.eq(Rs,Rt) 213
Pd=[!]cmp.eq(Rs,#s10) 191
Pd=[!]cmp.eq(Rs,Rt) 191
Pd=cmp.eq(Rss,Rtt) 573
Rd=[!]cmp.eq(Rs,#s8) 193
Rd=[!]cmp.eq(Rs,Rt) 193
cmp.ge
Pd=cmp.ge(Rs,#s8) 191
cmp.geu
Pd=cmp.geu(Rs,#u8) 191
cmp.gt
if ([!]cmp.gt(Ns.new,#-1)) jump:<hint> #r9:2 272
if ([!]cmp.gt(Ns.new,#U5)) jump:<hint> #r9:2 272
if ([!]cmp.gt(Ns.new,Rt)) jump:<hint> #r9:2 272
if ([!]cmp.gt(Rt,Ns.new)) jump:<hint> #r9:2 273
p[01]=cmp.gt(Rs,#-1) 213
p[01]=cmp.gt(Rs,#U5) 213
p[01]=cmp.gt(Rs,Rt) 213
Pd=[!]cmp.gt(Rs,#s10) 191
Pd=[!]cmp.gt(Rs,Rt) 191
Pd=cmp.gt(Rss,Rtt) 573
cmp.gtu
if ([!]cmp.gtu(Ns.new,#U5)) jump:<hint> #r9:2 273
if ([!]cmp.gtu(Ns.new,Rt)) jump:<hint> #r9:2 273
if ([!]cmp.gtu(Rt,Ns.new)) jump:<hint> #r9:2 273
p[01]=cmp.gtu(Rs,#U5) 214
p[01]=cmp.gtu(Rs,Rt) 214
Pd=[!]cmp.gtu(Rs,#u9) 191
Pd=[!]cmp.gtu(Rs,Rt) 191
Pd=cmp.gtu(Rss,Rtt) 573
cmp.lt
Pd=cmp.lt(Rs,Rt) 191
cmp.ltu
Pd=cmp.ltu(Rs,Rt) 191
cmpb.eq
Pd=cmpb.eq(Rs,#u8) 569
Pd=cmpb.eq(Rs,Rt) 569
cmpb.gt
Pd=cmpb.gt(Rs,#s8) 569
Pd=cmpb.gt(Rs,Rt) 569
cmpb.gtu
Pd=cmpb.gtu(Rs,#u7) 569
Pd=cmpb.gtu(Rs,Rt) 569
cmph.eq
Pd=cmph.eq(Rs,#s8) 571
Pd=cmph.eq(Rs,Rt) 571
cmph.gt
Pd=cmph.gt(Rs,#s8) 571
Pd=cmph.gt(Rs,Rt) 571
cmph.gtu
Pd=cmph.gtu(Rs,#u7) 571
Pd=cmph.gtu(Rs,Rt) 571
cmpy
Rd=cmpy(Rs,Rt)[:<<1]:rnd:sat 439
Rd=cmpy(Rs,Rt*)[:<<1]:rnd:sat 439
Rdd=cmpy(Rs,Rt)[:<<1]:sat 433
Rdd=cmpy(Rs,Rt*)[:<<1]:sat 433
Rxx+=cmpy(Rs,Rt)[:<<1]:sat 434
Rxx+=cmpy(Rs,Rt*)[:<<1]:sat 434
Rxx-=cmpy(Rs,Rt)[:<<1]:sat 434
Rxx-=cmpy(Rs,Rt*)[:<<1]:sat 434
cmpyi
Rdd=cmpyi(Rs,Rt) 437
Rxx+=cmpyi(Rs,Rt) 437
cmpyiw
Rd=cmpyiw(Rss,Rtt):<<1:rnd:sat 443
Rd=cmpyiw(Rss,Rtt):<<1:sat 443
Rd=cmpyiw(Rss,Rtt*):<<1:rnd:sat 443
Rd=cmpyiw(Rss,Rtt*):<<1:sat 443
Rdd=cmpyiw(Rss,Rtt) 444
Rdd=cmpyiw(Rss,Rtt*) 444
Rxx+=cmpyiw(Rss,Rtt) 444
Rxx+=cmpyiw(Rss,Rtt*) 444
cmpyiwh
Rd=cmpyiwh(Rss,Rt):<<1:rnd:sat 441
Rd=cmpyiwh(Rss,Rt*):<<1:rnd:sat 441
cmpyr
Rdd=cmpyr(Rs,Rt) 437
Rxx+=cmpyr(Rs,Rt) 437
cmpyrw
Rd=cmpyrw(Rss,Rtt):<<1:rnd:sat 443
Rd=cmpyrw(Rss,Rtt):<<1:sat 443
Rd=cmpyrw(Rss,Rtt*):<<1:rnd:sat 444
Rd=cmpyrw(Rss,Rtt*):<<1:sat 444
Rdd=cmpyrw(Rss,Rtt) 444
Rdd=cmpyrw(Rss,Rtt*) 444
Rxx+=cmpyrw(Rss,Rtt) 444
Rxx+=cmpyrw(Rss,Rtt*) 444
cmpyrwh
Rd=cmpyrwh(Rss,Rt):<<1:rnd:sat 441
Rd=cmpyrwh(Rss,Rt*):<<1:rnd:sat 441
combine
if ([!]Pu[.new]) Rdd=combine(Rs,Rt) 182
Rd=combine(Rt.[HL],Rs.[HL]) 172
Rdd=combine(#s8,#S8) 172
Rdd=combine(#s8,#U6) 172
Rdd=combine(#s8,Rs) 172
Rdd=combine(Rs,#s8) 172
Rdd=combine(Rs,Rt) 173
convert_d2df
Rdd=convert_d2df(Rss) 467
convert_d2sf
Rd=convert_d2sf(Rss) 467
convert_df2d
Rdd=convert_df2d(Rss) 469
Rdd=convert_df2d(Rss):chop 469
convert_df2sf
Rd=convert_df2sf(Rss) 466
convert_df2ud
Rdd=convert_df2ud(Rss) 469
Rdd=convert_df2ud(Rss):chop 469
convert_df2uw
Rd=convert_df2uw(Rss) 469
Rd=convert_df2uw(Rss):chop 469
convert_df2w
Rd=convert_df2w(Rss) 469
Rd=convert_df2w(Rss):chop 469
convert_sf2d
Rdd=convert_sf2d(Rs) 469
Rdd=convert_sf2d(Rs):chop 469
convert_sf2df
Rdd=convert_sf2df(Rs) 466
convert_sf2ud
Rdd=convert_sf2ud(Rs) 469
Rdd=convert_sf2ud(Rs):chop 469
convert_sf2uw
Rd=convert_sf2uw(Rs) 469
Rd=convert_sf2uw(Rs):chop 469
convert_sf2w
Rd=convert_sf2w(Rs) 469
Rd=convert_sf2w(Rs):chop 469
convert_ud2df
Rdd=convert_ud2df(Rss) 467
convert_ud2sf
Rd=convert_ud2sf(Rss) 467
convert_uw2df
Rdd=convert_uw2df(Rs) 467
convert_uw2sf
Rd=convert_uw2sf(Rs) 467
convert_w2df
Rdd=convert_w2df(Rs) 467
convert_w2sf
Rd=convert_w2sf(Rs) 467
cround
Rd=cround(Rs,#u5) 355
Rd=cround(Rs,Rt) 355
Rdd=cround(Rss,#u6) 355
Rdd=cround(Rss,Rt) 356
ct0
Rd=ct0(Rs) 412
Rd=ct0(Rss) 412
ct1
Rd=ct1(Rs) 412
Rd=ct1(Rss) 412
D
dccleana
dccleana(Rs) 320
dccleaninva
dccleaninva(Rs) 320
dcfetch
dcfetch(Rs) 319
dcfetch(Rs+#u11:3) 319
dcinva
dcinva(Rs) 320
dczeroa
dczeroa(Rs) 316
dealloc_return
dealloc_return 258
if ([!]Pv.new) Rdd=dealloc_return(Rs):nt:raw 258
if ([!]Pv.new) Rdd=dealloc_return(Rs):t:raw 258
if ([!]Pv) dealloc_return 258
if ([!]Pv) Rdd=dealloc_return(Rs):raw 258
nt
if ([!]Pv.new) dealloc_return:nt 258
Rdd=dealloc_return(Rs):raw 258
t
if ([!]Pv.new) dealloc_return:t 258
deallocframe
deallocframe 256
Rdd=deallocframe(Rs):raw 256
decbin
Rdd=decbin(Rss,Rtt) 540
deinterleave
Rdd=deinterleave(Rss) 418
dfadd
Rdd=dfadd(Rss,Rtt) 461
dfclass
Pd=dfclass(Rss,#u5) 462
dfcmp.eq
Pd=dfcmp.eq(Rss,Rtt) 464
dfcmp.ge
Pd=dfcmp.ge(Rss,Rtt) 464
dfcmp.gt
Pd=dfcmp.gt(Rss,Rtt) 464
dfcmp.uo
Pd=dfcmp.uo(Rss,Rtt) 464
dfmake
Rdd=dfmake(#u10):neg 478
Rdd=dfmake(#u10):pos 478
dfmax
Rdd=dfmax(Rss,Rtt) 479
dfmin
Rdd=dfmin(Rss,Rtt) 480
dfmpyfix
Rdd=dfmpyfix(Rss,Rtt) 481
dfmpyhh
Rxx+=dfmpyhh(Rss,Rtt) 473
dfmpylh
Rxx+=dfmpylh(Rss,Rtt) 473
dfmpyll
Rdd=dfmpyll(Rss,Rtt) 481
dfsub
Rdd=dfsub(Rss,Rtt) 483
diag
diag(Rs) 322
diag0
diag0(Rss,Rtt) 322
diag1
diag1(Rss,Rtt) 322
dmsyncht
Rd=dmsyncht 329
E
endloop0
endloop0 194
endloop01
endloop01 194
endloop1
endloop1 194
extract
Rd=extract(Rs,#u5,#U5) 413
Rd=extract(Rs,Rtt) 413
Rdd=extract(Rss,#u6,#U6) 413
Rdd=extract(Rss,Rtt) 414
extractu
Rd=extractu(Rs,#u5,#U5) 413
Rd=extractu(Rs,Rtt) 413
Rdd=extractu(Rss,#u6,#U6) 414
Rdd=extractu(Rss,Rtt) 414
F
fastcorner9
Pd=[!]fastcorner9(Ps,Pt) 196
H
hintjr
hintjr(Rs) 208
I
icinva
icinva(Rs) 323
if ([!]p[01].new) jump:<hint> #r9:2 213, 214
insert
Rx=insert(Rs,#u5,#U5) 416
Rx=insert(Rs,Rtt) 416
Rxx=insert(Rss,#u6,#U6) 416
Rxx=insert(Rss,Rtt) 417
interleave
Rdd=interleave(Rss) 418
isync
isync 324
J
jump
if ([!]Pu.new) jump:<hint> #r15:2 218
if ([!]Pu) jump #r15:2 217
if ([!]Pu) jump:<hint> #r15:2 217
jump #r22:2 217
nt
if (Rs!=#0) jump:nt #r13:2 219
if (Rs<=#0) jump:nt #r13:2 219
if (Rs==#0) jump:nt #r13:2 219
if (Rs>=#0) jump:nt #r13:2 219
Rd=#U6 221
Rd=Rs 221
t
if (Rs!=#0) jump:t #r13:2 219
if (Rs<=#0) jump:t #r13:2 219
if (Rs==#0) jump:t #r13:2 219
if (Rs>=#0) jump:t #r13:2 219
jumpr
if ([!]Pu) jumpr Rs 209
if ([!]Pu[.new]) jumpr:<hint> Rs 209
jumpr Rs 209
jumprh
jumprh Rs 209, 210
L
l2fetch
l2fetch(Rs,Rt) 326
l2fetch(Rs,Rtt) 326
lfs
Rdd=lfs(Rss,Rtt) 419
linecpy
Rdd=linecpy(Rs,Rtt) 242
loop0
loop0(#r7:2,#U10) 198
loop0(#r7:2,Rs) 198
loop1
loop1(#r7:2,#U10) 198
loop1(#r7:2,Rs) 198
lsl
Rd=lsl(#s6,Rt) 601
Rd=lsl(Rs,Rt) 601
Rdd=lsl(Rss,Rt) 602
Rx[&|]=lsl(Rs,Rt) 607
Rx[+-]=lsl(Rs,Rt) 604
Rxx[&|]=lsl(Rss,Rt) 608
Rxx[+-]=lsl(Rss,Rt) 605
Rxx^=lsl(Rss,Rt) 608
lsr
Rd=lsr(Rs,#u5) 589
Rd=lsr(Rs,Rt) 602
Rdd=lsr(Rss,#u6) 589
Rdd=lsr(Rss,Rt) 602
Rx[&|]=lsr(Rs,#u5) 595
Rx[&|]=lsr(Rs,Rt) 607
Rx[+-]=lsr(Rs,#u5) 591
Rx[+-]=lsr(Rs,Rt) 604
Rx^=lsr(Rs,#u5) 595
Rx=add(#u8,lsr(Rx,#U5)) 591
Rx=and(#u8,lsr(Rx,#U5)) 595
Rx=or(#u8,lsr(Rx,#U5)) 595
Rx=sub(#u8,lsr(Rx,#U5)) 591
Rxx[&|]=lsr(Rss,#u6) 596
Rxx[&|]=lsr(Rss,Rt) 608
Rxx[+-]=lsr(Rss,#u6) 591
Rxx[+-]=lsr(Rss,Rt) 605
Rxx^=lsr(Rss,#u6) 596
Rxx^=lsr(Rss,Rt) 608
M
mask
Rd=mask(#u5,#U5) 588
Rdd=mask(Pt) 575
max
Rd=max(Rs,Rt) 349
Rdd=max(Rss,Rtt) 350
maxu
Rd=maxu(Rs,Rt) 349
Rdd=maxu(Rss,Rtt) 350
memb
if ([!]Pt[.new]) Rd=memb(#u6) 229
if ([!]Pt[.new]) Rd=memb(Rs+#u6:0) 229
if ([!]Pt[.new]) Rd=memb(Rx++#s4:0) 229
if ([!]Pv[.new]) memb(#u6)=Nt.new 278
if ([!]Pv[.new]) memb(#u6)=Rt 295
if ([!]Pv[.new]) memb(Rs+#u6:0)=#S6 295
if ([!]Pv[.new]) memb(Rs+#u6:0)=Nt.new 278
if ([!]Pv[.new]) memb(Rs+#u6:0)=Rt 295
if ([!]Pv[.new]) memb(Rs+Ru<<#u2)=Nt.new 278
if ([!]Pv[.new]) memb(Rs+Ru<<#u2)=Rt 295
if ([!]Pv[.new]) memb(Rx++#s4:0)=Nt.new 278
if ([!]Pv[.new]) memb(Rx++#s4:0)=Rt 295
if ([!]Pv[.new]) Rd=memb(Rs+Rt<<#u2) 229
memb(gp+#u16:0)=Nt.new 276
memb(gp+#u16:0)=Rt 293
memb(Re=#U6)=Nt.new 276
memb(Re=#U6)=Rt 293
memb(Rs+#s11:0)=Nt.new 276
memb(Rs+#s11:0)=Rt 293
memb(Rs+#u6:0)[+-]=#U5 268
memb(Rs+#u6:0)[+-|&]=Rt 268
memb(Rs+#u6:0)=#S8 293
memb(Rs+Ru<<#u2)=Nt.new 276
memb(Rs+Ru<<#u2)=Rt 293
memb(Ru<<#u2+#U6)=Nt.new 276
memb(Ru<<#u2+#U6)=Rt 293
memb(Rx++#s4:0:circ(Mu))=Nt.new 276
memb(Rx++#s4:0:circ(Mu))=Rt 293
memb(Rx++#s4:0)=Nt.new 276
memb(Rx++#s4:0)=Rt 293
memb(Rx++I:circ(Mu))=Nt.new 276
memb(Rx++I:circ(Mu))=Rt 293
memb(Rx++Mu:brev)=Nt.new 276
memb(Rx++Mu:brev)=Rt 293
memb(Rx++Mu)=Nt.new 276
memb(Rx++Mu)=Rt 293
Rd=memb(gp+#u16:0) 227
Rd=memb(Re=#U6) 227
Rd=memb(Rs+#s11:0) 227
Rd=memb(Rs+Rt<<#u2) 227
Rd=memb(Rt<<#u2+#U6) 227
Rd=memb(Rx++#s4:0:circ(Mu)) 227
Rd=memb(Rx++#s4:0) 227
Rd=memb(Rx++I:circ(Mu)) 227
Rd=memb(Rx++Mu:brev) 227
Rd=memb(Rx++Mu) 227
memb_fifo
Ryy=memb_fifo(Re=#U6) 231
Ryy=memb_fifo(Rs) 231
Ryy=memb_fifo(Rs+#s11:0) 231
Ryy=memb_fifo(Rt<<#u2+#U6) 231
Ryy=memb_fifo(Rx++#s4:0:circ(Mu)) 231
Ryy=memb_fifo(Rx++#s4:0) 231
Ryy=memb_fifo(Rx++I:circ(Mu)) 232
Ryy=memb_fifo(Rx++Mu:brev) 232
Ryy=memb_fifo(Rx++Mu) 232
membh
Rd=membh(Re=#U6) 260
Rd=membh(Rs) 260
Rd=membh(Rs+#s11:1) 260
Rd=membh(Rt<<#u2+#U6) 260
Rd=membh(Rx++#s4:1:circ(Mu)) 261
Rd=membh(Rx++#s4:1) 261
Rd=membh(Rx++I:circ(Mu)) 261
Rd=membh(Rx++Mu:brev) 261
Rd=membh(Rx++Mu) 261
Rdd=membh(Re=#U6) 263
Rdd=membh(Rs) 263
Rdd=membh(Rs+#s11:2) 263
Rdd=membh(Rt<<#u2+#U6) 263
Rdd=membh(Rx++#s4:2:circ(Mu)) 263
Rdd=membh(Rx++#s4:2) 263
Rdd=membh(Rx++I:circ(Mu)) 264
Rdd=membh(Rx++Mu:brev) 264
Rdd=membh(Rx++Mu) 264
memd
if ([!]Pt[.new]) Rdd=memd(#u6) 225
if ([!]Pt[.new]) Rdd=memd(Rs+#u6:3) 225
if ([!]Pt[.new]) Rdd=memd(Rx++#s4:3) 225
if ([!]Pv[.new]) memd(#u6)=Rtt 291
if ([!]Pv[.new]) memd(Rs+#u6:3)=Rtt 291
if ([!]Pv[.new]) memd(Rs+Ru<<#u2)=Rtt 291
if ([!]Pv[.new]) memd(Rx++#s4:3)=Rtt 291
if ([!]Pv[.new]) Rdd=memd(Rs+Rt<<#u2) 225
memd(gp+#u16:3)=Rtt 288
memd(Re=#U6)=Rtt 288
memd(Rs+#s11:3)=Rtt 288
memd(Rs+Ru<<#u2)=Rtt 288
memd(Ru<<#u2+#U6)=Rtt 288
memd(Rx++#s4:3:circ(Mu))=Rtt 288
memd(Rx++#s4:3)=Rtt 288
memd(Rx++I:circ(Mu))=Rtt 288
memd(Rx++Mu:brev)=Rtt 288
memd(Rx++Mu)=Rtt 288
Rdd=memd(gp+#u16:3) 222
Rdd=memd(Re=#U6) 222
Rdd=memd(Rs+#s11:3) 222
Rdd=memd(Rs+Rt<<#u2) 222
Rdd=memd(Rt<<#u2+#U6) 222
Rdd=memd(Rx++#s4:3:circ(Mu)) 222
Rdd=memd(Rx++#s4:3) 222
Rdd=memd(Rx++I:circ(Mu)) 222
Rdd=memd(Rx++Mu:brev) 222
Rdd=memd(Rx++Mu) 222
memd_aq
Rdd=memd_aq(Rs) 224
memd_locked
memd_locked(Rs,Pd)=Rtt 315
Rdd=memd_locked(Rs) 314
memd_rl
memd_rl(Rs):at=Rtt 290
memd_rl(Rs):st=Rtt 290
memh
if ([!]Pt[.new]) Rd=memh(#u6) 239
if ([!]Pt[.new]) Rd=memh(Rs+#u6:1) 239
if ([!]Pt[.new]) Rd=memh(Rx++#s4:1) 239
if ([!]Pv[.new]) memh(#u6)=Nt.new 282
if ([!]Pv[.new]) memh(#u6)=Rt 301
if ([!]Pv[.new]) memh(#u6)=Rt.H 301
if ([!]Pv[.new]) memh(Rs+#u6:1)=#S6 301
if ([!]Pv[.new]) memh(Rs+#u6:1)=Nt.new 282
if ([!]Pv[.new]) memh(Rs+#u6:1)=Rt 301
if ([!]Pv[.new]) memh(Rs+#u6:1)=Rt.H 301
if ([!]Pv[.new]) memh(Rs+Ru<<#u2)=Nt.new 282
if ([!]Pv[.new]) memh(Rs+Ru<<#u2)=Rt 302
if ([!]Pv[.new]) memh(Rs+Ru<<#u2)=Rt.H 301
if ([!]Pv[.new]) memh(Rx++#s4:1)=Nt.new 282
if ([!]Pv[.new]) memh(Rx++#s4:1)=Rt 302
if ([!]Pv[.new]) memh(Rx++#s4:1)=Rt.H 302
if ([!]Pv[.new]) Rd=memh(Rs+Rt<<#u2) 239
memh(gp+#u16:1)=Nt.new 280
memh(gp+#u16:1)=Rt 299
memh(gp+#u16:1)=Rt.H 299
memh(Re=#U6)=Nt.new 280
memh(Re=#U6)=Rt 298
memh(Re=#U6)=Rt.H 298
memh(Rs+#s11:1)=Nt.new 280
memh(Rs+#s11:1)=Rt 298
memh(Rs+#s11:1)=Rt.H 298
memh(Rs+#u6:1)[+-]=#U5 270
memh(Rs+#u6:1)[+-|&]=Rt 270
memh(Rs+#u6:1)=#S8 298
memh(Rs+Ru<<#u2)=Nt.new 280
memh(Rs+Ru<<#u2)=Rt 298
memh(Rs+Ru<<#u2)=Rt.H 298
memh(Ru<<#u2+#U6)=Nt.new 280
memh(Ru<<#u2+#U6)=Rt 298
memh(Ru<<#u2+#U6)=Rt.H 298
memh(Rx++#s4:1:circ(Mu))=Nt.new 280
memh(Rx++#s4:1:circ(Mu))=Rt 298
memh(Rx++#s4:1:circ(Mu))=Rt.H 298
memh(Rx++#s4:1)=Nt.new 280
memh(Rx++#s4:1)=Rt 298
memh(Rx++#s4:1)=Rt.H 298
memh(Rx++I:circ(Mu))=Nt.new 280
memh(Rx++I:circ(Mu))=Rt 298
memh(Rx++I:circ(Mu))=Rt.H 298
memh(Rx++Mu:brev)=Nt.new 280
memh(Rx++Mu:brev)=Rt 299
memh(Rx++Mu:brev)=Rt.H 299
memh(Rx++Mu)=Nt.new 280
memh(Rx++Mu)=Rt 299
memh(Rx++Mu)=Rt.H 299
Rd=memh(gp+#u16:1) 237
Rd=memh(Re=#U6) 237
Rd=memh(Rs+#s11:1) 237
Rd=memh(Rs+Rt<<#u2) 237
Rd=memh(Rt<<#u2+#U6) 237
Rd=memh(Rx++#s4:1:circ(Mu)) 237
Rd=memh(Rx++#s4:1) 237
Rd=memh(Rx++I:circ(Mu)) 237
Rd=memh(Rx++Mu:brev) 237
Rd=memh(Rx++Mu) 237
memh_fifo
Ryy=memh_fifo(Re=#U6) 234
Ryy=memh_fifo(Rs) 234
Ryy=memh_fifo(Rs+#s11:1) 234
Ryy=memh_fifo(Rt<<#u2+#U6) 234
Ryy=memh_fifo(Rx++#s4:1:circ(Mu)) 234
Ryy=memh_fifo(Rx++#s4:1) 234
Ryy=memh_fifo(Rx++I:circ(Mu)) 235
Ryy=memh_fifo(Rx++Mu:brev) 235
Ryy=memh_fifo(Rx++Mu) 235
memub
if ([!]Pt[.new]) Rd=memub(#u6) 245
if ([!]Pt[.new]) Rd=memub(Rs+#u6:0) 245
if ([!]Pt[.new]) Rd=memub(Rx++#s4:0) 245
if ([!]Pv[.new]) Rd=memub(Rs+Rt<<#u2) 245
Rd=memub(gp+#u16:0) 243
Rd=memub(Re=#U6) 243
Rd=memub(Rs+#s11:0) 243
Rd=memub(Rs+Rt<<#u2) 243
Rd=memub(Rt<<#u2+#U6) 243
Rd=memub(Rx++#s4:0:circ(Mu)) 243
Rd=memub(Rx++#s4:0) 243
Rd=memub(Rx++I:circ(Mu)) 243
Rd=memub(Rx++Mu:brev) 243
Rd=memub(Rx++Mu) 243
memubh
Rd=memubh(Re=#U6) 261
Rd=memubh(Rs+#s11:1) 262
Rd=memubh(Rt<<#u2+#U6) 262
Rd=memubh(Rx++#s4:1:circ(Mu)) 262
Rd=memubh(Rx++#s4:1) 262
Rd=memubh(Rx++I:circ(Mu)) 262
Rd=memubh(Rx++Mu:brev) 263
Rd=memubh(Rx++Mu) 262
Rdd=memubh(Re=#U6) 264
Rdd=memubh(Rs+#s11:2) 264
Rdd=memubh(Rt<<#u2+#U6) 264
Rdd=memubh(Rx++#s4:2:circ(Mu)) 265
Rdd=memubh(Rx++#s4:2) 265
Rdd=memubh(Rx++I:circ(Mu)) 265
Rdd=memubh(Rx++Mu:brev) 265
Rdd=memubh(Rx++Mu) 265
memuh
if ([!]Pt[.new]) Rd=memuh(#u6) 249
if ([!]Pt[.new]) Rd=memuh(Rs+#u6:1) 249
if ([!]Pt[.new]) Rd=memuh(Rx++#s4:1) 249
if ([!]Pv[.new]) Rd=memuh(Rs+Rt<<#u2) 249
Rd=memuh(gp+#u16:1) 247
Rd=memuh(Re=#U6) 247
Rd=memuh(Rs+#s11:1) 247
Rd=memuh(Rs+Rt<<#u2) 247
Rd=memuh(Rt<<#u2+#U6) 247
Rd=memuh(Rx++#s4:1:circ(Mu)) 247
Rd=memuh(Rx++#s4:1) 247
Rd=memuh(Rx++I:circ(Mu)) 247
Rd=memuh(Rx++Mu:brev) 247
Rd=memuh(Rx++Mu) 247
memw
if ([!]Pt[.new]) Rd=memw(#u6) 254
if ([!]Pt[.new]) Rd=memw(Rs+#u6:2) 254
if ([!]Pt[.new]) Rd=memw(Rx++#s4:2) 254
if ([!]Pv[.new]) memw(#u6)=Nt.new 286
if ([!]Pv[.new]) memw(#u6)=Rt 309
if ([!]Pv[.new]) memw(Rs+#u6:2)=#S6 309
if ([!]Pv[.new]) memw(Rs+#u6:2)=Nt.new 286
if ([!]Pv[.new]) memw(Rs+#u6:2)=Rt 309
if ([!]Pv[.new]) memw(Rs+Ru<<#u2)=Nt.new 286
if ([!]Pv[.new]) memw(Rs+Ru<<#u2)=Rt 309
if ([!]Pv[.new]) memw(Rx++#s4:2)=Nt.new 286
if ([!]Pv[.new]) memw(Rx++#s4:2)=Rt 309
if ([!]Pv[.new]) Rd=memw(Rs+Rt<<#u2) 254
memw(gp+#u16:2)=Nt.new 284
memw(gp+#u16:2)=Rt 306
memw(Re=#U6)=Nt.new 284
memw(Re=#U6)=Rt 306
memw(Rs+#s11:2)=Nt.new 284
memw(Rs+#s11:2)=Rt 306
memw(Rs+#u6:2)[+-]=#U5 271
memw(Rs+#u6:2)[+-|&]=Rt 271
memw(Rs+#u6:2)=#S8 306
memw(Rs+Ru<<#u2)=Nt.new 284
memw(Rs+Ru<<#u2)=Rt 306
memw(Ru<<#u2+#U6)=Nt.new 284
memw(Ru<<#u2+#U6)=Rt 306
memw(Rx++#s4:2:circ(Mu))=Nt.new 284
memw(Rx++#s4:2:circ(Mu))=Rt 306
memw(Rx++#s4:2)=Nt.new 284
memw(Rx++#s4:2)=Rt 306
memw(Rx++I:circ(Mu))=Nt.new 284
memw(Rx++I:circ(Mu))=Rt 306
memw(Rx++Mu:brev)=Nt.new 284
memw(Rx++Mu:brev)=Rt 306
memw(Rx++Mu)=Nt.new 284
memw(Rx++Mu)=Rt 306
Rd=memw(gp+#u16:2) 251
Rd=memw(Re=#U6) 251
Rd=memw(Rs+#s11:2) 251
Rd=memw(Rs+Rt<<#u2) 251
Rd=memw(Rt<<#u2+#U6) 251
Rd=memw(Rx++#s4:2:circ(Mu)) 251
Rd=memw(Rx++#s4:2) 251
Rd=memw(Rx++I:circ(Mu)) 251
Rd=memw(Rx++Mu:brev) 251
Rd=memw(Rx++Mu) 251
memw_aq
Rd=memw_aq(Rs) 253
memw_locked
memw_locked(Rs,Pd)=Rt 315
Rd=memw_locked(Rs) 314
memw_rl
memw_rl(Rs):at=Rt 308
memw_rl(Rs):st=Rt 308
min
Rd=min(Rt,Rs) 351
Rdd=min(Rtt,Rss) 352
minu
Rd=minu(Rt,Rs) 351
Rdd=minu(Rtt,Rss) 352
modwrap
Rd=modwrap(Rs,Rt) 353
movlen
Rd=movlen(Rs,Rtt) 242
mpy
Rd=mpy(Rs,Rt.H):<<1:rnd:sat 512
Rd=mpy(Rs,Rt.H):<<1:sat 512
Rd=mpy(Rs,Rt.L):<<1:rnd:sat 512
Rd=mpy(Rs,Rt.L):<<1:sat 512
Rd=mpy(Rs,Rt) 512
Rd=mpy(Rs,Rt):<<1 512
Rd=mpy(Rs,Rt):<<1:sat 512
Rd=mpy(Rs,Rt):rnd 512
Rd=mpy(Rs.[HL],Rt.[HL])[:<<1][:rnd][:sat] 496
Rdd=mpy(Rs,Rt) 515
Rdd=mpy(Rs.[HL],Rt.[HL])[:<<1][:rnd] 496
Rx+=mpy(Rs,Rt):<<1:sat 512
Rx+=mpy(Rs.[HL],Rt.[HL])[:<<1][:sat] 496
Rx-=mpy(Rs,Rt):<<1:sat 512
Rx-=mpy(Rs.[HL],Rt.[HL])[:<<1][:sat] 496
Rxx[+-]=mpy(Rs,Rt) 515
Rxx+=mpy(Rs.[HL],Rt.[HL])[:<<1] 496
Rxx-=mpy(Rs.[HL],Rt.[HL])[:<<1] 496
mpyi
Rd=+mpyi(Rs,#u8) 484
Rd=mpyi(Rs,#m9) 485
Rd=-mpyi(Rs,#u8) 484
Rd=mpyi(Rs,Rt) 485
Rx+=mpyi(Rs,#u8) 485
Rx+=mpyi(Rs,Rt) 485
Rx-=mpyi(Rs,#u8) 485
Rx-=mpyi(Rs,Rt) 485
mpysu
Rd=mpysu(Rs,Rt) 512
mpyu
Rd=mpyu(Rs,Rt) 512
Rd=mpyu(Rs.[HL],Rt.[HL])[:<<1] 503
Rdd=mpyu(Rs,Rt) 515
Rdd=mpyu(Rs.[HL],Rt.[HL])[:<<1] 503
Rx+=mpyu(Rs.[HL],Rt.[HL])[:<<1] 503
Rx-=mpyu(Rs.[HL],Rt.[HL])[:<<1] 503
Rxx[+-]=mpyu(Rs,Rt) 515
Rxx+=mpyu(Rs.[HL],Rt.[HL])[:<<1] 503
Rxx-=mpyu(Rs.[HL],Rt.[HL])[:<<1] 503
mpyui
Rd=mpyui(Rs,Rt) 485
mux
Rd=mux(Pu,#s8,#S8) 174
Rd=mux(Pu,#s8,Rs) 174
Rd=mux(Pu,Rs,#s8) 174
Rd=mux(Pu,Rs,Rt) 174
N
neg
Rd=neg(Rs) 160
Rd=neg(Rs):sat 354
Rdd=neg(Rss) 354
no mnemonic
Cd=Rs 205
Cdd=Rss 205
if ([!]Pu[.new]) Rd=#s12 188
if ([!]Pu[.new]) Rd=Rs 188
if ([!]Pu[.new]) Rdd=Rss 188
Pd=Ps 203
Pd=Rs 577
Rd=#s16 165
Rd=Cs 205
Rd=Ps 577
Rd=Rs 167
Rdd=#s8 165
Rdd=Css 205
Rdd=Rss 167
Rx.[HL]=#u16 165
nop
nop 161
normamt
Rd=normamt(Rs) 409
Rd=normamt(Rss) 409
not
Pd=not(Ps) 203
Rd=not(Rs) 158
Rdd=not(Rss) 343
O
or
if ([!]Pu[.new]) Rd=or(Rs,Rt) 183
Pd=and(Ps,or(Pt,[!]Pu)) 203
Pd=or(Ps,and(Pt,[!]Pu)) 203
Pd=or(Ps,or(Pt,[!]Pu)) 203
Pd=or(Pt,[!]Ps) 203
Rd=or(Rs,#s10) 158
Rd=or(Rs,Rt) 158
Rd=or(Rt,~Rs) 158
Rdd=or(Rss,Rtt) 343
Rdd=or(Rtt,~Rss) 343
Rx[&|^]=or(Rs,Rt) 346
Rx=or(Ru,and(Rx,#s10)) 346
Rx|=or(Rs,#s10) 346
P
packhl
Rdd=packhl(Rs,Rt) 177
parity
Rd=parity(Rs,Rt) 420
Rd=parity(Rss,Rtt) 420
pause
pause(#u10) 328
pc
Rd=add(pc,#u6) 200
pmemcpy
Rdd=pmemcpy(Rx,Rtt) 241, 242
pmpyw
Rdd=pmpyw(Rs,Rt) 508
Rxx^=pmpyw(Rs,Rt) 508
popcount
Rd=popcount(Rss) 411
R
release
release(Rs):at 305
release(Rs):st 305
rol
Rd=rol(Rs,#u5) 589
Rdd=rol(Rss,#u6) 589
Rx[&|]=rol(Rs,#u5) 595
Rx[+-]=rol(Rs,#u5) 591
Rx^=rol(Rs,#u5) 595
Rxx[&|]=rol(Rss,#u6) 596
Rxx[+-]=rol(Rss,#u6) 591
Rxx^=rol(Rss,#u6) 596
round
Rd=round(Rs,#u5)[:sat] 355
Rd=round(Rs,Rt)[:sat] 355
Rd=round(Rss):sat 355
S
sat
Rd=sat(Rss) 542
satb
Rd=satb(Rs) 542
sath
Rd=sath(Rs) 542
satub
Rd=satub(Rs) 542
satuh
Rd=satuh(Rs) 542
setbit
memb(Rs+#u6:0)=setbit(#U5) 268
memh(Rs+#u6:1)=setbit(#U5) 270
memw(Rs+#u6:2)=setbit(#U5) 271
Rd=setbit(Rs,#u5) 422
Rd=setbit(Rs,Rt) 422
sfadd
Rd=sfadd(Rs,Rt) 461
sfclass
Pd=sfclass(Rs,#u5) 462
sfcmp.eq
Pd=sfcmp.eq(Rs,Rt) 464
sfcmp.ge
Pd=sfcmp.ge(Rs,Rt) 464
sfcmp.gt
Pd=sfcmp.gt(Rs,Rt) 464
sfcmp.uo
Pd=sfcmp.uo(Rs,Rt) 464
sffixupd
Rd=sffixupd(Rs,Rt) 472
sffixupn
Rd=sffixupn(Rs,Rt) 472
sffixupr
Rd=sffixupr(Rs) 472
sfinvsqrta
Rd,Pe=sfinvsqrta(Rs) 475
sfmake
Rd=sfmake(#u10):neg 478
Rd=sfmake(#u10):pos 478
sfmax
Rd=sfmax(Rs,Rt) 479
sfmin
Rd=sfmin(Rs,Rt) 480
sfmpy
Rd=sfmpy(Rs,Rt) 481
Rx+=sfmpy(Rs,Rt,Pu):scale 474
Rx+=sfmpy(Rs,Rt) 473
Rx+=sfmpy(Rs,Rt):lib 476
Rx-=sfmpy(Rs,Rt) 473
Rx-=sfmpy(Rs,Rt):lib 476
sfrecipa
Rd,Pe=sfrecipa(Rs,Rt) 482
sfsub
Rd=sfsub(Rs,Rt) 483
shuffeb
Rdd=shuffeb(Rss,Rtt) 555
shuffeh
Rdd=shuffeh(Rss,Rtt) 555
shuffob
Rdd=shuffob(Rtt,Rss) 555
shuffoh
Rdd=shuffoh(Rtt,Rss) 555
sp1loop0
p3=sp1loop0(#r7:2,#U10) 201
p3=sp1loop0(#r7:2,Rs) 201
sp2loop0
p3=sp2loop0(#r7:2,#U10) 201
p3=sp2loop0(#r7:2,Rs) 201
sp3loop0
p3=sp3loop0(#r7:2,#U10) 201
p3=sp3loop0(#r7:2,Rs) 201
sub
if ([!]Pu[.new]) Rd=sub(Rt,Rs) 185
Rd=add(Rs,sub(#s6,Ru)) 335
Rd=sub(#s10,Rs) 162
Rd=sub(Rt,Rs) 162
Rd=sub(Rt,Rs):sat 162
Rd=sub(Rt,Rs):sat:deprecated 358
Rd=sub(Rt.[HL],Rs.[HL])[:sat]:<<16 360
Rd=sub(Rt.L,Rs.[HL])[:sat] 360
Rdd=sub(Rss,Rtt,Px):carry 341
Rdd=sub(Rtt,Rss) 358
Rx+=sub(Rt,Rs) 359
swiz
Rd=swiz(Rs) 544
sxtb
if ([!]Pu[.new]) Rd=sxtb(Rs) 186
Rd=sxtb(Rs) 164
sxth
if ([!]Pu[.new]) Rd=sxth(Rs) 186
Rd=sxth(Rs) 164
sxtw
Rdd=sxtw(Rs) 362
syncht
syncht 329
T
tableidxb
Rx=tableidxb(Rs,#u4,#S6):raw 426
Rx=tableidxb(Rs,#u4,#U5) 426
tableidxd
Rx=tableidxd(Rs,#u4,#S6):raw 426
Rx=tableidxd(Rs,#u4,#U5) 427
tableidxh
Rx=tableidxh(Rs,#u4,#S6):raw 427
Rx=tableidxh(Rs,#u4,#U5) 427
tableidxw
Rx=tableidxw(Rs,#u4,#S6):raw 427
Rx=tableidxw(Rs,#u4,#U5) 427
tlbmatch
Pd=tlbmatch(Rss,Rt) 576
togglebit
Rd=togglebit(Rs,#u5) 422
Rd=togglebit(Rs,Rt) 422
trace
trace(Rs) 330
trap0
trap0(#u8) 331
trap1
trap1(#u8) 331
trap1(Rx,#u8) 331
tstbit
if ([!]tstbit(Ns.new,#0)) jump:<hint> #r9:2 273
p[01]=tstbit(Rs,#0) 214
Pd=[!]tstbit(Rs,#u5) 578
Pd=[!]tstbit(Rs,Rt) 578
U
unpause
unpause 332
V
vabsdiffb
Rdd=vabsdiffb(Rtt,Rss) 365
vabsdiffh
Rdd=vabsdiffh(Rtt,Rss) 366
vabsdiffub
Rdd=vabsdiffub(Rtt,Rss) 365
vabsdiffw
Rdd=vabsdiffw(Rtt,Rss) 367
vabsh
Rdd=vabsh(Rss) 363
Rdd=vabsh(Rss):sat 363
vabsw
Rdd=vabsw(Rss) 364
Rdd=vabsw(Rss):sat 364
vacsh
Rxx,Pe=vacsh(Rss,Rtt) 369
vaddb
Rdd=vaddb(Rss,Rtt) 378
vaddh
Rd=vaddh(Rs,Rt)[:sat] 168
Rdd=vaddh(Rss,Rtt)[:sat] 371
vaddhub
Rd=vaddhub(Rss,Rtt):sat 373
vaddub
Rdd=vaddub(Rss,Rtt)[:sat] 378
vadduh
Rd=vadduh(Rs,Rt):sat 168
Rdd=vadduh(Rss,Rtt):sat 371
vaddw
Rdd=vaddw(Rss,Rtt)[:sat] 379
valignb
Rdd=valignb(Rtt,Rss,#u3) 545
Rdd=valignb(Rtt,Rss,Pu) 545
vaslh
Rdd=vaslh(Rss,#u4) 611
Rdd=vaslh(Rss,Rt) 615
vaslw
Rdd=vaslw(Rss,#u5) 617
Rdd=vaslw(Rss,Rt) 618
vasrh
Rdd=vasrh(Rss,#u4) 611
Rdd=vasrh(Rss,#u4):raw 612
Rdd=vasrh(Rss,#u4):rnd 612
Rdd=vasrh(Rss,Rt) 615
vasrhub
Rd=vasrhub(Rss,#u4):raw 613
Rd=vasrhub(Rss,#u4):rnd:sat 613
Rd=vasrhub(Rss,#u4):sat 613
vasrw
Rd=vasrw(Rss,#u5) 620
Rd=vasrw(Rss,Rt) 620
Rdd=vasrw(Rss,#u5) 617
Rdd=vasrw(Rss,Rt) 618
vavgh
Rd=vavgh(Rs,Rt) 169
Rd=vavgh(Rs,Rt):rnd 169
Rdd=vavgh(Rss,Rtt) 380
Rdd=vavgh(Rss,Rtt):crnd 380
Rdd=vavgh(Rss,Rtt):rnd 380
vavgub
Rdd=vavgub(Rss,Rtt) 382
Rdd=vavgub(Rss,Rtt):rnd 382
vavguh
Rdd=vavguh(Rss,Rtt) 380
Rdd=vavguh(Rss,Rtt):rnd 380
vavguw
Rdd=vavguw(Rss,Rtt)[:rnd] 383
vavgw
Rdd=vavgw(Rss,Rtt):crnd 383
Rdd=vavgw(Rss,Rtt)[:rnd] 383
vclip
Rdd=vclip(Rss,#u5) 385
vcmpb.eq
Pd=!any8(vcmpb.eq(Rss,Rtt)) 581
Pd=any8(vcmpb.eq(Rss,Rtt)) 581
Pd=vcmpb.eq(Rss,#u8) 582
Pd=vcmpb.eq(Rss,Rtt) 582
vcmpb.gt
Pd=vcmpb.gt(Rss,#s8) 582
Pd=vcmpb.gt(Rss,Rtt) 582
vcmpb.gtu
Pd=vcmpb.gtu(Rss,#u7) 582
Pd=vcmpb.gtu(Rss,Rtt) 582
vcmph.eq
Pd=vcmph.eq(Rss,#s8) 579
Pd=vcmph.eq(Rss,Rtt) 579
vcmph.gt
Pd=vcmph.gt(Rss,#s8) 579
Pd=vcmph.gt(Rss,Rtt) 579
vcmph.gtu
Pd=vcmph.gtu(Rss,#u7) 579
Pd=vcmph.gtu(Rss,Rtt) 579
vcmpw.eq
Pd=vcmpw.eq(Rss,#s8) 584
Pd=vcmpw.eq(Rss,Rtt) 584
vcmpw.gt
Pd=vcmpw.gt(Rss,#s8) 584
Pd=vcmpw.gt(Rss,Rtt) 584
vcmpw.gtu
Pd=vcmpw.gtu(Rss,#u7) 584
Pd=vcmpw.gtu(Rss,Rtt) 584
vcmpyi
Rdd=vcmpyi(Rss,Rtt)[:<<1]:sat 447
Rxx+=vcmpyi(Rss,Rtt):sat 448
vcmpyr
Rdd=vcmpyr(Rss,Rtt)[:<<1]:sat 447
Rxx+=vcmpyr(Rss,Rtt):sat 448
vcnegh
Rdd=vcnegh(Rss,Rt) 386
vconj
Rdd=vconj(Rss):sat 450
vcrotate
Rdd=vcrotate(Rss,Rt) 451
vdmpy
Rd=vdmpy(Rss,Rtt)[:<<1]:rnd:sat 520
Rdd=vdmpy(Rss,Rtt):<<1:sat 517
Rdd=vdmpy(Rss,Rtt):sat 517
Rxx+=vdmpy(Rss,Rtt):<<1:sat 518
Rxx+=vdmpy(Rss,Rtt):sat 518
vdmpybsu
Rdd=vdmpybsu(Rss,Rtt):sat 524
Rxx+=vdmpybsu(Rss,Rtt):sat 524
vitpack
Rd=vitpack(Ps,Pt) 586
vlslh
Rdd=vlslh(Rss,Rt) 615
vlslw
Rdd=vlslw(Rss,Rt) 618
vlsrh
Rdd=vlsrh(Rss,#u4) 611
Rdd=vlsrh(Rss,Rt) 615
vlsrw
Rdd=vlsrw(Rss,#u5) 617
Rdd=vlsrw(Rss,Rt) 618
vmaxb
Rdd=vmaxb(Rtt,Rss) 388
vmaxh
Rdd=vmaxh(Rtt,Rss) 389
vmaxub
Rdd=vmaxub(Rtt,Rss) 388
vmaxuh
Rdd=vmaxuh(Rtt,Rss) 389
vmaxuw
Rdd=vmaxuw(Rtt,Rss) 394
vmaxw
Rdd=vmaxw(Rtt,Rss) 394
vminb
Rdd=vminb(Rtt,Rss) 395
vminh
Rdd=vminh(Rtt,Rss) 397
vminub
Rdd,Pe=vminub(Rtt,Rss) 395
Rdd=vminub(Rtt,Rss) 395
vminuh
Rdd=vminuh(Rtt,Rss) 397
vminuw
Rdd=vminuw(Rtt,Rss) 402
vminw
Rdd=vminw(Rtt,Rss) 402
vmpybsu
Rdd=vmpybsu(Rs,Rt) 536
Rxx+=vmpybsu(Rs,Rt) 536
vmpybu
Rdd=vmpybu(Rs,Rt) 536
Rxx+=vmpybu(Rs,Rt) 536
vmpyeh
Rdd=vmpyeh(Rss,Rtt):<<1:sat 526
Rdd=vmpyeh(Rss,Rtt):sat 526
Rxx+=vmpyeh(Rss,Rtt) 526
Rxx+=vmpyeh(Rss,Rtt):<<1:sat 526
Rxx+=vmpyeh(Rss,Rtt):sat 526
vmpyh
Rd=vmpyh(Rs,Rt)[:<<1]:rnd:sat 530
Rdd=vmpyh(Rs,Rt)[:<<1]:sat 528
Rxx+=vmpyh(Rs,Rt) 528
Rxx+=vmpyh(Rs,Rt)[:<<1]:sat 528
vmpyhsu
Rdd=vmpyhsu(Rs,Rt)[:<<1]:sat 532
Rxx+=vmpyhsu(Rs,Rt)[:<<1]:sat 532
vmpyweh
Rdd=vmpyweh(Rss,Rtt)[:<<1]:rnd:sat 488
Rdd=vmpyweh(Rss,Rtt)[:<<1]:sat 489
Rxx+=vmpyweh(Rss,Rtt)[:<<1]:rnd:sat 489
Rxx+=vmpyweh(Rss,Rtt)[:<<1]:sat 489
vmpyweuh
Rdd=vmpyweuh(Rss,Rtt)[:<<1]:rnd:sat 492
Rdd=vmpyweuh(Rss,Rtt)[:<<1]:sat 493
Rxx+=vmpyweuh(Rss,Rtt)[:<<1]:rnd:sat 493
Rxx+=vmpyweuh(Rss,Rtt)[:<<1]:sat 493
vmpywoh
Rdd=vmpywoh(Rss,Rtt)[:<<1]:rnd:sat 489
Rdd=vmpywoh(Rss,Rtt)[:<<1]:sat 489
Rxx+=vmpywoh(Rss,Rtt)[:<<1]:rnd:sat 489
Rxx+=vmpywoh(Rss,Rtt)[:<<1]:sat 489
vmpywouh
Rdd=vmpywouh(Rss,Rtt)[:<<1]:rnd:sat 493
Rdd=vmpywouh(Rss,Rtt)[:<<1]:sat 493
Rxx+=vmpywouh(Rss,Rtt)[:<<1]:rnd:sat 493
Rxx+=vmpywouh(Rss,Rtt)[:<<1]:sat 493
vmux
Rdd=vmux(Pu,Rss,Rtt) 587
vnavgh
Rd=vnavgh(Rt,Rs) 169
Rdd=vnavgh(Rtt,Rss) 380
Rdd=vnavgh(Rtt,Rss):crnd:sat 380
Rdd=vnavgh(Rtt,Rss):rnd:sat 380
vnavgw
Rdd=vnavgw(Rtt,Rss) 383
Rdd=vnavgw(Rtt,Rss):crnd:sat 383
Rdd=vnavgw(Rtt,Rss):rnd:sat 383
vpmpyh
Rdd=vpmpyh(Rs,Rt) 538
Rxx^=vpmpyh(Rs,Rt) 539
vraddh
Rd=vraddh(Rss,Rtt) 376
vraddub
Rdd=vraddub(Rss,Rtt) 374
Rxx+=vraddub(Rss,Rtt) 374
vradduh
Rd=vradduh(Rss,Rtt) 376
vrcmpys
Rd=vrcmpys(Rss,Rt):<<1:rnd:sat 456
Rd=vrcmpys(Rss,Rtt):<<1:rnd:sat:raw:hi 456
Rd=vrcmpys(Rss,Rtt):<<1:rnd:sat:raw:lo 457
Rdd=vrcmpys(Rss,Rt):<<1:sat 453
Rdd=vrcmpys(Rss,Rtt):<<1:sat:raw:hi 453
Rdd=vrcmpys(Rss,Rtt):<<1:sat:raw:lo 454
Rxx+=vrcmpys(Rss,Rt):<<1:sat 454
Rxx+=vrcmpys(Rss,Rtt):<<1:sat:raw:hi 454
Rxx+=vrcmpys(Rss,Rtt):<<1:sat:raw:lo 454
vrcnegh
Rxx+=vrcnegh(Rss,Rt) 386
vrcrotate
Rdd=vrcrotate(Rss,Rt,#u2) 459
Rxx+=vrcrotate(Rss,Rt,#u2) 459
vrmaxh
Rxx=vrmaxh(Rss,Ru) 390
vrmaxuh
Rxx=vrmaxuh(Rss,Ru) 390
vrmaxuw
Rxx=vrmaxuw(Rss,Ru) 392
vrmaxw
Rxx=vrmaxw(Rss,Ru) 392
vrminh
Rxx=vrminh(Rss,Ru) 398
vrminuh
Rxx=vrminuh(Rss,Ru) 398
vrminuw
Rxx=vrminuw(Rss,Ru) 400
vrminw
Rxx=vrminw(Rss,Ru) 400
vrmpybsu
Rdd=vrmpybsu(Rss,Rtt) 522
Rxx+=vrmpybsu(Rss,Rtt) 522
vrmpybu
Rdd=vrmpybu(Rss,Rtt) 522
Rxx+=vrmpybu(Rss,Rtt) 522
vrmpyh
Rdd=vrmpyh(Rss,Rtt) 534
Rxx+=vrmpyh(Rss,Rtt) 534
vrmpyweh
Rdd=vrmpyweh(Rss,Rtt)[:<<1] 510
Rxx+=vrmpyweh(Rss,Rtt)[:<<1] 510
vrmpywoh
Rdd=vrmpywoh(Rss,Rtt)[:<<1] 510
Rxx+=vrmpywoh(Rss,Rtt)[:<<1] 510
vrndwh
Rd=vrndwh(Rss) 547
Rd=vrndwh(Rss):sat 547
vrsadub
Rdd=vrsadub(Rss,Rtt) 403
Rxx+=vrsadub(Rss,Rtt) 403
vsathb
Rd=vsathb(Rs) 550
Rd=vsathb(Rss) 550
Rdd=vsathb(Rss) 553
vsathub
Rd=vsathub(Rs) 550
Rd=vsathub(Rss) 550
Rdd=vsathub(Rss) 553
vsatwh
Rd=vsatwh(Rss) 550
Rdd=vsatwh(Rss) 553
vsatwuh
Rd=vsatwuh(Rss) 550
Rdd=vsatwuh(Rss) 553
vsplatb
Rd=vsplatb(Rs) 557
Rdd=vsplatb(Rs) 557
vsplath
Rdd=vsplath(Rs) 558
vspliceb
Rdd=vspliceb(Rss,Rtt,#u3) 559
Rdd=vspliceb(Rss,Rtt,Pu) 559
vsubb
Rdd=vsubb(Rss,Rtt) 407
vsubh
Rd=vsubh(Rt,Rs)[:sat] 170
Rdd=vsubh(Rtt,Rss)[:sat] 405
vsubub
Rdd=vsubub(Rtt,Rss)[:sat] 407
vsubuh
Rd=vsubuh(Rt,Rs):sat 170
Rdd=vsubuh(Rtt,Rss):sat 405
vsubw
Rdd=vsubw(Rtt,Rss)[:sat] 408
vsxtbh
Rdd=vsxtbh(Rs) 561
vsxthw
Rdd=vsxthw(Rs) 561
vtrunehb
Rd=vtrunehb(Rss) 563
Rdd=vtrunehb(Rss,Rtt) 563
vtrunewh
Rdd=vtrunewh(Rss,Rtt) 563
vtrunohb
Rd=vtrunohb(Rss) 563
Rdd=vtrunohb(Rss,Rtt) 563
vtrunowh
Rdd=vtrunowh(Rss,Rtt) 564
vxaddsubh
Rdd=vxaddsubh(Rss,Rtt):rnd:>>1:sat 429
Rdd=vxaddsubh(Rss,Rtt):sat 429
vxaddsubw
Rdd=vxaddsubw(Rss,Rtt):sat 431
vxsubaddh
Rdd=vxsubaddh(Rss,Rtt):rnd:>>1:sat 429
Rdd=vxsubaddh(Rss,Rtt):sat 429
vxsubaddw
Rdd=vxsubaddw(Rss,Rtt):sat 431
vzxtbh
Rdd=vzxtbh(Rs) 565
vzxthw
Rdd=vzxthw(Rs) 565
X
xor
if ([!]Pu[.new]) Rd=xor(Rs,Rt) 183
Pd=xor(Ps,Pt) 203
Rd=xor(Rs,Rt) 158
Rdd=xor(Rss,Rtt) 343
Rx[&|^]=xor(Rs,Rt) 346
Rxx^=xor(Rss,Rtt) 345
Z
zxtb
if ([!]Pu[.new]) Rd=zxtb(Rs) 189
Rd=zxtb(Rs) 171
zxth
if ([!]Pu[.new]) Rd=zxth(Rs) 189
Rd=zxth(Rs) 171
Intrinsics index
A
add
Rd=add(#u6,mpyi(Rs,#U6))
Word32 Q6_R_add_mpyi_IRI(Word32 Iu6, Word32 Rs, Word32 IU6) 486
Rd=add(#u6,mpyi(Rs,Rt))
Word32 Q6_R_add_mpyi_IRR(Word32 Iu6, Word32 Rs, Word32 Rt) 486
Rd=add(Rs,#s16)
Word32 Q6_R_add_RI(Word32 Rs, Word32 Is16) 156
Rd=add(Rs,add(Ru,#s6))
Word32 Q6_R_add_add_RRI(Word32 Rs, Word32 Ru, Word32 Is6) 335
Rd=add(Rs,Rt)
Word32 Q6_R_add_RR(Word32 Rs, Word32 Rt) 156
Rd=add(Rs,Rt):sat
Word32 Q6_R_add_RR_sat(Word32 Rs, Word32 Rt) 156
Rd=add(Rt.H,Rs.H):<<16
Word32 Q6_R_add_RhRh_s16(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.H,Rs.H):sat:<<16
Word32 Q6_R_add_RhRh_sat_s16(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.H,Rs.L):<<16
Word32 Q6_R_add_RhRl_s16(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.H,Rs.L):sat:<<16
Word32 Q6_R_add_RhRl_sat_s16(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.L,Rs.H)
Word32 Q6_R_add_RlRh(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.L,Rs.H):<<16
Word32 Q6_R_add_RlRh_s16(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.L,Rs.H):sat
Word32 Q6_R_add_RlRh_sat(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.L,Rs.H):sat:<<16
Word32 Q6_R_add_RlRh_sat_s16(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.L,Rs.L)
Word32 Q6_R_add_RlRl(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.L,Rs.L):<<16
Word32 Q6_R_add_RlRl_s16(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.L,Rs.L):sat
Word32 Q6_R_add_RlRl_sat(Word32 Rt, Word32 Rs) 340
Rd=add(Rt.L,Rs.L):sat:<<16
Word32 Q6_R_add_RlRl_sat_s16(Word32 Rt, Word32 Rs) 340
Rd=add(Ru,mpyi(#u6:2,Rs))
Word32 Q6_R_add_mpyi_RIR(Word32 Ru, Word32 Iu6_2, Word32 Rs) 486
Rd=add(Ru,mpyi(Rs,#u6))
Word32 Q6_R_add_mpyi_RRI(Word32 Ru, Word32 Rs, Word32 Iu6) 486
Rdd=add(Rs,Rtt)
addasl
Rd=addasl(Rt,Rs,#u3)
Word32 Q6_R_addasl_RRI(Word32 Rt, Word32 Rs, Word32 Iu3) 594
all8
Pd=all8(Ps)
Byte Q6_p_all8_p(Byte Ps) 197
and
Pd=and(Ps,and(Pt,!Pu))
Byte Q6_p_and_and_ppnp(Byte Ps, Byte Pt, Byte Pu) 203
Pd=and(Ps,and(Pt,Pu))
Byte Q6_p_and_and_ppp(Byte Ps, Byte Pt, Byte Pu) 203
Pd=and(Pt,!Ps)
Byte Q6_p_and_pnp(Byte Pt, Byte Ps) 203
Pd=and(Pt,Ps)
Byte Q6_p_and_pp(Byte Pt, Byte Ps) 203
Rd=and(Rs,#s10)
Word32 Q6_R_and_RI(Word32 Rs, Word32 Is10) 158
Rd=and(Rs,Rt)
Word32 Q6_R_and_RR(Word32 Rs, Word32 Rt) 158
Rd=and(Rt,~Rs)
Word32 Q6_R_and_RnR(Word32 Rt, Word32 Rs) 158
Rdd=and(Rss,Rtt)
Word64 Q6_P_and_PP(Word64 Rss, Word64 Rtt) 343
Rdd=and(Rtt,~Rss)
Word64 Q6_P_and_PnP(Word64 Rtt, Word64 Rss) 343
Rx&=and(Rs,~Rt)
Word32 Q6_R_andand_RnR(Word32 Rx, Word32 Rs, Word32 Rt) 347
Rx&=and(Rs,Rt)
Word32 Q6_R_andand_RR(Word32 Rx, Word32 Rs, Word32 Rt) 347
Rx^=and(Rs,~Rt)
Word32 Q6_R_andxacc_RnR(Word32 Rx, Word32 Rs, Word32 Rt) 347
Rx^=and(Rs,Rt)
Word32 Q6_R_andxacc_RR(Word32 Rx, Word32 Rs, Word32 Rt) 347
Rx|=and(Rs,#s10)
Word32 Q6_R_andor_RI(Word32 Rx, Word32 Rs, Word32 Is10) 347
Rx|=and(Rs,~Rt)
Word32 Q6_R_andor_RnR(Word32 Rx, Word32 Rs, Word32 Rt) 347
Rx|=and(Rs,Rt)
Word32 Q6_R_andor_RR(Word32 Rx, Word32 Rs, Word32 Rt) 347
any8
Pd=any8(Ps)
Byte Q6_p_any8_p(Byte Ps) 197
asl
Rd=asl(Rs,#u5)
Word32 Q6_R_asl_RI(Word32 Rs, Word32 Iu5) 589
Rd=asl(Rs,#u5):sat
Word32 Q6_R_asl_RI_sat(Word32 Rs, Word32 Iu5) 600
Rd=asl(Rs,Rt)
Word32 Q6_R_asl_RR(Word32 Rs, Word32 Rt) 602
Rd=asl(Rs,Rt):sat
Word32 Q6_R_asl_RR_sat(Word32 Rs, Word32 Rt) 610
Rdd=asl(Rss,#u6)
Word64 Q6_P_asl_PI(Word64 Rss, Word32 Iu6) 590
Rdd=asl(Rss,Rt)
Word64 Q6_P_asl_PR(Word64 Rss, Word32 Rt) 602
Rx&=asl(Rs,#u5)
Word32 Q6_R_asland_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rx&=asl(Rs,Rt)
Word32 Q6_R_asland_RR(Word32 Rx, Word32 Rs, Word32 Rt) 608
Rx^=asl(Rs,#u5)
Word32 Q6_R_aslxacc_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rx+=asl(Rs,#u5)
Word32 Q6_R_aslacc_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 592
Rx+=asl(Rs,Rt)
Word32 Q6_R_aslacc_RR(Word32 Rx, Word32 Rs, Word32 Rt) 605
Rx=add(#u8,asl(Rx,#U5))
Word32 Q6_R_add_asl_IRI(Word32 Iu8, Word32 Rx, Word32 IU5) 592
Rx=and(#u8,asl(Rx,#U5))
Word32 Q6_R_and_asl_IRI(Word32 Iu8, Word32 Rx, Word32 IU5) 596
Rx-=asl(Rs,#u5)
Word32 Q6_R_aslnac_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 592
Rx-=asl(Rs,Rt)
Word32 Q6_R_aslnac_RR(Word32 Rx, Word32 Rs, Word32 Rt) 605
Rx=or(#u8,asl(Rx,#U5))
Word32 Q6_R_or_asl_IRI(Word32 Iu8, Word32 Rx, Word32 IU5) 596
Rx=sub(#u8,asl(Rx,#U5))
Word32 Q6_R_sub_asl_IRI(Word32 Iu8, Word32 Rx, Word32 IU5) 592
Rx|=asl(Rs,#u5)
Word32 Q6_R_aslor_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rx|=asl(Rs,Rt)
Word32 Q6_R_aslor_RR(Word32 Rx, Word32 Rs, Word32 Rt) 608
Rxx&=asl(Rss,#u6)
Word64 Q6_P_asland_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 596
Rxx&=asl(Rss,Rt)
Word64 Q6_P_asland_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 608
Rxx^=asl(Rss,#u6)
Word64 Q6_P_aslxacc_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 596
Rxx^=asl(Rss,Rt)
Word64 Q6_P_aslxacc_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 608
Rxx+=asl(Rss,#u6)
Word64 Q6_P_aslacc_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 592
Rxx+=asl(Rss,Rt)
Word64 Q6_P_aslacc_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 605
Rxx-=asl(Rss,#u6)
Word64 Q6_P_aslnac_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 592
Rxx-=asl(Rss,Rt)
aslh
Rd=aslh(Rs)
Word32 Q6_R_aslh_R(Word32 Rs) 176
asr
Rd=asr(Rs,#u5)
Word32 Q6_R_asr_RI(Word32 Rs, Word32 Iu5) 589
Rd=asr(Rs,#u5):rnd
Word32 Q6_R_asr_RI_rnd(Word32 Rs, Word32 Iu5) 599
Rd=asr(Rs,Rt)
Word32 Q6_R_asr_RR(Word32 Rs, Word32 Rt) 602
Rd=asr(Rs,Rt):sat
Word32 Q6_R_asr_RR_sat(Word32 Rs, Word32 Rt) 610
Rdd=asr(Rss,#u6)
Word64 Q6_P_asr_PI(Word64 Rss, Word32 Iu6) 590
Rdd=asr(Rss,#u6):rnd
Word64 Q6_P_asr_PI_rnd(Word64 Rss, Word32 Iu6) 599
Rdd=asr(Rss,Rt)
Word64 Q6_P_asr_PR(Word64 Rss, Word32 Rt) 602
Rx&=asr(Rs,#u5)
Word32 Q6_R_asrand_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rx&=asr(Rs,Rt)
Word32 Q6_R_asrand_RR(Word32 Rx, Word32 Rs, Word32 Rt) 608
Rx+=asr(Rs,#u5)
Word32 Q6_R_asracc_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 592
Rx+=asr(Rs,Rt)
Word32 Q6_R_asracc_RR(Word32 Rx, Word32 Rs, Word32 Rt) 605
Rx-=asr(Rs,#u5)
Word32 Q6_R_asrnac_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 592
Rx-=asr(Rs,Rt)
Word32 Q6_R_asrnac_RR(Word32 Rx, Word32 Rs, Word32 Rt) 605
Rx|=asr(Rs,#u5)
Word32 Q6_R_asror_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rx|=asr(Rs,Rt)
Word32 Q6_R_asror_RR(Word32 Rx, Word32 Rs, Word32 Rt) 608
Rxx&=asr(Rss,#u6)
Word64 Q6_P_asrand_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 596
Rxx&=asr(Rss,Rt)
Word64 Q6_P_asrand_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 608
Rxx^=asr(Rss,Rt)
Word64 Q6_P_asrxacc_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 608
Rxx+=asr(Rss,#u6)
Word64 Q6_P_asracc_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 592
Rxx+=asr(Rss,Rt)
Word64 Q6_P_asracc_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 605
Rxx-=asr(Rss,#u6)
Word64 Q6_P_asrnac_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 592
Rxx-=asr(Rss,Rt)
Word64 Q6_P_asrnac_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 605
Rxx|=asr(Rss,#u6)
Word64 Q6_P_asror_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 597
Rxx|=asr(Rss,Rt)
Word64 Q6_P_asror_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 609
asrh
Rd=asrh(Rs)
Word32 Q6_R_asrh_R(Word32 Rs) 176
asrrnd
Rd=asrrnd(Rs,#u5)
Word32 Q6_R_asrrnd_RI(Word32 Rs, Word32 Iu5) 599
Rdd=asrrnd(Rss,#u6)
Word64 Q6_P_asrrnd_PI(Word64 Rss, Word32 Iu6) 599
B
bitsclr
Pd=!bitsclr(Rs,#u6)
Byte Q6_p_not_bitsclr_RI(Word32 Rs, Word32 Iu6) 574
Pd=!bitsclr(Rs,Rt)
Byte Q6_p_not_bitsclr_RR(Word32 Rs, Word32 Rt) 574
Pd=bitsclr(Rs,#u6)
Byte Q6_p_bitsclr_RI(Word32 Rs, Word32 Iu6) 574
Pd=bitsclr(Rs,Rt)
Byte Q6_p_bitsclr_RR(Word32 Rs, Word32 Rt) 574
bitsplit
Rdd=bitsplit(Rs,#u5)
Word64 Q6_P_bitsplit_RI(Word32 Rs, Word32 Iu5) 424
Rdd=bitsplit(Rs,Rt)
Word64 Q6_P_bitsplit_RR(Word32 Rs, Word32 Rt) 424
bitsset
Pd=!bitsset(Rs,Rt)
Byte Q6_p_not_bitsset_RR(Word32 Rs, Word32 Rt) 574
Pd=bitsset(Rs,Rt)
Byte Q6_p_bitsset_RR(Word32 Rs, Word32 Rt) 574
boundscheck
Pd=boundscheck(Rs,Rtt)
Byte Q6_p_boundscheck_RP(Word32 Rs, Word64 Rtt) 567
brev
Rd=brev(Rs)
Word32 Q6_R_brev_R(Word32 Rs) 421
Rdd=brev(Rss)
Word64 Q6_P_brev_P(Word64 Rss) 421
C
cl0
Rd=cl0(Rs)
Word32 Q6_R_cl0_R(Word32 Rs) 410
Rd=cl0(Rss)
Word32 Q6_R_cl0_P(Word64 Rss) 410
cl1
Rd=cl1(Rs)
Word32 Q6_R_cl1_R(Word32 Rs) 410
Rd=cl1(Rss)
Word32 Q6_R_cl1_P(Word64 Rss) 410
clb
Rd=add(clb(Rs),#s6)
clip
Rd=clip(Rs,#u5)
Word32 Q6_R_clip_RI(Word32 Rs, Word32 Iu5) 342
clrbit
Rd=clrbit(Rs,#u5)
Word32 Q6_R_clrbit_RI(Word32 Rs, Word32 Iu5) 422
Rd=clrbit(Rs,Rt)
Word32 Q6_R_clrbit_RR(Word32 Rs, Word32 Rt) 422
cmp.eq
Pd=!cmp.eq(Rs,#s10)
Byte Q6_p_not_cmp_eq_RI(Word32 Rs, Word32 Is10) 191
Pd=!cmp.eq(Rs,Rt)
Byte Q6_p_not_cmp_eq_RR(Word32 Rs, Word32 Rt) 191
Pd=cmp.eq(Rs,#s10)
Byte Q6_p_cmp_eq_RI(Word32 Rs, Word32 Is10) 191
Pd=cmp.eq(Rs,Rt)
Byte Q6_p_cmp_eq_RR(Word32 Rs, Word32 Rt) 191
Pd=cmp.eq(Rss,Rtt)
Byte Q6_p_cmp_eq_PP(Word64 Rss, Word64 Rtt) 573
Rd=!cmp.eq(Rs,#s8)
Word32 Q6_R_not_cmp_eq_RI(Word32 Rs, Word32 Is8) 193
Rd=!cmp.eq(Rs,Rt)
Word32 Q6_R_not_cmp_eq_RR(Word32 Rs, Word32 Rt) 193
Rd=cmp.eq(Rs,#s8)
Word32 Q6_R_cmp_eq_RI(Word32 Rs, Word32 Is8) 193
Rd=cmp.eq(Rs,Rt)
Word32 Q6_R_cmp_eq_RR(Word32 Rs, Word32 Rt) 193
cmp.ge
Pd=cmp.ge(Rs,#s8)
Byte Q6_p_cmp_ge_RI(Word32 Rs, Word32 Is8) 191
cmp.geu
Pd=cmp.geu(Rs,#u8)
Byte Q6_p_cmp_geu_RI(Word32 Rs, Word32 Iu8) 191
cmp.gt
Pd=!cmp.gt(Rs,#s10)
Byte Q6_p_not_cmp_gt_RI(Word32 Rs, Word32 Is10) 191
Pd=!cmp.gt(Rs,Rt)
Byte Q6_p_not_cmp_gt_RR(Word32 Rs, Word32 Rt) 191
Pd=cmp.gt(Rs,#s10)
Byte Q6_p_cmp_gt_RI(Word32 Rs, Word32 Is10) 191
Pd=cmp.gt(Rs,Rt)
Byte Q6_p_cmp_gt_RR(Word32 Rs, Word32 Rt) 192
Pd=cmp.gt(Rss,Rtt)
Byte Q6_p_cmp_gt_PP(Word64 Rss, Word64 Rtt) 573
cmp.gtu
Pd=!cmp.gtu(Rs,#u9)
Byte Q6_p_not_cmp_gtu_RI(Word32 Rs, Word32 Iu9) 191
Pd=!cmp.gtu(Rs,Rt)
Byte Q6_p_not_cmp_gtu_RR(Word32 Rs, Word32 Rt) 191
Pd=cmp.gtu(Rs,#u9)
Byte Q6_p_cmp_gtu_RI(Word32 Rs, Word32 Iu9) 192
Pd=cmp.gtu(Rs,Rt)
Byte Q6_p_cmp_gtu_RR(Word32 Rs, Word32 Rt) 192
Pd=cmp.gtu(Rss,Rtt)
Byte Q6_p_cmp_gtu_PP(Word64 Rss, Word64 Rtt) 573
cmp.lt
Pd=cmp.lt(Rs,Rt)
Byte Q6_p_cmp_lt_RR(Word32 Rs, Word32 Rt) 192
cmp.ltu
Pd=cmp.ltu(Rs,Rt)
Byte Q6_p_cmp_ltu_RR(Word32 Rs, Word32 Rt) 192
cmpb.eq
Pd=cmpb.eq(Rs,#u8)
Byte Q6_p_cmpb_eq_RI(Word32 Rs, Word32 Iu8) 569
Pd=cmpb.eq(Rs,Rt)
Byte Q6_p_cmpb_eq_RR(Word32 Rs, Word32 Rt) 569
cmpb.gt
Pd=cmpb.gt(Rs,#s8)
Byte Q6_p_cmpb_gt_RI(Word32 Rs, Word32 Is8) 569
Pd=cmpb.gt(Rs,Rt)
Byte Q6_p_cmpb_gt_RR(Word32 Rs, Word32 Rt) 569
cmpb.gtu
Pd=cmpb.gtu(Rs,#u7)
Byte Q6_p_cmpb_gtu_RI(Word32 Rs, Word32 Iu7) 569
Pd=cmpb.gtu(Rs,Rt)
Byte Q6_p_cmpb_gtu_RR(Word32 Rs, Word32 Rt) 569
cmph.eq
Pd=cmph.eq(Rs,#s8)
Byte Q6_p_cmph_eq_RI(Word32 Rs, Word32 Is8) 571
Pd=cmph.eq(Rs,Rt)
Byte Q6_p_cmph_eq_RR(Word32 Rs, Word32 Rt) 571
cmph.gt
Pd=cmph.gt(Rs,#s8)
Byte Q6_p_cmph_gt_RI(Word32 Rs, Word32 Is8) 571
Pd=cmph.gt(Rs,Rt)
Byte Q6_p_cmph_gt_RR(Word32 Rs, Word32 Rt) 571
cmph.gtu
Pd=cmph.gtu(Rs,#u7)
Byte Q6_p_cmph_gtu_RI(Word32 Rs, Word32 Iu7) 571
Pd=cmph.gtu(Rs,Rt)
Byte Q6_p_cmph_gtu_RR(Word32 Rs, Word32 Rt) 571
cmpy
Rd=cmpy(Rs,Rt):<<1:rnd:sat
cmpyi
Rdd=cmpyi(Rs,Rt)
Word64 Q6_P_cmpyi_RR(Word32 Rs, Word32 Rt) 438
Rxx+=cmpyi(Rs,Rt)
Word64 Q6_P_cmpyiacc_RR(Word64 Rxx, Word32 Rs, Word32 Rt) 438
cmpyiw
Rd=cmpyiw(Rss,Rtt):<<1:rnd:sat
Word32 Q6_R_cmpyiw_PP_s1_rnd_sat(Word64 Rss, Word64 Rtt) 445
Rd=cmpyiw(Rss,Rtt):<<1:sat
Word32 Q6_R_cmpyiw_PP_s1_sat(Word64 Rss, Word64 Rtt) 445
Rd=cmpyiw(Rss,Rtt*):<<1:rnd:sat
Word32 Q6_R_cmpyiw_PP_conj_s1_rnd_sat(Word64 Rss, Word64 Rtt) 445
Rd=cmpyiw(Rss,Rtt*):<<1:sat
Word32 Q6_R_cmpyiw_PP_conj_s1_sat(Word64 Rss, Word64 Rtt) 445
Rdd=cmpyiw(Rss,Rtt)
Word64 Q6_P_cmpyiw_PP(Word64 Rss, Word64 Rtt) 445
Rdd=cmpyiw(Rss,Rtt*)
Word64 Q6_P_cmpyiw_PP_conj(Word64 Rss, Word64 Rtt) 445
Rxx+=cmpyiw(Rss,Rtt)
Word64 Q6_P_cmpyiwacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 445
Rxx+=cmpyiw(Rss,Rtt*)
Word64 Q6_P_cmpyiwacc_PP_conj(Word64 Rxx, Word64 Rss, Word64 Rtt) 445
cmpyiwh
Rd=cmpyiwh(Rss,Rt):<<1:rnd:sat
cmpyr
Rdd=cmpyr(Rs,Rt)
Word64 Q6_P_cmpyr_RR(Word32 Rs, Word32 Rt) 438
Rxx+=cmpyr(Rs,Rt)
Word64 Q6_P_cmpyracc_RR(Word64 Rxx, Word32 Rs, Word32 Rt) 438
cmpyrw
Rd=cmpyrw(Rss,Rtt):<<1:rnd:sat
Word32 Q6_R_cmpyrw_PP_s1_rnd_sat(Word64 Rss, Word64 Rtt) 445
Rd=cmpyrw(Rss,Rtt):<<1:sat
Word32 Q6_R_cmpyrw_PP_s1_sat(Word64 Rss, Word64 Rtt) 445
Rd=cmpyrw(Rss,Rtt*):<<1:rnd:sat
Word32 Q6_R_cmpyrw_PP_conj_s1_rnd_sat(Word64 Rss, Word64 Rtt) 445
Rd=cmpyrw(Rss,Rtt*):<<1:sat
Word32 Q6_R_cmpyrw_PP_conj_s1_sat(Word64 Rss, Word64 Rtt) 445
Rdd=cmpyrw(Rss,Rtt)
Word64 Q6_P_cmpyrw_PP(Word64 Rss, Word64 Rtt) 445
Rdd=cmpyrw(Rss,Rtt*)
Word64 Q6_P_cmpyrw_PP_conj(Word64 Rss, Word64 Rtt) 445
Rxx+=cmpyrw(Rss,Rtt)
Word64 Q6_P_cmpyrwacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 445
Rxx+=cmpyrw(Rss,Rtt*)
Word64 Q6_P_cmpyrwacc_PP_conj(Word64 Rxx, Word64 Rss, Word64 Rtt) 445
cmpyrwh
Rd=cmpyrwh(Rss,Rt):<<1:rnd:sat
Word32 Q6_R_cmpyrwh_PR_s1_rnd_sat(Word64 Rss, Word32 Rt) 442
Rd=cmpyrwh(Rss,Rt*):<<1:rnd:sat
Word32 Q6_R_cmpyrwh_PR_conj_s1_rnd_sat(Word64 Rss, Word32 Rt) 442
combine
Rd=combine(Rt.H,Rs.H)
Word32 Q6_R_combine_RhRh(Word32 Rt, Word32 Rs) 173
Rd=combine(Rt.H,Rs.L)
Word32 Q6_R_combine_RhRl(Word32 Rt, Word32 Rs) 173
Rd=combine(Rt.L,Rs.H)
Word32 Q6_R_combine_RlRh(Word32 Rt, Word32 Rs) 173
Rd=combine(Rt.L,Rs.L)
Word32 Q6_R_combine_RlRl(Word32 Rt, Word32 Rs) 173
Rdd=combine(#s8,#S8)
Word64 Q6_P_combine_II(Word32 Is8, Word32 IS8) 173
Rdd=combine(#s8,Rs)
Word64 Q6_P_combine_IR(Word32 Is8, Word32 Rs) 173
Rdd=combine(Rs,#s8)
Word64 Q6_P_combine_RI(Word32 Rs, Word32 Is8) 173
Rdd=combine(Rs,Rt)
Word64 Q6_P_combine_RR(Word32 Rs, Word32 Rt) 173
convert_d2df
Rdd=convert_d2df(Rss)
Word64 Q6_P_convert_d2df_P(Word64 Rss) 467
convert_d2sf
Rd=convert_d2sf(Rss)
Word32 Q6_R_convert_d2sf_P(Word64 Rss) 467
convert_df2d
Rdd=convert_df2d(Rss)
Word64 Q6_P_convert_df2d_P(Word64 Rss) 470
Rdd=convert_df2d(Rss):chop
Word64 Q6_P_convert_df2d_P_chop(Word64 Rss) 470
convert_df2sf
Rd=convert_df2sf(Rss)
Word32 Q6_R_convert_df2sf_P(Word64 Rss) 466
convert_df2ud
Rdd=convert_df2ud(Rss)
Word64 Q6_P_convert_df2ud_P(Word64 Rss) 470
Rdd=convert_df2ud(Rss):chop
Word64 Q6_P_convert_df2ud_P_chop(Word64 Rss) 470
convert_df2uw
Rd=convert_df2uw(Rss)
Word32 Q6_R_convert_df2uw_P(Word64 Rss) 470
Rd=convert_df2uw(Rss):chop
Word32 Q6_R_convert_df2uw_P_chop(Word64 Rss) 470
convert_df2w
Rd=convert_df2w(Rss)
Word32 Q6_R_convert_df2w_P(Word64 Rss) 470
Rd=convert_df2w(Rss):chop
Word32 Q6_R_convert_df2w_P_chop(Word64 Rss) 470
convert_sf2d
Rdd=convert_sf2d(Rs)
Word64 Q6_P_convert_sf2d_R(Word32 Rs) 470
Rdd=convert_sf2d(Rs):chop
Word64 Q6_P_convert_sf2d_R_chop(Word32 Rs) 470
convert_sf2df
Rdd=convert_sf2df(Rs)
Word64 Q6_P_convert_sf2df_R(Word32 Rs) 466
convert_sf2ud
Rdd=convert_sf2ud(Rs)
Word64 Q6_P_convert_sf2ud_R(Word32 Rs) 470
Rdd=convert_sf2ud(Rs):chop
Word64 Q6_P_convert_sf2ud_R_chop(Word32 Rs) 470
convert_sf2uw
Rd=convert_sf2uw(Rs)
Word32 Q6_R_convert_sf2uw_R(Word32 Rs) 470
Rd=convert_sf2uw(Rs):chop
Word32 Q6_R_convert_sf2uw_R_chop(Word32 Rs) 470
convert_sf2w
Rd=convert_sf2w(Rs)
Word32 Q6_R_convert_sf2w_R(Word32 Rs) 470
Rd=convert_sf2w(Rs):chop
Word32 Q6_R_convert_sf2w_R_chop(Word32 Rs) 470
convert_ud2df
Rdd=convert_ud2df(Rss)
Word64 Q6_P_convert_ud2df_P(Word64 Rss) 467
convert_ud2sf
Rd=convert_ud2sf(Rss)
Word32 Q6_R_convert_ud2sf_P(Word64 Rss) 467
convert_uw2df
Rdd=convert_uw2df(Rs)
Word64 Q6_P_convert_uw2df_R(Word32 Rs) 467
convert_uw2sf
Rd=convert_uw2sf(Rs)
Word32 Q6_R_convert_uw2sf_R(Word32 Rs) 467
convert_w2df
Rdd=convert_w2df(Rs)
Word64 Q6_P_convert_w2df_R(Word32 Rs) 467
convert_w2sf
Rd=convert_w2sf(Rs)
Word32 Q6_R_convert_w2sf_R(Word32 Rs) 467
cround
Rd=cround(Rs,#u5)
Word32 Q6_R_cround_RI(Word32 Rs, Word32 Iu5) 357
Rd=cround(Rs,Rt)
Word32 Q6_R_cround_RR(Word32 Rs, Word32 Rt) 357
Rdd=cround(Rss,#u6)
Word64 Q6_P_cround_PI(Word64 Rss, Word32 Iu6) 357
Rdd=cround(Rss,Rt)
Word64 Q6_P_cround_PR(Word64 Rss, Word32 Rt) 357
ct0
Rd=ct0(Rs)
Word32 Q6_R_ct0_R(Word32 Rs) 412
Rd=ct0(Rss)
Word32 Q6_R_ct0_P(Word64 Rss) 412
ct1
Rd=ct1(Rs)
Word32 Q6_R_ct1_R(Word32 Rs) 412
Rd=ct1(Rss)
Word32 Q6_R_ct1_P(Word64 Rss) 412
D
dccleana
dccleana(Rs)
void Q6_dccleana_A(Address a) 320
dccleaninva
dccleaninva(Rs)
void Q6_dccleaninva_A(Address a) 320
dcfetch
dcfetch(Rs)
void Q6_dcfetch_A(Address a) 319
dcinva
dcinva(Rs)
void Q6_dcinva_A(Address a) 320
dczeroa
dczeroa(Rs)
void Q6_dczeroa_A(Address a) 316
deinterleave
Rdd=deinterleave(Rss)
Word64 Q6_P_deinterleave_P(Word64 Rss) 418
dfadd
Rdd=dfadd(Rss,Rtt)
Word64 Q6_P_dfadd_PP(Word64 Rss, Word64 Rtt) 461
dfclass
Pd=dfclass(Rss,#u5)
Byte Q6_p_dfclass_PI(Word64 Rss, Word32 Iu5) 462
dfcmp.eq
Pd=dfcmp.eq(Rss,Rtt)
Byte Q6_p_dfcmp_eq_PP(Word64 Rss, Word64 Rtt) 464
dfcmp.ge
Pd=dfcmp.ge(Rss,Rtt)
Byte Q6_p_dfcmp_ge_PP(Word64 Rss, Word64 Rtt) 464
dfcmp.gt
Pd=dfcmp.gt(Rss,Rtt)
Byte Q6_p_dfcmp_gt_PP(Word64 Rss, Word64 Rtt) 464
dfcmp.uo
Pd=dfcmp.uo(Rss,Rtt)
Byte Q6_p_dfcmp_uo_PP(Word64 Rss, Word64 Rtt) 464
dfmake
Rdd=dfmake(#u10):neg
Word64 Q6_P_dfmake_I_neg(Word32 Iu10) 478
Rdd=dfmake(#u10):pos
Word64 Q6_P_dfmake_I_pos(Word32 Iu10) 478
dfmax
Rdd=dfmax(Rss,Rtt)
Word64 Q6_P_dfmax_PP(Word64 Rss, Word64 Rtt) 479
dfmin
Rdd=dfmin(Rss,Rtt)
Word64 Q6_P_dfmin_PP(Word64 Rss, Word64 Rtt) 480
dfmpyfix
Rdd=dfmpyfix(Rss,Rtt)
Word64 Q6_P_dfmpyfix_PP(Word64 Rss, Word64 Rtt) 481
dfmpyhh
Rxx+=dfmpyhh(Rss,Rtt)
Word64 Q6_P_dfmpyhhacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 473
dfmpylh
Rxx+=dfmpylh(Rss,Rtt)
Word64 Q6_P_dfmpylhacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 473
dfmpyll
Rdd=dfmpyll(Rss,Rtt)
dfsub
Rdd=dfsub(Rss,Rtt)
Word64 Q6_P_dfsub_PP(Word64 Rss, Word64 Rtt) 483
dmsyncht
Rd=dmsyncht
Word32 Q6_R_dmsyncht() 329
E
extract
Rd=extract(Rs,#u5,#U5)
Word32 Q6_R_extract_RII(Word32 Rs, Word32 Iu5, Word32 IU5) 414
Rd=extract(Rs,Rtt)
Word32 Q6_R_extract_RP(Word32 Rs, Word64 Rtt) 414
Rdd=extract(Rss,#u6,#U6)
Word64 Q6_P_extract_PII(Word64 Rss, Word32 Iu6, Word32 IU6) 414
Rdd=extract(Rss,Rtt)
Word64 Q6_P_extract_PP(Word64 Rss, Word64 Rtt) 414
extractu
Rd=extractu(Rs,#u5,#U5)
Word32 Q6_R_extractu_RII(Word32 Rs, Word32 Iu5, Word32 IU5) 414
Rd=extractu(Rs,Rtt)
Word32 Q6_R_extractu_RP(Word32 Rs, Word64 Rtt) 414
Rdd=extractu(Rss,#u6,#U6)
Word64 Q6_P_extractu_PII(Word64 Rss, Word32 Iu6, Word32 IU6) 414
Rdd=extractu(Rss,Rtt)
Word64 Q6_P_extractu_PP(Word64 Rss, Word64 Rtt) 414
F
fastcorner9
Pd=!fastcorner9(Ps,Pt)
Byte Q6_p_not_fastcorner9_pp(Byte Ps, Byte Pt) 196
Pd=fastcorner9(Ps,Pt)
Byte Q6_p_fastcorner9_pp(Byte Ps, Byte Pt) 196
I
insert
Rx=insert(Rs,#u5,#U5)
Word32 Q6_R_insert_RII(Word32 Rx, Word32 Rs, Word32 Iu5, Word32 IU5) 417
Rx=insert(Rs,Rtt)
Word32 Q6_R_insert_RP(Word32 Rx, Word32 Rs, Word64 Rtt) 417
Rxx=insert(Rss,#u6,#U6)
Word64 Q6_P_insert_PII(Word64 Rxx, Word64 Rss, Word32 Iu6, Word32 IU6) 417
Rxx=insert(Rss,Rtt)
Word64 Q6_P_insert_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 417
interleave
Rdd=interleave(Rss)
Word64 Q6_P_interleave_P(Word64 Rss) 418
L
l2fetch
l2fetch(Rs,Rt)
lfs
Rdd=lfs(Rss,Rtt)
Word64 Q6_P_lfs_PP(Word64 Rss, Word64 Rtt) 419
lsl
Rd=lsl(#s6,Rt)
Word32 Q6_R_lsl_IR(Word32 Is6, Word32 Rt) 602
Rd=lsl(Rs,Rt)
Word32 Q6_R_lsl_RR(Word32 Rs, Word32 Rt) 602
Rdd=lsl(Rss,Rt)
Word64 Q6_P_lsl_PR(Word64 Rss, Word32 Rt) 602
Rx&=lsl(Rs,Rt)
Word32 Q6_R_lsland_RR(Word32 Rx, Word32 Rs, Word32 Rt) 608
Rx+=lsl(Rs,Rt)
Word32 Q6_R_lslacc_RR(Word32 Rx, Word32 Rs, Word32 Rt) 605
Rx-=lsl(Rs,Rt)
Word32 Q6_R_lslnac_RR(Word32 Rx, Word32 Rs, Word32 Rt) 605
Rx|=lsl(Rs,Rt)
Word32 Q6_R_lslor_RR(Word32 Rx, Word32 Rs, Word32 Rt) 608
Rxx&=lsl(Rss,Rt)
Word64 Q6_P_lsland_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 608
Rxx^=lsl(Rss,Rt)
Word64 Q6_P_lslxacc_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 608
Rxx+=lsl(Rss,Rt)
Word64 Q6_P_lslacc_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 605
Rxx-=lsl(Rss,Rt)
Word64 Q6_P_lslnac_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 605
Rxx|=lsl(Rss,Rt)
Word64 Q6_P_lslor_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 609
lsr
Rd=lsr(Rs,#u5)
Word32 Q6_R_lsr_RI(Word32 Rs, Word32 Iu5) 589
Rd=lsr(Rs,Rt)
Word32 Q6_R_lsr_RR(Word32 Rs, Word32 Rt) 602
Rdd=lsr(Rss,#u6)
Word64 Q6_P_lsr_PI(Word64 Rss, Word32 Iu6) 590
Rdd=lsr(Rss,Rt)
Word64 Q6_P_lsr_PR(Word64 Rss, Word32 Rt) 602
Rx&=lsr(Rs,#u5)
Word32 Q6_R_lsrand_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rx&=lsr(Rs,Rt)
Word32 Q6_R_lsrand_RR(Word32 Rx, Word32 Rs, Word32 Rt) 608
Rx^=lsr(Rs,#u5)
Word32 Q6_R_lsrxacc_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rx+=lsr(Rs,#u5)
Word32 Q6_R_lsracc_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 592
Rx+=lsr(Rs,Rt)
Word32 Q6_R_lsracc_RR(Word32 Rx, Word32 Rs, Word32 Rt) 605
Rx=add(#u8,lsr(Rx,#U5))
Word32 Q6_R_add_lsr_IRI(Word32 Iu8, Word32 Rx, Word32 IU5) 592
Rx=and(#u8,lsr(Rx,#U5))
Word32 Q6_R_and_lsr_IRI(Word32 Iu8, Word32 Rx, Word32 IU5) 596
Rx-=lsr(Rs,#u5)
M
mask
Rd=mask(#u5,#U5)
Word32 Q6_R_mask_II(Word32 Iu5, Word32 IU5) 588
Rdd=mask(Pt)
Word64 Q6_P_mask_p(Byte Pt) 575
max
Rd=max(Rs,Rt)
Word32 Q6_R_max_RR(Word32 Rs, Word32 Rt) 349
Rdd=max(Rss,Rtt)
Word64 Q6_P_max_PP(Word64 Rss, Word64 Rtt) 350
maxu
Rd=maxu(Rs,Rt)
UWord32 Q6_R_maxu_RR(Word32 Rs, Word32 Rt) 349
Rdd=maxu(Rss,Rtt)
UWord64 Q6_P_maxu_PP(Word64 Rss, Word64 Rtt) 350
memb
memb(Rx++#s4:0:circ(Mu))=Rt
void Q6_memb_IMR_circ(void** StartAddress, Word32 Is4_0, Word32 Mu, Word32 Rt, void* BaseAddress) 293
memb(Rx++I:circ(Mu))=Rt
void Q6_memb_MR_circ(void** StartAddress, Word32 Mu, Word32 Rt, void* BaseAddress) 293
Rd=memb(Rx++#s4:0:circ(Mu))
Word32 Q6_R_memb_IM_circ(void** StartAddress, Word32 Is4_0, Word32 Mu, void* BaseAddress) 227
Rd=memb(Rx++I:circ(Mu))
Word32 Q6_R_memb_M_circ(void** StartAddress, Word32 Mu, void* BaseAddress) 227
memd
memd(Rx++#s4:3:circ(Mu))=Rtt
void Q6_memd_IMP_circ(void** StartAddress, Word32 Is4_3, Word32 Mu, Word64 Rtt, void* BaseAddress) 289
memd(Rx++I:circ(Mu))=Rtt
void Q6_memd_MP_circ(void** StartAddress, Word32 Mu, Word64 Rtt, void* BaseAddress) 289
Rdd=memd(Rx++#s4:3:circ(Mu))
Word32 Q6_R_memd_IM_circ(void** StartAddress, Word32 Is4_3, Word32 Mu, void* BaseAddress) 223
Rdd=memd(Rx++I:circ(Mu))
Word32 Q6_R_memd_M_circ(void** StartAddress, Word32 Mu, void* BaseAddress) 223
memh
memh(Rx++#s4:1:circ(Mu))=Rt
void Q6_memh_IMR_circ(void** StartAddress, Word32 Is4_1, Word32 Mu, Word32 Rt, void* BaseAddress) 299
memh(Rx++#s4:1:circ(Mu))=Rt.H
void Q6_memh_IMRh_circ(void** StartAddress, Word32 Is4_1, Word32 Mu, Word32 Rt, void* BaseAddress) 299
memh(Rx++I:circ(Mu))=Rt
void Q6_memh_MR_circ(void** StartAddress, Word32 Mu, Word32 Rt, void* BaseAddress) 299
memh(Rx++I:circ(Mu))=Rt.H
void Q6_memh_MRh_circ(void** StartAddress, Word32 Mu, Word32 Rt, void* BaseAddress) 299
Rd=memh(Rx++#s4:1:circ(Mu))
Word32 Q6_R_memh_IM_circ(void** StartAddress, Word32 Is4_1, Word32 Mu, void* BaseAddress) 237
Rd=memh(Rx++I:circ(Mu))
Word32 Q6_R_memh_M_circ(void** StartAddress, Word32 Mu, void* BaseAddress) 237
memub
Rd=memub(Rx++#s4:0:circ(Mu))
Word32 Q6_R_memub_IM_circ(void** StartAddress, Word32 Is4_0, Word32 Mu, void* BaseAddress) 243
Rd=memub(Rx++I:circ(Mu))
Word32 Q6_R_memub_M_circ(void** StartAddress, Word32 Mu, void* BaseAddress) 243
memuh
Rd=memuh(Rx++#s4:1:circ(Mu))
Word32 Q6_R_memuh_IM_circ(void** StartAddress, Word32 Is4_1, Word32 Mu, void* BaseAddress) 247
Rd=memuh(Rx++I:circ(Mu))
Word32 Q6_R_memuh_M_circ(void** StartAddress, Word32 Mu, void* BaseAddress) 247
memw
memw(Rx++#s4:2:circ(Mu))=Rt
void Q6_memw_IMR_circ(void** StartAddress, Word32 Is4_2, Word32 Mu, Word32 Rt, void* BaseAddress) 306
memw(Rx++I:circ(Mu))=Rt
void Q6_memw_MR_circ(void** StartAddress, Word32 Mu, Word32 Rt, void* BaseAddress) 306
Rd=memw(Rx++#s4:2:circ(Mu))
Word32 Q6_R_memw_IM_circ(void** StartAddress, Word32 Is4_2, Word32 Mu, void* BaseAddress) 251
Rd=memw(Rx++I:circ(Mu))
Word32 Q6_R_memw_M_circ(void** StartAddress, Word32 Mu, void* BaseAddress) 251
min
Rd=min(Rt,Rs)
Word32 Q6_R_min_RR(Word32 Rt, Word32 Rs) 351
Rdd=min(Rtt,Rss)
Word64 Q6_P_min_PP(Word64 Rtt, Word64 Rss) 352
minu
Rd=minu(Rt,Rs)
UWord32 Q6_R_minu_RR(Word32 Rt, Word32 Rs) 351
Rdd=minu(Rtt,Rss)
UWord64 Q6_P_minu_PP(Word64 Rtt, Word64 Rss) 352
modwrap
Rd=modwrap(Rs,Rt)
Word32 Q6_R_modwrap_RR(Word32 Rs, Word32 Rt) 353
mpy
Rd=mpy(Rs,Rt.H):<<1:rnd:sat
Word32 Q6_R_mpy_RRh_s1_rnd_sat(Word32 Rs, Word32 Rt) 513
Rd=mpy(Rs,Rt.H):<<1:sat
Word32 Q6_R_mpy_RRh_s1_sat(Word32 Rs, Word32 Rt) 513
Rd=mpy(Rs,Rt.L):<<1:rnd:sat
Word32 Q6_R_mpy_RRl_s1_rnd_sat(Word32 Rs, Word32 Rt) 513
Rd=mpy(Rs,Rt.L):<<1:sat
Word32 Q6_R_mpy_RRl_s1_sat(Word32 Rs, Word32 Rt) 513
Rd=mpy(Rs,Rt)
Word32 Q6_R_mpy_RR(Word32 Rs, Word32 Rt) 513
Rd=mpy(Rs,Rt):<<1
Word32 Q6_R_mpy_RR_s1(Word32 Rs, Word32 Rt) 513
Rd=mpy(Rs,Rt):<<1:sat
Word32 Q6_R_mpy_RR_s1_sat(Word32 Rs, Word32 Rt) 513
Rd=mpy(Rs,Rt):rnd
Word32 Q6_R_mpy_RR_rnd(Word32 Rs, Word32 Rt) 513
Rd=mpy(Rs.H,Rt.H)
Word32 Q6_R_mpy_RhRh(Word32 Rs, Word32 Rt) 497
Rd=mpy(Rs.H,Rt.H):<<1
Word32 Q6_R_mpy_RhRh_s1(Word32 Rs, Word32 Rt) 497
Rd=mpy(Rs.H,Rt.H):<<1:rnd
Word32 Q6_R_mpy_RhRh_s1_rnd(Word32 Rs, Word32 Rt) 497
Rd=mpy(Rs.H,Rt.H):<<1:rnd:sat
Word32 Q6_R_mpy_RhRh_s1_rnd_sat(Word32 Rs, Word32 Rt) 497
Rd=mpy(Rs.H,Rt.H):<<1:sat
Word32 Q6_R_mpy_RhRh_s1_sat(Word32 Rs, Word32 Rt) 497
Rd=mpy(Rs.H,Rt.H):rnd
mpyi
Rd=mpyi(Rs,#m9)
Word32 Q6_R_mpyi_RI(Word32 Rs, Word32 Im9) 486
Rd=mpyi(Rs,Rt)
Word32 Q6_R_mpyi_RR(Word32 Rs, Word32 Rt) 486
Rx+=mpyi(Rs,#u8)
Word32 Q6_R_mpyiacc_RI(Word32 Rx, Word32 Rs, Word32 Iu8) 486
Rx+=mpyi(Rs,Rt)
Word32 Q6_R_mpyiacc_RR(Word32 Rx, Word32 Rs, Word32 Rt) 486
Rx-=mpyi(Rs,#u8)
Word32 Q6_R_mpyinac_RI(Word32 Rx, Word32 Rs, Word32 Iu8) 486
Rx-=mpyi(Rs,Rt)
Word32 Q6_R_mpyinac_RR(Word32 Rx, Word32 Rs, Word32 Rt) 486
mpysu
Rd=mpysu(Rs,Rt)
Word32 Q6_R_mpysu_RR(Word32 Rs, Word32 Rt) 513
mpyu
Rd=mpyu(Rs,Rt)
UWord32 Q6_R_mpyu_RR(Word32 Rs, Word32 Rt) 513
Rd=mpyu(Rs.H,Rt.H)
UWord32 Q6_R_mpyu_RhRh(Word32 Rs, Word32 Rt) 504
Rd=mpyu(Rs.H,Rt.H):<<1
UWord32 Q6_R_mpyu_RhRh_s1(Word32 Rs, Word32 Rt) 504
Rd=mpyu(Rs.H,Rt.L)
UWord32 Q6_R_mpyu_RhRl(Word32 Rs, Word32 Rt) 504
Rd=mpyu(Rs.H,Rt.L):<<1
UWord32 Q6_R_mpyu_RhRl_s1(Word32 Rs, Word32 Rt) 504
Rd=mpyu(Rs.L,Rt.H)
UWord32 Q6_R_mpyu_RlRh(Word32 Rs, Word32 Rt) 504
Rd=mpyu(Rs.L,Rt.H):<<1
UWord32 Q6_R_mpyu_RlRh_s1(Word32 Rs, Word32 Rt) 504
Rd=mpyu(Rs.L,Rt.L)
UWord32 Q6_R_mpyu_RlRl(Word32 Rs, Word32 Rt) 504
Rd=mpyu(Rs.L,Rt.L):<<1
UWord32 Q6_R_mpyu_RlRl_s1(Word32 Rs, Word32 Rt) 504
Rdd=mpyu(Rs,Rt)
UWord64 Q6_P_mpyu_RR(Word32 Rs, Word32 Rt) 515
Rdd=mpyu(Rs.H,Rt.H)
mpyui
Rd=mpyui(Rs,Rt)
Word32 Q6_R_mpyui_RR(Word32 Rs, Word32 Rt) 486
mux
Rd=mux(Pu,#s8,#S8)
Word32 Q6_R_mux_pII(Byte Pu, Word32 Is8, Word32 IS8) 174
Rd=mux(Pu,#s8,Rs)
Word32 Q6_R_mux_pIR(Byte Pu, Word32 Is8, Word32 Rs) 174
Rd=mux(Pu,Rs,#s8)
Word32 Q6_R_mux_pRI(Byte Pu, Word32 Rs, Word32 Is8) 174
Rd=mux(Pu,Rs,Rt)
Word32 Q6_R_mux_pRR(Byte Pu, Word32 Rs, Word32 Rt) 174
N
neg
Rd=neg(Rs)
Word32 Q6_R_neg_R(Word32 Rs) 160
Rd=neg(Rs):sat
Word32 Q6_R_neg_R_sat(Word32 Rs) 354
Rdd=neg(Rss)
Word64 Q6_P_neg_P(Word64 Rss) 354
no mnemonic
Pd=Ps
Byte Q6_p_equals_p(Byte Ps) 203
Pd=Rs
Byte Q6_p_equals_R(Word32 Rs) 577
Rd=#s16
Word32 Q6_R_equals_I(Word32 Is16) 165
Rd=Ps
normamt
Rd=normamt(Rs)
Word32 Q6_R_normamt_R(Word32 Rs) 410
Rd=normamt(Rss)
Word32 Q6_R_normamt_P(Word64 Rss) 410
not
Pd=not(Ps)
Byte Q6_p_not_p(Byte Ps) 203
Rd=not(Rs)
Word32 Q6_R_not_R(Word32 Rs) 158
Rdd=not(Rss)
Word64 Q6_P_not_P(Word64 Rss) 343
O
or
Pd=and(Ps,or(Pt,!Pu))
Byte Q6_p_and_or_ppnp(Byte Ps, Byte Pt, Byte Pu) 203
Pd=and(Ps,or(Pt,Pu))
Byte Q6_p_and_or_ppp(Byte Ps, Byte Pt, Byte Pu) 203
Pd=or(Ps,and(Pt,!Pu))
Byte Q6_p_or_and_ppnp(Byte Ps, Byte Pt, Byte Pu) 203
Pd=or(Ps,and(Pt,Pu))
Byte Q6_p_or_and_ppp(Byte Ps, Byte Pt, Byte Pu) 203
Pd=or(Ps,or(Pt,!Pu))
Byte Q6_p_or_or_ppnp(Byte Ps, Byte Pt, Byte Pu) 204
Pd=or(Ps,or(Pt,Pu))
Byte Q6_p_or_or_ppp(Byte Ps, Byte Pt, Byte Pu) 204
Pd=or(Pt,!Ps)
Byte Q6_p_or_pnp(Byte Pt, Byte Ps) 204
Pd=or(Pt,Ps)
Byte Q6_p_or_pp(Byte Pt, Byte Ps) 204
Rd=or(Rs,#s10)
Word32 Q6_R_or_RI(Word32 Rs, Word32 Is10) 158
Rd=or(Rs,Rt)
Word32 Q6_R_or_RR(Word32 Rs, Word32 Rt) 158
Rd=or(Rt,~Rs)
Word32 Q6_R_or_RnR(Word32 Rt, Word32 Rs) 158
Rdd=or(Rss,Rtt)
Word64 Q6_P_or_PP(Word64 Rss, Word64 Rtt) 343
Rdd=or(Rtt,~Rss)
Word64 Q6_P_or_PnP(Word64 Rtt, Word64 Rss) 343
Rx&=or(Rs,Rt)
Word32 Q6_R_orand_RR(Word32 Rx, Word32 Rs, Word32 Rt) 347
Rx^=or(Rs,Rt)
P
packhl
Rdd=packhl(Rs,Rt)
Word64 Q6_P_packhl_RR(Word32 Rs, Word32 Rt) 177
parity
Rd=parity(Rs,Rt)
Word32 Q6_R_parity_RR(Word32 Rs, Word32 Rt) 420
Rd=parity(Rss,Rtt)
Word32 Q6_R_parity_PP(Word64 Rss, Word64 Rtt) 420
pmpyw
Rdd=pmpyw(Rs,Rt)
Word64 Q6_P_pmpyw_RR(Word32 Rs, Word32 Rt) 509
Rxx^=pmpyw(Rs,Rt)
Word64 Q6_P_pmpywxacc_RR(Word64 Rxx, Word32 Rs, Word32 Rt) 509
popcount
Rd=popcount(Rss)
Word32 Q6_R_popcount_P(Word64 Rss) 411
R
rol
Rd=rol(Rs,#u5)
Word32 Q6_R_rol_RI(Word32 Rs, Word32 Iu5) 589
Rdd=rol(Rss,#u6)
Word64 Q6_P_rol_PI(Word64 Rss, Word32 Iu6) 590
Rx&=rol(Rs,#u5)
Word32 Q6_R_roland_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rx^=rol(Rs,#u5)
Word32 Q6_R_rolxacc_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rx+=rol(Rs,#u5)
Word32 Q6_R_rolacc_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 592
Rx-=rol(Rs,#u5)
Word32 Q6_R_rolnac_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 592
Rx|=rol(Rs,#u5)
Word32 Q6_R_rolor_RI(Word32 Rx, Word32 Rs, Word32 Iu5) 596
Rxx&=rol(Rss,#u6)
Word64 Q6_P_roland_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 596
Rxx^=rol(Rss,#u6)
Word64 Q6_P_rolxacc_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 597
Rxx+=rol(Rss,#u6)
Word64 Q6_P_rolacc_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 592
Rxx-=rol(Rss,#u6)
Word64 Q6_P_rolnac_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 592
Rxx|=rol(Rss,#u6)
Word64 Q6_P_rolor_PI(Word64 Rxx, Word64 Rss, Word32 Iu6) 597
round
Rd=round(Rs,#u5)
S
sat
Rd=sat(Rss)
Word32 Q6_R_sat_P(Word64 Rss) 542
satb
Rd=satb(Rs)
Word32 Q6_R_satb_R(Word32 Rs) 542
sath
Rd=sath(Rs)
Word32 Q6_R_sath_R(Word32 Rs) 542
satub
Rd=satub(Rs)
Word32 Q6_R_satub_R(Word32 Rs) 542
satuh
Rd=satuh(Rs)
Word32 Q6_R_satuh_R(Word32 Rs) 542
setbit
Rd=setbit(Rs,#u5)
Word32 Q6_R_setbit_RI(Word32 Rs, Word32 Iu5) 422
Rd=setbit(Rs,Rt)
Word32 Q6_R_setbit_RR(Word32 Rs, Word32 Rt) 422
sfadd
Rd=sfadd(Rs,Rt)
Word32 Q6_R_sfadd_RR(Word32 Rs, Word32 Rt) 461
sfclass
Pd=sfclass(Rs,#u5)
Byte Q6_p_sfclass_RI(Word32 Rs, Word32 Iu5) 462
sfcmp.eq
Pd=sfcmp.eq(Rs,Rt)
Byte Q6_p_sfcmp_eq_RR(Word32 Rs, Word32 Rt) 464
sfcmp.ge
Pd=sfcmp.ge(Rs,Rt)
Byte Q6_p_sfcmp_ge_RR(Word32 Rs, Word32 Rt) 464
sfcmp.gt
Pd=sfcmp.gt(Rs,Rt)
Byte Q6_p_sfcmp_gt_RR(Word32 Rs, Word32 Rt) 464
sfcmp.uo
Pd=sfcmp.uo(Rs,Rt)
sffixupd
Rd=sffixupd(Rs,Rt)
Word32 Q6_R_sffixupd_RR(Word32 Rs, Word32 Rt) 472
sffixupn
Rd=sffixupn(Rs,Rt)
Word32 Q6_R_sffixupn_RR(Word32 Rs, Word32 Rt) 472
sffixupr
Rd=sffixupr(Rs)
Word32 Q6_R_sffixupr_R(Word32 Rs) 472
sfmake
Rd=sfmake(#u10):neg
Word32 Q6_R_sfmake_I_neg(Word32 Iu10) 478
Rd=sfmake(#u10):pos
Word32 Q6_R_sfmake_I_pos(Word32 Iu10) 478
sfmax
Rd=sfmax(Rs,Rt)
Word32 Q6_R_sfmax_RR(Word32 Rs, Word32 Rt) 479
sfmin
Rd=sfmin(Rs,Rt)
Word32 Q6_R_sfmin_RR(Word32 Rs, Word32 Rt) 480
sfmpy
Rd=sfmpy(Rs,Rt)
Word32 Q6_R_sfmpy_RR(Word32 Rs, Word32 Rt) 481
Rx+=sfmpy(Rs,Rt,Pu):scale
Word32 Q6_R_sfmpyacc_RRp_scale(Word32 Rx, Word32 Rs, Word32 Rt, Byte Pu) 474
Rx+=sfmpy(Rs,Rt)
Word32 Q6_R_sfmpyacc_RR(Word32 Rx, Word32 Rs, Word32 Rt) 473
Rx+=sfmpy(Rs,Rt):lib
Word32 Q6_R_sfmpyacc_RR_lib(Word32 Rx, Word32 Rs, Word32 Rt) 476
Rx-=sfmpy(Rs,Rt)
Word32 Q6_R_sfmpynac_RR(Word32 Rx, Word32 Rs, Word32 Rt) 473
Rx-=sfmpy(Rs,Rt):lib
Word32 Q6_R_sfmpynac_RR_lib(Word32 Rx, Word32 Rs, Word32 Rt) 476
sfsub
Rd=sfsub(Rs,Rt)
Word32 Q6_R_sfsub_RR(Word32 Rs, Word32 Rt) 483
shuffeb
Rdd=shuffeb(Rss,Rtt)
Word64 Q6_P_shuffeb_PP(Word64 Rss, Word64 Rtt) 556
shuffeh
Rdd=shuffeh(Rss,Rtt)
Word64 Q6_P_shuffeh_PP(Word64 Rss, Word64 Rtt) 556
shuffob
Rdd=shuffob(Rtt,Rss)
Word64 Q6_P_shuffob_PP(Word64 Rtt, Word64 Rss) 556
shuffoh
Rdd=shuffoh(Rtt,Rss)
sub
Rd=add(Rs,sub(#s6,Ru))
Word32 Q6_R_add_sub_RIR(Word32 Rs, Word32 Is6, Word32 Ru) 335
Rd=sub(#s10,Rs)
Word32 Q6_R_sub_IR(Word32 Is10, Word32 Rs) 162
Rd=sub(Rt,Rs)
Word32 Q6_R_sub_RR(Word32 Rt, Word32 Rs) 162
Rd=sub(Rt,Rs):sat
Word32 Q6_R_sub_RR_sat(Word32 Rt, Word32 Rs) 162
Rd=sub(Rt.H,Rs.H):<<16
Word32 Q6_R_sub_RhRh_s16(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.H,Rs.H):sat:<<16
Word32 Q6_R_sub_RhRh_sat_s16(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.H,Rs.L):<<16
Word32 Q6_R_sub_RhRl_s16(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.H,Rs.L):sat:<<16
Word32 Q6_R_sub_RhRl_sat_s16(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.L,Rs.H)
Word32 Q6_R_sub_RlRh(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.L,Rs.H):<<16
Word32 Q6_R_sub_RlRh_s16(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.L,Rs.H):sat
Word32 Q6_R_sub_RlRh_sat(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.L,Rs.H):sat:<<16
Word32 Q6_R_sub_RlRh_sat_s16(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.L,Rs.L)
Word32 Q6_R_sub_RlRl(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.L,Rs.L):<<16
Word32 Q6_R_sub_RlRl_s16(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.L,Rs.L):sat
Word32 Q6_R_sub_RlRl_sat(Word32 Rt, Word32 Rs) 361
Rd=sub(Rt.L,Rs.L):sat:<<16
Word32 Q6_R_sub_RlRl_sat_s16(Word32 Rt, Word32 Rs) 361
Rdd=sub(Rtt,Rss)
Word64 Q6_P_sub_PP(Word64 Rtt, Word64 Rss) 358
Rx+=sub(Rt,Rs)
Word32 Q6_R_subacc_RR(Word32 Rx, Word32 Rt, Word32 Rs) 359
swiz
Rd=swiz(Rs)
Word32 Q6_R_swiz_R(Word32 Rs) 544
sxtb
Rd=sxtb(Rs)
Word32 Q6_R_sxtb_R(Word32 Rs) 164
sxth
Rd=sxth(Rs)
Word32 Q6_R_sxth_R(Word32 Rs) 164
sxtw
Rdd=sxtw(Rs)
Word64 Q6_P_sxtw_R(Word32 Rs) 362
T
tableidxb
Rx=tableidxb(Rs,#u4,#U5)
Word32 Q6_R_tableidxb_RII(Word32 Rx, Word32 Rs, Word32 Iu4, Word32 IU5) 427
tableidxd
Rx=tableidxd(Rs,#u4,#U5)
Word32 Q6_R_tableidxd_RII(Word32 Rx, Word32 Rs, Word32 Iu4, Word32 IU5) 427
tableidxh
Rx=tableidxh(Rs,#u4,#U5)
Word32 Q6_R_tableidxh_RII(Word32 Rx, Word32 Rs, Word32 Iu4, Word32 IU5) 427
tableidxw
Rx=tableidxw(Rs,#u4,#U5)
Word32 Q6_R_tableidxw_RII(Word32 Rx, Word32 Rs, Word32 Iu4, Word32 IU5) 427
tlbmatch
Pd=tlbmatch(Rss,Rt)
Byte Q6_p_tlbmatch_PR(Word64 Rss, Word32 Rt) 576
togglebit
Rd=togglebit(Rs,#u5)
Word32 Q6_R_togglebit_RI(Word32 Rs, Word32 Iu5) 422
Rd=togglebit(Rs,Rt)
Word32 Q6_R_togglebit_RR(Word32 Rs, Word32 Rt) 422
tstbit
Pd=!tstbit(Rs,#u5)
Byte Q6_p_not_tstbit_RI(Word32 Rs, Word32 Iu5) 578
Pd=!tstbit(Rs,Rt)
Byte Q6_p_not_tstbit_RR(Word32 Rs, Word32 Rt) 578
Pd=tstbit(Rs,#u5)
Byte Q6_p_tstbit_RI(Word32 Rs, Word32 Iu5) 578
Pd=tstbit(Rs,Rt)
Byte Q6_p_tstbit_RR(Word32 Rs, Word32 Rt) 578
V
vabsdiffb
Rdd=vabsdiffb(Rtt,Rss)
Word64 Q6_P_vabsdiffb_PP(Word64 Rtt, Word64 Rss) 365
vabsdiffh
Rdd=vabsdiffh(Rtt,Rss)
Word64 Q6_P_vabsdiffh_PP(Word64 Rtt, Word64 Rss) 366
vabsdiffub
Rdd=vabsdiffub(Rtt,Rss)
Word64 Q6_P_vabsdiffub_PP(Word64 Rtt, Word64 Rss) 365
vabsdiffw
Rdd=vabsdiffw(Rtt,Rss)
Word64 Q6_P_vabsdiffw_PP(Word64 Rtt, Word64 Rss) 367
vabsh
Rdd=vabsh(Rss)
Word64 Q6_P_vabsh_P(Word64 Rss) 363
Rdd=vabsh(Rss):sat
vabsw
Rdd=vabsw(Rss)
Word64 Q6_P_vabsw_P(Word64 Rss) 364
Rdd=vabsw(Rss):sat
Word64 Q6_P_vabsw_P_sat(Word64 Rss) 364
vaddb
Rdd=vaddb(Rss,Rtt)
Word64 Q6_P_vaddb_PP(Word64 Rss, Word64 Rtt) 378
vaddh
Rd=vaddh(Rs,Rt)
Word32 Q6_R_vaddh_RR(Word32 Rs, Word32 Rt) 168
Rd=vaddh(Rs,Rt):sat
Word32 Q6_R_vaddh_RR_sat(Word32 Rs, Word32 Rt) 168
Rdd=vaddh(Rss,Rtt)
Word64 Q6_P_vaddh_PP(Word64 Rss, Word64 Rtt) 371
Rdd=vaddh(Rss,Rtt):sat
Word64 Q6_P_vaddh_PP_sat(Word64 Rss, Word64 Rtt) 371
vaddhub
Rd=vaddhub(Rss,Rtt):sat
Word32 Q6_R_vaddhub_PP_sat(Word64 Rss, Word64 Rtt) 373
vaddub
Rdd=vaddub(Rss,Rtt)
Word64 Q6_P_vaddub_PP(Word64 Rss, Word64 Rtt) 378
Rdd=vaddub(Rss,Rtt):sat
Word64 Q6_P_vaddub_PP_sat(Word64 Rss, Word64 Rtt) 378
vadduh
Rd=vadduh(Rs,Rt):sat
Word32 Q6_R_vadduh_RR_sat(Word32 Rs, Word32 Rt) 168
Rdd=vadduh(Rss,Rtt):sat
Word64 Q6_P_vadduh_PP_sat(Word64 Rss, Word64 Rtt) 371
vaddw
Rdd=vaddw(Rss,Rtt)
Word64 Q6_P_vaddw_PP(Word64 Rss, Word64 Rtt) 379
Rdd=vaddw(Rss,Rtt):sat
Word64 Q6_P_vaddw_PP_sat(Word64 Rss, Word64 Rtt) 379
valignb
Rdd=valignb(Rtt,Rss,#u3)
Word64 Q6_P_valignb_PPI(Word64 Rtt, Word64 Rss, Word32 Iu3) 545
Rdd=valignb(Rtt,Rss,Pu)
Word64 Q6_P_valignb_PPp(Word64 Rtt, Word64 Rss, Byte Pu) 545
vaslh
Rdd=vaslh(Rss,#u4)
Word64 Q6_P_vaslh_PI(Word64 Rss, Word32 Iu4) 611
Rdd=vaslh(Rss,Rt)
Word64 Q6_P_vaslh_PR(Word64 Rss, Word32 Rt) 616
vaslw
Rdd=vaslw(Rss,#u5)
Word64 Q6_P_vaslw_PI(Word64 Rss, Word32 Iu5) 617
Rdd=vaslw(Rss,Rt)
vasrh
Rdd=vasrh(Rss,#u4)
Word64 Q6_P_vasrh_PI(Word64 Rss, Word32 Iu4) 611
Rdd=vasrh(Rss,#u4):rnd
Word64 Q6_P_vasrh_PI_rnd(Word64 Rss, Word32 Iu4) 612
Rdd=vasrh(Rss,Rt)
Word64 Q6_P_vasrh_PR(Word64 Rss, Word32 Rt) 616
vasrhub
Rd=vasrhub(Rss,#u4):rnd:sat
Word32 Q6_R_vasrhub_PI_rnd_sat(Word64 Rss, Word32 Iu4) 614
Rd=vasrhub(Rss,#u4):sat
Word32 Q6_R_vasrhub_PI_sat(Word64 Rss, Word32 Iu4) 614
vasrw
Rd=vasrw(Rss,#u5)
Word32 Q6_R_vasrw_PI(Word64 Rss, Word32 Iu5) 620
Rd=vasrw(Rss,Rt)
Word32 Q6_R_vasrw_PR(Word64 Rss, Word32 Rt) 620
Rdd=vasrw(Rss,#u5)
Word64 Q6_P_vasrw_PI(Word64 Rss, Word32 Iu5) 617
Rdd=vasrw(Rss,Rt)
Word64 Q6_P_vasrw_PR(Word64 Rss, Word32 Rt) 618
vavgh
Rd=vavgh(Rs,Rt)
Word32 Q6_R_vavgh_RR(Word32 Rs, Word32 Rt) 169
Rd=vavgh(Rs,Rt):rnd
Word32 Q6_R_vavgh_RR_rnd(Word32 Rs, Word32 Rt) 169
Rdd=vavgh(Rss,Rtt)
Word64 Q6_P_vavgh_PP(Word64 Rss, Word64 Rtt) 381
Rdd=vavgh(Rss,Rtt):crnd
Word64 Q6_P_vavgh_PP_crnd(Word64 Rss, Word64 Rtt) 381
Rdd=vavgh(Rss,Rtt):rnd
Word64 Q6_P_vavgh_PP_rnd(Word64 Rss, Word64 Rtt) 381
vavgub
Rdd=vavgub(Rss,Rtt)
Word64 Q6_P_vavgub_PP(Word64 Rss, Word64 Rtt) 382
Rdd=vavgub(Rss,Rtt):rnd
Word64 Q6_P_vavgub_PP_rnd(Word64 Rss, Word64 Rtt) 382
vavguh
Rdd=vavguh(Rss,Rtt)
Word64 Q6_P_vavguh_PP(Word64 Rss, Word64 Rtt) 381
Rdd=vavguh(Rss,Rtt):rnd
Word64 Q6_P_vavguh_PP_rnd(Word64 Rss, Word64 Rtt) 381
vavguw
Rdd=vavguw(Rss,Rtt)
Word64 Q6_P_vavguw_PP(Word64 Rss, Word64 Rtt) 384
Rdd=vavguw(Rss,Rtt):rnd
Word64 Q6_P_vavguw_PP_rnd(Word64 Rss, Word64 Rtt) 384
vavgw
Rdd=vavgw(Rss,Rtt)
vclip
Rdd=vclip(Rss,#u5)
Word64 Q6_P_vclip_PI(Word64 Rss, Word32 Iu5) 385
vcmpb.eq
Pd=!any8(vcmpb.eq(Rss,Rtt))
Byte Q6_p_not_any8_vcmpb_eq_PP(Word64 Rss, Word64 Rtt) 581
Pd=any8(vcmpb.eq(Rss,Rtt))
Byte Q6_p_any8_vcmpb_eq_PP(Word64 Rss, Word64 Rtt) 581
Pd=vcmpb.eq(Rss,#u8)
Byte Q6_p_vcmpb_eq_PI(Word64 Rss, Word32 Iu8) 583
Pd=vcmpb.eq(Rss,Rtt)
Byte Q6_p_vcmpb_eq_PP(Word64 Rss, Word64 Rtt) 583
vcmpb.gt
Pd=vcmpb.gt(Rss,#s8)
Byte Q6_p_vcmpb_gt_PI(Word64 Rss, Word32 Is8) 583
Pd=vcmpb.gt(Rss,Rtt)
Byte Q6_p_vcmpb_gt_PP(Word64 Rss, Word64 Rtt) 583
vcmpb.gtu
Pd=vcmpb.gtu(Rss,#u7)
Byte Q6_p_vcmpb_gtu_PI(Word64 Rss, Word32 Iu7) 583
Pd=vcmpb.gtu(Rss,Rtt)
Byte Q6_p_vcmpb_gtu_PP(Word64 Rss, Word64 Rtt) 583
vcmph.eq
Pd=vcmph.eq(Rss,#s8)
Byte Q6_p_vcmph_eq_PI(Word64 Rss, Word32 Is8) 580
Pd=vcmph.eq(Rss,Rtt)
Byte Q6_p_vcmph_eq_PP(Word64 Rss, Word64 Rtt) 580
vcmph.gt
Pd=vcmph.gt(Rss,#s8)
Byte Q6_p_vcmph_gt_PI(Word64 Rss, Word32 Is8) 580
Pd=vcmph.gt(Rss,Rtt)
Byte Q6_p_vcmph_gt_PP(Word64 Rss, Word64 Rtt) 580
vcmph.gtu
Pd=vcmph.gtu(Rss,#u7)
Byte Q6_p_vcmph_gtu_PI(Word64 Rss, Word32 Iu7) 580
Pd=vcmph.gtu(Rss,Rtt)
Byte Q6_p_vcmph_gtu_PP(Word64 Rss, Word64 Rtt) 580
vcmpw.eq
Pd=vcmpw.eq(Rss,#s8)
Byte Q6_p_vcmpw_eq_PI(Word64 Rss, Word32 Is8) 585
Pd=vcmpw.eq(Rss,Rtt)
Byte Q6_p_vcmpw_eq_PP(Word64 Rss, Word64 Rtt) 585
vcmpw.gt
Pd=vcmpw.gt(Rss,#s8)
vcmpw.gtu
Pd=vcmpw.gtu(Rss,#u7)
Byte Q6_p_vcmpw_gtu_PI(Word64 Rss, Word32 Iu7) 585
Pd=vcmpw.gtu(Rss,Rtt)
Byte Q6_p_vcmpw_gtu_PP(Word64 Rss, Word64 Rtt) 585
vcmpyi
Rdd=vcmpyi(Rss,Rtt):<<1:sat
Word64 Q6_P_vcmpyi_PP_s1_sat(Word64 Rss, Word64 Rtt) 448
Rdd=vcmpyi(Rss,Rtt):sat
Word64 Q6_P_vcmpyi_PP_sat(Word64 Rss, Word64 Rtt) 448
Rxx+=vcmpyi(Rss,Rtt):sat
Word64 Q6_P_vcmpyiacc_PP_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 448
vcmpyr
Rdd=vcmpyr(Rss,Rtt):<<1:sat
Word64 Q6_P_vcmpyr_PP_s1_sat(Word64 Rss, Word64 Rtt) 448
Rdd=vcmpyr(Rss,Rtt):sat
Word64 Q6_P_vcmpyr_PP_sat(Word64 Rss, Word64 Rtt) 448
Rxx+=vcmpyr(Rss,Rtt):sat
Word64 Q6_P_vcmpyracc_PP_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 448
vcnegh
Rdd=vcnegh(Rss,Rt)
Word64 Q6_P_vcnegh_PR(Word64 Rss, Word32 Rt) 386
vconj
Rdd=vconj(Rss):sat
Word64 Q6_P_vconj_P_sat(Word64 Rss) 450
vcrotate
Rdd=vcrotate(Rss,Rt)
Word64 Q6_P_vcrotate_PR(Word64 Rss, Word32 Rt) 452
vdmpy
Rd=vdmpy(Rss,Rtt):<<1:rnd:sat
Word32 Q6_R_vdmpy_PP_s1_rnd_sat(Word64 Rss, Word64 Rtt) 521
Rd=vdmpy(Rss,Rtt):rnd:sat
Word32 Q6_R_vdmpy_PP_rnd_sat(Word64 Rss, Word64 Rtt) 521
Rdd=vdmpy(Rss,Rtt):<<1:sat
Word64 Q6_P_vdmpy_PP_s1_sat(Word64 Rss, Word64 Rtt) 518
Rdd=vdmpy(Rss,Rtt):sat
Word64 Q6_P_vdmpy_PP_sat(Word64 Rss, Word64 Rtt) 518
Rxx+=vdmpy(Rss,Rtt):<<1:sat
Word64 Q6_P_vdmpyacc_PP_s1_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 518
Rxx+=vdmpy(Rss,Rtt):sat
Word64 Q6_P_vdmpyacc_PP_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 518
vdmpybsu
Rdd=vdmpybsu(Rss,Rtt):sat
Word64 Q6_P_vdmpybsu_PP_sat(Word64 Rss, Word64 Rtt) 525
Rxx+=vdmpybsu(Rss,Rtt):sat
Word64 Q6_P_vdmpybsuacc_PP_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 525
vitpack
Rd=vitpack(Ps,Pt)
Word32 Q6_R_vitpack_pp(Byte Ps, Byte Pt) 586
vlslh
Rdd=vlslh(Rss,Rt)
Word64 Q6_P_vlslh_PR(Word64 Rss, Word32 Rt) 616
vlslw
Rdd=vlslw(Rss,Rt)
Word64 Q6_P_vlslw_PR(Word64 Rss, Word32 Rt) 618
vlsrh
Rdd=vlsrh(Rss,#u4)
Word64 Q6_P_vlsrh_PI(Word64 Rss, Word32 Iu4) 611
Rdd=vlsrh(Rss,Rt)
Word64 Q6_P_vlsrh_PR(Word64 Rss, Word32 Rt) 616
vlsrw
Rdd=vlsrw(Rss,#u5)
Word64 Q6_P_vlsrw_PI(Word64 Rss, Word32 Iu5) 617
Rdd=vlsrw(Rss,Rt)
Word64 Q6_P_vlsrw_PR(Word64 Rss, Word32 Rt) 618
vmaxb
Rdd=vmaxb(Rtt,Rss)
Word64 Q6_P_vmaxb_PP(Word64 Rtt, Word64 Rss) 388
vmaxh
Rdd=vmaxh(Rtt,Rss)
Word64 Q6_P_vmaxh_PP(Word64 Rtt, Word64 Rss) 389
vmaxub
Rdd=vmaxub(Rtt,Rss)
Word64 Q6_P_vmaxub_PP(Word64 Rtt, Word64 Rss) 388
vmaxuh
Rdd=vmaxuh(Rtt,Rss)
Word64 Q6_P_vmaxuh_PP(Word64 Rtt, Word64 Rss) 389
vmaxuw
Rdd=vmaxuw(Rtt,Rss)
Word64 Q6_P_vmaxuw_PP(Word64 Rtt, Word64 Rss) 394
vmaxw
Rdd=vmaxw(Rtt,Rss)
Word64 Q6_P_vmaxw_PP(Word64 Rtt, Word64 Rss) 394
vminb
Rdd=vminb(Rtt,Rss)
Word64 Q6_P_vminb_PP(Word64 Rtt, Word64 Rss) 395
vminh
Rdd=vminh(Rtt,Rss)
Word64 Q6_P_vminh_PP(Word64 Rtt, Word64 Rss) 397
vminub
Rdd=vminub(Rtt,Rss)
Word64 Q6_P_vminub_PP(Word64 Rtt, Word64 Rss) 395
vminuh
Rdd=vminuh(Rtt,Rss)
Word64 Q6_P_vminuh_PP(Word64 Rtt, Word64 Rss) 397
vminuw
Rdd=vminuw(Rtt,Rss)
Word64 Q6_P_vminuw_PP(Word64 Rtt, Word64 Rss) 402
vminw
Rdd=vminw(Rtt,Rss)
Word64 Q6_P_vminw_PP(Word64 Rtt, Word64 Rss) 402
vmpybsu
Rdd=vmpybsu(Rs,Rt)
Word64 Q6_P_vmpybsu_RR(Word32 Rs, Word32 Rt) 537
Rxx+=vmpybsu(Rs,Rt)
Word64 Q6_P_vmpybsuacc_RR(Word64 Rxx, Word32 Rs, Word32 Rt) 537
vmpybu
Rdd=vmpybu(Rs,Rt)
Word64 Q6_P_vmpybu_RR(Word32 Rs, Word32 Rt) 537
Rxx+=vmpybu(Rs,Rt)
Word64 Q6_P_vmpybuacc_RR(Word64 Rxx, Word32 Rs, Word32 Rt) 537
vmpyeh
Rdd=vmpyeh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpyeh_PP_s1_sat(Word64 Rss, Word64 Rtt) 527
Rdd=vmpyeh(Rss,Rtt):sat
Word64 Q6_P_vmpyeh_PP_sat(Word64 Rss, Word64 Rtt) 527
Rxx+=vmpyeh(Rss,Rtt)
Word64 Q6_P_vmpyehacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 527
Rxx+=vmpyeh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpyehacc_PP_s1_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 527
Rxx+=vmpyeh(Rss,Rtt):sat
Word64 Q6_P_vmpyehacc_PP_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 527
vmpyh
Rd=vmpyh(Rs,Rt):<<1:rnd:sat
Word32 Q6_R_vmpyh_RR_s1_rnd_sat(Word32 Rs, Word32 Rt) 531
Rd=vmpyh(Rs,Rt):rnd:sat
Word32 Q6_R_vmpyh_RR_rnd_sat(Word32 Rs, Word32 Rt) 531
Rdd=vmpyh(Rs,Rt):<<1:sat
Word64 Q6_P_vmpyh_RR_s1_sat(Word32 Rs, Word32 Rt) 529
Rdd=vmpyh(Rs,Rt):sat
Word64 Q6_P_vmpyh_RR_sat(Word32 Rs, Word32 Rt) 529
Rxx+=vmpyh(Rs,Rt)
Word64 Q6_P_vmpyhacc_RR(Word64 Rxx, Word32 Rs, Word32 Rt) 529
Rxx+=vmpyh(Rs,Rt):<<1:sat
Word64 Q6_P_vmpyhacc_RR_s1_sat(Word64 Rxx, Word32 Rs, Word32 Rt) 529
Rxx+=vmpyh(Rs,Rt):sat
Word64 Q6_P_vmpyhacc_RR_sat(Word64 Rxx, Word32 Rs, Word32 Rt) 529
vmpyhsu
Rdd=vmpyhsu(Rs,Rt):<<1:sat
Word64 Q6_P_vmpyhsu_RR_s1_sat(Word32 Rs, Word32 Rt) 532
Rdd=vmpyhsu(Rs,Rt):sat
Word64 Q6_P_vmpyhsu_RR_sat(Word32 Rs, Word32 Rt) 532
Rxx+=vmpyhsu(Rs,Rt):<<1:sat
vmpyweh
Rdd=vmpyweh(Rss,Rtt):<<1:rnd:sat
Word64 Q6_P_vmpyweh_PP_s1_rnd_sat(Word64 Rss, Word64 Rtt) 489
Rdd=vmpyweh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpyweh_PP_s1_sat(Word64 Rss, Word64 Rtt) 489
Rdd=vmpyweh(Rss,Rtt):rnd:sat
Word64 Q6_P_vmpyweh_PP_rnd_sat(Word64 Rss, Word64 Rtt) 489
Rdd=vmpyweh(Rss,Rtt):sat
Word64 Q6_P_vmpyweh_PP_sat(Word64 Rss, Word64 Rtt) 489
Rxx+=vmpyweh(Rss,Rtt):<<1:rnd:sat
Word64 Q6_P_vmpywehacc_PP_s1_rnd_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 490
Rxx+=vmpyweh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpywehacc_PP_s1_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 490
Rxx+=vmpyweh(Rss,Rtt):rnd:sat
Word64 Q6_P_vmpywehacc_PP_rnd_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 490
Rxx+=vmpyweh(Rss,Rtt):sat
Word64 Q6_P_vmpywehacc_PP_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 490
vmpyweuh
Rdd=vmpyweuh(Rss,Rtt):<<1:rnd:sat
Word64 Q6_P_vmpyweuh_PP_s1_rnd_sat(Word64 Rss, Word64 Rtt) 493
Rdd=vmpyweuh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpyweuh_PP_s1_sat(Word64 Rss, Word64 Rtt) 493
Rdd=vmpyweuh(Rss,Rtt):rnd:sat
Word64 Q6_P_vmpyweuh_PP_rnd_sat(Word64 Rss, Word64 Rtt) 493
Rdd=vmpyweuh(Rss,Rtt):sat
Word64 Q6_P_vmpyweuh_PP_sat(Word64 Rss, Word64 Rtt) 494
Rxx+=vmpyweuh(Rss,Rtt):<<1:rnd:sat
Word64 Q6_P_vmpyweuhacc_PP_s1_rnd_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 494
Rxx+=vmpyweuh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpyweuhacc_PP_s1_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 494
Rxx+=vmpyweuh(Rss,Rtt):rnd:sat
Word64 Q6_P_vmpyweuhacc_PP_rnd_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 494
Rxx+=vmpyweuh(Rss,Rtt):sat
Word64 Q6_P_vmpyweuhacc_PP_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 494
vmpywoh
Rdd=vmpywoh(Rss,Rtt):<<1:rnd:sat
Word64 Q6_P_vmpywoh_PP_s1_rnd_sat(Word64 Rss, Word64 Rtt) 490
Rdd=vmpywoh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpywoh_PP_s1_sat(Word64 Rss, Word64 Rtt) 490
Rdd=vmpywoh(Rss,Rtt):rnd:sat
Word64 Q6_P_vmpywoh_PP_rnd_sat(Word64 Rss, Word64 Rtt) 490
Rdd=vmpywoh(Rss,Rtt):sat
Word64 Q6_P_vmpywoh_PP_sat(Word64 Rss, Word64 Rtt) 490
Rxx+=vmpywoh(Rss,Rtt):<<1:rnd:sat
Word64 Q6_P_vmpywohacc_PP_s1_rnd_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 490
Rxx+=vmpywoh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpywohacc_PP_s1_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 490
Rxx+=vmpywoh(Rss,Rtt):rnd:sat
Word64 Q6_P_vmpywohacc_PP_rnd_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 490
Rxx+=vmpywoh(Rss,Rtt):sat
vmpywouh
Rdd=vmpywouh(Rss,Rtt):<<1:rnd:sat
Word64 Q6_P_vmpywouh_PP_s1_rnd_sat(Word64 Rss, Word64 Rtt) 494
Rdd=vmpywouh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpywouh_PP_s1_sat(Word64 Rss, Word64 Rtt) 494
Rdd=vmpywouh(Rss,Rtt):rnd:sat
Word64 Q6_P_vmpywouh_PP_rnd_sat(Word64 Rss, Word64 Rtt) 494
Rdd=vmpywouh(Rss,Rtt):sat
Word64 Q6_P_vmpywouh_PP_sat(Word64 Rss, Word64 Rtt) 494
Rxx+=vmpywouh(Rss,Rtt):<<1:rnd:sat
Word64 Q6_P_vmpywouhacc_PP_s1_rnd_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 494
Rxx+=vmpywouh(Rss,Rtt):<<1:sat
Word64 Q6_P_vmpywouhacc_PP_s1_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 494
Rxx+=vmpywouh(Rss,Rtt):rnd:sat
Word64 Q6_P_vmpywouhacc_PP_rnd_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 494
Rxx+=vmpywouh(Rss,Rtt):sat
Word64 Q6_P_vmpywouhacc_PP_sat(Word64 Rxx, Word64 Rss, Word64 Rtt) 494
vmux
Rdd=vmux(Pu,Rss,Rtt)
Word64 Q6_P_vmux_pPP(Byte Pu, Word64 Rss, Word64 Rtt) 587
vnavgh
Rd=vnavgh(Rt,Rs)
Word32 Q6_R_vnavgh_RR(Word32 Rt, Word32 Rs) 169
Rdd=vnavgh(Rtt,Rss)
Word64 Q6_P_vnavgh_PP(Word64 Rtt, Word64 Rss) 381
Rdd=vnavgh(Rtt,Rss):crnd:sat
Word64 Q6_P_vnavgh_PP_crnd_sat(Word64 Rtt, Word64 Rss) 381
Rdd=vnavgh(Rtt,Rss):rnd:sat
Word64 Q6_P_vnavgh_PP_rnd_sat(Word64 Rtt, Word64 Rss) 381
vnavgw
Rdd=vnavgw(Rtt,Rss)
Word64 Q6_P_vnavgw_PP(Word64 Rtt, Word64 Rss) 384
Rdd=vnavgw(Rtt,Rss):crnd:sat
Word64 Q6_P_vnavgw_PP_crnd_sat(Word64 Rtt, Word64 Rss) 384
Rdd=vnavgw(Rtt,Rss):rnd:sat
Word64 Q6_P_vnavgw_PP_rnd_sat(Word64 Rtt, Word64 Rss) 384
vpmpyh
Rdd=vpmpyh(Rs,Rt)
Word64 Q6_P_vpmpyh_RR(Word32 Rs, Word32 Rt) 539
Rxx^=vpmpyh(Rs,Rt)
Word64 Q6_P_vpmpyhxacc_RR(Word64 Rxx, Word32 Rs, Word32 Rt) 539
vraddh
Rd=vraddh(Rss,Rtt)
Word32 Q6_R_vraddh_PP(Word64 Rss, Word64 Rtt) 376
vraddub
Rdd=vraddub(Rss,Rtt)
Word64 Q6_P_vraddub_PP(Word64 Rss, Word64 Rtt) 374
Rxx+=vraddub(Rss,Rtt)
Word64 Q6_P_vraddubacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 374
vradduh
Rd=vradduh(Rss,Rtt)
Word32 Q6_R_vradduh_PP(Word64 Rss, Word64 Rtt) 376
vrcmpys
Rd=vrcmpys(Rss,Rt):<<1:rnd:sat
Word32 Q6_R_vrcmpys_PR_s1_rnd_sat(Word64 Rss, Word32 Rt) 457
Rdd=vrcmpys(Rss,Rt):<<1:sat
Word64 Q6_P_vrcmpys_PR_s1_sat(Word64 Rss, Word32 Rt) 454
Rxx+=vrcmpys(Rss,Rt):<<1:sat
Word64 Q6_P_vrcmpysacc_PR_s1_sat(Word64 Rxx, Word64 Rss, Word32 Rt) 454
vrcnegh
Rxx+=vrcnegh(Rss,Rt)
Word64 Q6_P_vrcneghacc_PR(Word64 Rxx, Word64 Rss, Word32 Rt) 386
vrcrotate
Rdd=vrcrotate(Rss,Rt,#u2)
Word64 Q6_P_vrcrotate_PRI(Word64 Rss, Word32 Rt, Word32 Iu2) 460
Rxx+=vrcrotate(Rss,Rt,#u2)
Word64 Q6_P_vrcrotateacc_PRI(Word64 Rxx, Word64 Rss, Word32 Rt, Word32 Iu2) 460
vrmaxh
Rxx=vrmaxh(Rss,Ru)
Word64 Q6_P_vrmaxh_PR(Word64 Rxx, Word64 Rss, Word32 Ru) 390
vrmaxuh
Rxx=vrmaxuh(Rss,Ru)
Word64 Q6_P_vrmaxuh_PR(Word64 Rxx, Word64 Rss, Word32 Ru) 390
vrmaxuw
Rxx=vrmaxuw(Rss,Ru)
Word64 Q6_P_vrmaxuw_PR(Word64 Rxx, Word64 Rss, Word32 Ru) 392
vrmaxw
Rxx=vrmaxw(Rss,Ru)
Word64 Q6_P_vrmaxw_PR(Word64 Rxx, Word64 Rss, Word32 Ru) 392
vrminh
Rxx=vrminh(Rss,Ru)
Word64 Q6_P_vrminh_PR(Word64 Rxx, Word64 Rss, Word32 Ru) 398
vrminuh
Rxx=vrminuh(Rss,Ru)
Word64 Q6_P_vrminuh_PR(Word64 Rxx, Word64 Rss, Word32 Ru) 398
vrminuw
Rxx=vrminuw(Rss,Ru)
Word64 Q6_P_vrminuw_PR(Word64 Rxx, Word64 Rss, Word32 Ru) 400
vrminw
Rxx=vrminw(Rss,Ru)
Word64 Q6_P_vrminw_PR(Word64 Rxx, Word64 Rss, Word32 Ru) 400
vrmpybsu
Rdd=vrmpybsu(Rss,Rtt)
Word64 Q6_P_vrmpybsu_PP(Word64 Rss, Word64 Rtt) 523
Rxx+=vrmpybsu(Rss,Rtt)
Word64 Q6_P_vrmpybsuacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 523
vrmpybu
Rdd=vrmpybu(Rss,Rtt)
Word64 Q6_P_vrmpybu_PP(Word64 Rss, Word64 Rtt) 523
Rxx+=vrmpybu(Rss,Rtt)
Word64 Q6_P_vrmpybuacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 523
vrmpyh
Rdd=vrmpyh(Rss,Rtt)
Word64 Q6_P_vrmpyh_PP(Word64 Rss, Word64 Rtt) 534
Rxx+=vrmpyh(Rss,Rtt)
Word64 Q6_P_vrmpyhacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 534
vrmpyweh
Rdd=vrmpyweh(Rss,Rtt)
Word64 Q6_P_vrmpyweh_PP(Word64 Rss, Word64 Rtt) 511
Rdd=vrmpyweh(Rss,Rtt):<<1
Word64 Q6_P_vrmpyweh_PP_s1(Word64 Rss, Word64 Rtt) 511
Rxx+=vrmpyweh(Rss,Rtt)
Word64 Q6_P_vrmpywehacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 511
Rxx+=vrmpyweh(Rss,Rtt):<<1
Word64 Q6_P_vrmpywehacc_PP_s1(Word64 Rxx, Word64 Rss, Word64 Rtt) 511
vrmpywoh
Rdd=vrmpywoh(Rss,Rtt)
Word64 Q6_P_vrmpywoh_PP(Word64 Rss, Word64 Rtt) 511
Rdd=vrmpywoh(Rss,Rtt):<<1
Word64 Q6_P_vrmpywoh_PP_s1(Word64 Rss, Word64 Rtt) 511
Rxx+=vrmpywoh(Rss,Rtt)
Word64 Q6_P_vrmpywohacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 511
Rxx+=vrmpywoh(Rss,Rtt):<<1
Word64 Q6_P_vrmpywohacc_PP_s1(Word64 Rxx, Word64 Rss, Word64 Rtt) 511
vrndwh
Rd=vrndwh(Rss)
Word32 Q6_R_vrndwh_P(Word64 Rss) 547
Rd=vrndwh(Rss):sat
Word32 Q6_R_vrndwh_P_sat(Word64 Rss) 547
vrsadub
Rdd=vrsadub(Rss,Rtt)
Word64 Q6_P_vrsadub_PP(Word64 Rss, Word64 Rtt) 404
Rxx+=vrsadub(Rss,Rtt)
Word64 Q6_P_vrsadubacc_PP(Word64 Rxx, Word64 Rss, Word64 Rtt) 404
vsathb
Rd=vsathb(Rs)
Word32 Q6_R_vsathb_R(Word32 Rs) 550
Rd=vsathb(Rss)
Word32 Q6_R_vsathb_P(Word64 Rss) 550
Rdd=vsathb(Rss)
Word64 Q6_P_vsathb_P(Word64 Rss) 553
vsathub
Rd=vsathub(Rs)
Word32 Q6_R_vsathub_R(Word32 Rs) 550
Rd=vsathub(Rss)
Word32 Q6_R_vsathub_P(Word64 Rss) 550
Rdd=vsathub(Rss)
Word64 Q6_P_vsathub_P(Word64 Rss) 553
vsatwh
Rd=vsatwh(Rss)
Word32 Q6_R_vsatwh_P(Word64 Rss) 550
Rdd=vsatwh(Rss)
Word64 Q6_P_vsatwh_P(Word64 Rss) 553
vsatwuh
Rd=vsatwuh(Rss)
Word32 Q6_R_vsatwuh_P(Word64 Rss) 550
Rdd=vsatwuh(Rss)
Word64 Q6_P_vsatwuh_P(Word64 Rss) 553
vsplatb
Rd=vsplatb(Rs)
Word32 Q6_R_vsplatb_R(Word32 Rs) 557
Rdd=vsplatb(Rs)
Word64 Q6_P_vsplatb_R(Word32 Rs) 557
vsplath
Rdd=vsplath(Rs)
Word64 Q6_P_vsplath_R(Word32 Rs) 558
vspliceb
Rdd=vspliceb(Rss,Rtt,#u3)
Word64 Q6_P_vspliceb_PPI(Word64 Rss, Word64 Rtt, Word32 Iu3) 559
Rdd=vspliceb(Rss,Rtt,Pu)
Word64 Q6_P_vspliceb_PPp(Word64 Rss, Word64 Rtt, Byte Pu) 559
vsubb
Rdd=vsubb(Rss,Rtt)
Word64 Q6_P_vsubb_PP(Word64 Rss, Word64 Rtt) 407
vsubh
Rd=vsubh(Rt,Rs)
Word32 Q6_R_vsubh_RR(Word32 Rt, Word32 Rs) 170
Rd=vsubh(Rt,Rs):sat
Word32 Q6_R_vsubh_RR_sat(Word32 Rt, Word32 Rs) 170
Rdd=vsubh(Rtt,Rss)
Word64 Q6_P_vsubh_PP(Word64 Rtt, Word64 Rss) 405
Rdd=vsubh(Rtt,Rss):sat
Word64 Q6_P_vsubh_PP_sat(Word64 Rtt, Word64 Rss) 405
vsubub
Rdd=vsubub(Rtt,Rss)
Word64 Q6_P_vsubub_PP(Word64 Rtt, Word64 Rss) 407
Rdd=vsubub(Rtt,Rss):sat
Word64 Q6_P_vsubub_PP_sat(Word64 Rtt, Word64 Rss) 407
vsubuh
Rd=vsubuh(Rt,Rs):sat
Word32 Q6_R_vsubuh_RR_sat(Word32 Rt, Word32 Rs) 170
Rdd=vsubuh(Rtt,Rss):sat
Word64 Q6_P_vsubuh_PP_sat(Word64 Rtt, Word64 Rss) 405
vsubw
Rdd=vsubw(Rtt,Rss)
Word64 Q6_P_vsubw_PP(Word64 Rtt, Word64 Rss) 408
Rdd=vsubw(Rtt,Rss):sat
Word64 Q6_P_vsubw_PP_sat(Word64 Rtt, Word64 Rss) 408
vsxtbh
Rdd=vsxtbh(Rs)
Word64 Q6_P_vsxtbh_R(Word32 Rs) 561
vsxthw
Rdd=vsxthw(Rs)
Word64 Q6_P_vsxthw_R(Word32 Rs) 561
vtrunehb
Rd=vtrunehb(Rss)
Word32 Q6_R_vtrunehb_P(Word64 Rss) 564
Rdd=vtrunehb(Rss,Rtt)
Word64 Q6_P_vtrunehb_PP(Word64 Rss, Word64 Rtt) 564
vtrunewh
Rdd=vtrunewh(Rss,Rtt)
Word64 Q6_P_vtrunewh_PP(Word64 Rss, Word64 Rtt) 564
vtrunohb
Rd=vtrunohb(Rss)
Word32 Q6_R_vtrunohb_P(Word64 Rss) 564
Rdd=vtrunohb(Rss,Rtt)
Word64 Q6_P_vtrunohb_PP(Word64 Rss, Word64 Rtt) 564
vtrunowh
Rdd=vtrunowh(Rss,Rtt)
Word64 Q6_P_vtrunowh_PP(Word64 Rss, Word64 Rtt) 564
vxaddsubh
Rdd=vxaddsubh(Rss,Rtt):rnd:>>1:sat
Word64 Q6_P_vxaddsubh_PP_rnd_rs1_sat(Word64 Rss, Word64 Rtt) 430
Rdd=vxaddsubh(Rss,Rtt):sat
Word64 Q6_P_vxaddsubh_PP_sat(Word64 Rss, Word64 Rtt) 430
vxaddsubw
Rdd=vxaddsubw(Rss,Rtt):sat
Word64 Q6_P_vxaddsubw_PP_sat(Word64 Rss, Word64 Rtt) 431
vxsubaddh
Rdd=vxsubaddh(Rss,Rtt):rnd:>>1:sat
Word64 Q6_P_vxsubaddh_PP_rnd_rs1_sat(Word64 Rss, Word64 Rtt) 430
Rdd=vxsubaddh(Rss,Rtt):sat
Word64 Q6_P_vxsubaddh_PP_sat(Word64 Rss, Word64 Rtt) 430
vxsubaddw
Rdd=vxsubaddw(Rss,Rtt):sat
Word64 Q6_P_vxsubaddw_PP_sat(Word64 Rss, Word64 Rtt) 431
vzxtbh
Rdd=vzxtbh(Rs)
Word64 Q6_P_vzxtbh_R(Word32 Rs) 565
vzxthw
Rdd=vzxthw(Rs)
Word64 Q6_P_vzxthw_R(Word32 Rs) 565
X
xor
Pd=xor(Ps,Pt)
Z
zxtb
Rd=zxtb(Rs)
Word32 Q6_R_zxtb_R(Word32 Rs) 171
zxth
Rd=zxth(Rs)
Word32 Q6_R_zxth_R(Word32 Rs) 171