armasm.txt

Simplified Assembly Language for the Raspberry Pi

The ARM processor in the RPi 1 operates on 32-bit values, and all
instructions are 32 bits long. (The Thumb mode is not considered here.)
Unlike many other processors, the ARM only accesses memory with its LDR-
and STR-type instructions. Instructions such as ADD and MOV operate only
on registers.

Sixteen registers are normally available. Some provide special functions.

    R15 (PC)   program counter
    R14 (LR)   link register - holds subroutine's return address
    R13 (SP)   stack pointer (like Intel, SP -> last item pushed)
    R12        general purpose - may be altered by subroutines
    R11..R4    general purpose - are not altered by subroutines
    R0..R3     general purpose - may be altered by subroutines

1.  Most ARM instructions don't update the status flags (N Z C V) unless
    an "S" follows their mnemonic, as with MOVS and MULS. However, CMP
    and TST always update status.

    All processors can execute branch (or jump) instructions
    conditionally, but the ARM can execute many other instructions
    conditionally as well. This can speed up code by eliminating
    conditional branch instructions (which flush the instruction
    pipeline). For example, this routine takes the absolute value of
    register zero:

    ABS:    TST   R0, R0        @ set status flag (N) if negative
            BPL   POS           @ branch if R0 is positive or zero (PLus)
            RSB   R0, R0, #0    @ reverse subtract: R0 = 0 - R0
    POS:    MOV   PC, LR        @ return from subroutine (PC gets LR)

    This does the same thing without branching, and is typically faster:

    ABS:    TST   R0, R0
            MOVPL PC, LR        @ return if R0 is positive or zero
            RSB   R0, R0, #0
            MOV   PC, LR

2.  The memory access instructions, LDR and STR, only provide a
    relatively small range of address offsets. To access a memory
    location more than +/-4095 bytes away from the base register, it's
    necessary to use an index register. (See below.)

    Even more limited is the ability to specify constant (immediate)
    values. Only eight bits are available, although they can be rotated
    to any even bit position. Larger constants require other approaches,
    for example:

            MOV   R0, #255          @ R0 gets 0x000000FF
            LDR   R1, CONST         @ R1 gets 0x12345678
            ...
    CONST:  .WORD 0x12345678
            ...
            MOV   R2, #0x00000078   @ R2 gets 0x00000078
            ORR   R2, #0x00005600   @ 0x56 fits in eight shifted bits
            ORR   R2, #0x00340000   @ R2 now holds 0x00345678
            ORR   R2, #0x12000000   @ R2 now holds 0x12345678

3.  There is a MUL instruction but no integer DIV. There is a floating
    point divide (FDIVD). There are many unusual instructions, such as
    those that simultaneously add and subtract four separate bytes or
    those that do saturated arithmetic (set overflowed results to a
    maximum or minimum value rather than wrapping around). Here's a list
    of the commonly used instructions:

    ADD ADC                 (add including carry-in)
    SUB SBC                 (carry = not borrow, unlike Intel x86)
    RSB RSC                 (reverse subtract order)
    AND ORR EOR             bitwise logic operators (EOR = Exclusive-OR)
    MOV MVN                 (move NOT i.e. 1's complement value)
    CMP TST                 (always set status regardless of "S")
    LSL LSR ASR ROR RRX     (rotate right through eXtend i.e. carry bit)
    NOP

    ADD{cond}{S} Rd, Rn, #255           Rd = Rn + immediate
    ADD{cond}{S} Rd, Rn, Rm             Rd = Rn + Rm
    ADD{cond}{S} Rd, Rn, Rm, LSL #N     Rd = Rn + Rm<<N
    ADD{cond}{S} Rd, Rn, Rm, LSR Rs     Rd = Rn + Rm>>Rs
    MUL{cond}{S} Rd, Rm, Rs             Rd = Rm * Rs
    MLA{cond}{S} Rd, Rm, Rs, Rn         Rd = Rm * Rs + Rn

    B{cond}  label  - Branch to labeled address
    BL{cond} label  - Branch and put return address into Link register

    LDR LDRB STR STRB - LoaD and STore 32-bit words and 8-bit Bytes
    POP  {R0,R1-Rn} = LDMIA - LoaD Multiple registers, Increment After
    PUSH {R0,R1-Rn} = STMDB - STore Multiple registers, Decrement Before

    LDR{cond} Rd, [Rn]                  Rd = [Rn]
    LDR{cond} Rd, [Rn, #4095]           Rd = [Rn +/- offset]
    LDR{cond} Rd, [Rn, Rm]              Rd = [Rn +/- Rm]
    LDR{cond} Rd, [Rn, Rm, ROR #31]     Rd = [Rn +/- shifted Rm]

4.  The floating point coprocessor has a separate set of sixteen 64-bit
    registers. They hold double precision values in IEEE-754 format
    (like Intel).

    FADDD FSUBD FMULD FDIVD FCMPD FNEGD
    FLDD FSTD FCPYD FABSD FSQRTD
    VPOP  {D0,D1-Dn}
    VPUSH {D0,D1-Dn}
    FMSTAT - sets status flags after compare (FCMPD)

    FADDD{cond} Dd, Dn, Dm              Dd = Dn + Dm
    FLDD{cond}  Dd, [Rn, #1020]         Dd = [Rn +/- offset]
                                        (offset is a multiple of 4,
                                        at most 1020)

5.  These optional condition codes {cond} can follow most instructions.
    They test the status bits N Z C V. The second row gives the Intel
    equivalents after a compare:

    ARM:    EQ  NE  CS/HS  CC/LO  MI  PL  VS  VC  HI  LS  GE  LT  GT  LE
    Intel:  E   NE  AE     B      S   NS  O   NO  A   BE  GE  L   G   LE

    After an addition, CS and CC match Intel's C and NC. After a
    subtraction or compare the ARM carry is inverted (carry = not
    borrow), so CS/HS corresponds to Intel's NC/AE and CC/LO to C/B.

6.  Common commands used by the gcc compiler/assembler:

    gcc -o myprog myprog.s

    .DATA                           @ start of read/write memory area
    .WORD 0x12345678                @ 32-bit hex constant
    .BYTE 0x78, 0x56, 0x34, 0x12    @ same constant (little-endian order)
    .ASCIZ "String"                 @ zero-terminated character byte sequence
    .DOUBLE 3.14                    @ 2.23e-308 to 1.79e+308, 16 decimal digits
    .TEXT                           @ resume read-only code memory
    .ALIGN 2                        @ instructions must be on 4-byte boundaries
    .END                            @ optional, end of program

ARM and Thumb Instruction Set Quick Reference Card:
http://infocenter.arm.com/help/topic/com.arm.doc.qrc0001l/QRC0001_UAL.pdf
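
As a rough sketch of how the directives in section 6 fit together with the
branch-free ABS routine from section 1, here is a small complete program
that could be saved as myprog.s and built with the gcc command shown in
section 6. The labels FMT and FMTADR, the test value -5, and the call to
the C library's printf are assumptions made for illustration; they are not
part of the notes above. It also assumes gcc links the C library and calls
main, as it does by default.

```asm
        .DATA                       @ read/write memory area
FMT:    .ASCIZ  "abs(%d) = %d\n"    @ hypothetical format string for printf

        .TEXT                       @ code memory
        .ALIGN  2                   @ 4-byte boundary for instructions
        .GLOBAL main                @ make main visible to the C startup code
main:   PUSH    {R4, LR}            @ save R4 and the return address
        MVN     R4, #4              @ R4 = NOT 4 = 0xFFFFFFFB = -5
        MOV     R0, R4              @ argument for ABS
        BL      ABS                 @ R0 = absolute value of R4
        MOV     R2, R0              @ third printf argument: the result
        MOV     R1, R4              @ second printf argument: original value
        LDR     R0, FMTADR          @ first printf argument: address of FMT
        BL      printf              @ C library call (may alter R0..R3, R12)
        MOV     R0, #0              @ return status 0
        POP     {R4, PC}            @ restore R4 and return (PC gets saved LR)

FMTADR: .WORD   FMT                 @ address constant, loaded like CONST in section 2

ABS:    TST     R0, R0              @ set N if R0 is negative
        MOVPL   PC, LR              @ return at once if positive or zero
        RSB     R0, R0, #0          @ R0 = 0 - R0
        MOV     PC, LR              @ return
```

Running the result with ./myprog should print "abs(-5) = 5". R4 holds the
original value across both calls because R4..R11 are preserved by
subroutines, while R0..R3 may be overwritten. Pushing two registers keeps
the stack pointer on an 8-byte boundary, which the C library expects when
printf is called.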